
Re: (ITS#4111) upgrading from 2.2.19 -> 2.3.11: crash loading cn



Awesome.  Thank you!

At 11:58 PM +0000 10/27/05, hyc@symas.com wrote:
>OK, fixed in HEAD, see slapd/config.c rev 1.423
>
>Howard Chu wrote:
>>  Forest Hill wrote:
>>>  At this point, ct[i] looks like this:
>>>
>>>  (gdb) p ct[i]
>>>  $7 = {
>>>    name = 0x17ed80 "loglevel",
>>>    what = 0x17e680 "level",
>>>    min_args = 2,
>>>    max_args = 0,
>>>    length = 0,
>>>    arg_type = 2147483648,
>>>    arg_item = 0x7060,
>>>    attribute = 0x17ed8c "( OLcfgGlAt:28 NAME 'olcLogLevel' SYNTAX OMsDirectoryString )",
>>>    ad = 0x2f8f1a0,
>>>    notify = 0x0
>>>  }
>>>
>>>
>>>  At 4:08 PM -0700 10/27/05, Howard Chu wrote:
>>>>  c->rvalue_vals should not be NULL of course. What is ct[i] ?
>>>>  config_get_vals zeroes out c->rvalue_* and then retrieves the
>>>>  values. attr_merge should only be called if config_get_vals
>>>>  succeeded. That's in the logic of config_build_attrs in bconfig.c.
>>>>
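The rule described above (clear the result, fetch the values, and merge only if the fetch succeeded) can be sketched roughly as follows. This is an illustration, not the bconfig.c code: struct args_sketch, get_vals_sketch(), merge_sketch() and build_one_sketch() are hypothetical stand-ins, and struct berval is a simplified copy of the liblber type.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for liblber's struct berval. */
struct berval {
    unsigned long bv_len;
    char *bv_val;
};

/* Hypothetical, cut-down stand-in for ConfigArgs. */
struct args_sketch {
    struct berval *rvalue_vals;
};

/* Hypothetical getter: zeroes the result first, then fills it in.
 * Returns 0 on success, nonzero when there is nothing to return. */
static int get_vals_sketch(struct args_sketch *c, struct berval *src)
{
    c->rvalue_vals = NULL;
    if (src == NULL)
        return 1;
    c->rvalue_vals = src;
    return 0;
}

/* Hypothetical merge step: counts values up to the terminating entry.
 * Like the loop in the thread, it assumes vals is a valid array. */
static int merge_sketch(const struct berval *vals)
{
    int n = 0;
    assert(vals != NULL);
    for (; vals->bv_val != NULL; vals++)
        n++;
    return n;
}

/* Merge only when the getter reported success; returns values merged. */
static int build_one_sketch(struct args_sketch *c, struct berval *src)
{
    if (get_vals_sketch(c, src) != 0)
        return 0;               /* nothing fetched, so skip the merge */
    return merge_sketch(c->rvalue_vals);
}
```

The guard only helps if the getter's return value reliably reflects whether rvalue_vals was populated; a getter that reports success while leaving the pointer NULL would still reach the merge step.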
>>>>  Forest Hill wrote:
>>>>>  Sorry for the delay. I had to deal with some other stuff for a bit.
>>>>>  So, I turned off compiler optimizations this time to get a clearer
>>>>>  view, and here's the immediate problem.  On line 78 of
>>>>>  servers/slapd/value.c:
>>>>>
>>>>>          for ( ; !BER_BVISNULL( addvals ); v2++, addvals++ ) {
>>>>>
>>>>>  BER_BVISNULL is defined as
>>>>>
>>>>>          #define BER_BVISNULL(bv)        ((bv)->bv_val == NULL)
>>>>>
>>>>>  And in the code in question, addvals is in fact NULL, so we're
>>>>>  deref'ing NULL.
>>>>>
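A minimal sketch of why a NULL array pointer faults in that loop, assuming a simplified struct berval (a length followed by a pointer). count_vals() and null_deref_address() are hypothetical helpers written for this illustration and are not part of slapd.

```c
#include <stddef.h>

/* Simplified stand-in for liblber's struct berval: length, then pointer. */
struct berval {
    unsigned long bv_len;
    char *bv_val;
};

/* Same definition as quoted in the thread: bv itself must be non-NULL. */
#define BER_BVISNULL(bv) ((bv)->bv_val == NULL)

/* Count the entries of an array terminated by a berval whose bv_val is
 * NULL. The explicit vals == NULL check is the guard that the bare loop
 * does not have. */
static size_t count_vals(const struct berval *vals)
{
    size_t n = 0;
    if (vals == NULL)
        return 0;
    for (; !BER_BVISNULL(vals); vals++)
        n++;
    return n;
}

/* Address the macro would try to read if handed a NULL array pointer:
 * the offset of bv_val inside the struct. */
static size_t null_deref_address(void)
{
    return offsetof(struct berval, bv_val);
}
```

On a 32-bit build where the length field is 4 bytes, that offset is 4, which would be consistent with the fault address 0x00000004 in the crash report further down the thread.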
>>>>>  Following this up the stack, in attr_merge_normalize(), on line
>>>>>  261, attr_merge() is being called with vals == NULL and nvals == NULL.
>>>>>
>>>>>  Up in frame 3 in config_build_attrs() we do:
>>>>>
>>>>>          attr_merge_normalize(e, ct[i].ad,
>>>>>                  c->rvalue_vals, NULL);
>>>>>
>>>>>  in this case c->rvalue_vals is NULL, which is what's getting this
>>>>>  all started, I guess.
>>>>>
>>>>>  I guess the simple hack fix would be to change BER_BVISNULL
>>>>>  to be
>>>>>
>>>>>  #define BER_BVISNULL(bv)        ( (bv) == NULL || (bv)->bv_val == NULL)
>>>>>
>>>>>  to avoid the NULL deref.  However this macro had the same
>>>>>  definition in 2.2.19, so I'm guessing that this is more a question
>>>>>  of improper use of said macro, rather than needing to change the
>>>>>  macro itself.
>>>>>
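For comparison, the two macro definitions discussed above can be put side by side. In this sketch the NULL-tolerant variant is given a different, hypothetical name (BV_ISNULL_SAFE) so both can coexist. struct berval is again a simplified stand-in, and is_end_unsafe() and is_end_safe() are illustration-only wrappers.

```c
#include <stddef.h>

/* Simplified stand-in for liblber's struct berval. */
struct berval {
    unsigned long bv_len;
    char *bv_val;
};

/* Definition quoted in the thread: assumes bv itself is non-NULL. */
#define BER_BVISNULL(bv)   ((bv)->bv_val == NULL)

/* Hypothetical NULL-tolerant variant, renamed for this sketch.
 * Note that it evaluates its argument twice. */
#define BV_ISNULL_SAFE(bv) ((bv) == NULL || (bv)->bv_val == NULL)

/* Only valid for a non-NULL bv; passing NULL dereferences it. */
static int is_end_unsafe(const struct berval *bv)
{
    return BER_BVISNULL(bv);
}

/* Treats a NULL pointer the same as the terminating entry. */
static int is_end_safe(const struct berval *bv)
{
    return BV_ISNULL_SAFE(bv);
}
```

The tolerant form avoids the crash but also hides the fact that a caller passed no array at all, and its double evaluation makes an argument with side effects (such as p++) unsafe. That is one reason to prefer fixing the caller.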
>>>>>  Looking at frame 5 in config_back_db_open(), the ConfigArgs (named
>>>>>  just "c") that eventually gets passed in there is never fully
>>>>>  initialized (it's created on the stack). Specifically, the
>>>>>  rvalue_vals field is never initialized. When I'm running on a fully
>>>>>  un-optimized build it does end up being NULL, but I see no reason
>>>>>  that this would necessarily be the case when it's fully optimized.
>>>>>
>>>>>  So, one possible peripheral problem could be solved by initializing
>>>>>  c to zeros on the stack.
>>>>>
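Zero-initializing a stack-allocated struct, as suggested above, might look like the following sketch. struct config_args_sketch is a hypothetical, cut-down stand-in for ConfigArgs. Only the rvalue_vals member name comes from the thread, and the depth member is invented for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for liblber's struct berval. */
struct berval {
    unsigned long bv_len;
    char *bv_val;
};

/* Hypothetical, cut-down stand-in for ConfigArgs. */
struct config_args_sketch {
    int depth;                  /* hypothetical field */
    struct berval *rvalue_vals; /* field name taken from the thread */
};

/* Initializer form: members not named in the braces are zeroed too. */
static int unset_after_initializer(void)
{
    struct config_args_sketch c = { 0 };
    return c.rvalue_vals == NULL && c.depth == 0;
}

/* memset form: all-bits-zero is a null pointer on mainstream platforms,
 * although the C standard does not strictly guarantee it. */
static int unset_after_memset(void)
{
    struct config_args_sketch c;
    memset(&c, 0, sizeof c);
    return c.rvalue_vals == NULL && c.depth == 0;
}
```

Either form makes the starting state deterministic regardless of optimization level. It does not by itself explain how rvalue_vals is meant to be populated later.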
>>>>>  However, that doesn't answer the question I have, which is: how is
>>>>>  c.rvalue_vals ever going to get set to anything before it's passed
>>>>>  into config_build_attrs, where it eventually gets dereffed while
>>>>>  still NULL?
>>>>>
>>>>>
>>>>>  At 6:42 PM -0700 10/25/05, Forest Hill wrote:
>>>>>>  I'm going to stick some diagnostic code in there and see if I can
>>>>>>  figure out where it's getting munged.
>>>>>>
>>>>>>  At 6:34 PM -0700 10/25/05, Forest Hill wrote:
>>>>>>>  Will get you the full printout in a second when my machine
>>>>>>>  unhangs itself *sigh*
>>>>>>>
>>>>>>>  But it looks like i in frame 3 is way messed up. It's got a value
>>>>>>>  of 12678480. So, I'm guessing that something else is writing over
>>>>>>>  that value somewhere...
>>>>>>>
>>>>>>>  At 6:10 PM -0700 10/25/05, Howard Chu wrote:
>>>>>>>>  Forest Hill wrote:
>>>>>>>>>  Yeah, absolutely. Any vars in particular that you'd like?
>>>>>>>>
>>>>>>>>  In frame 3, the call to attr_merge_normalize, print i, ct[i],
>>>>>>>>  and *c. That should be a good start. Then in frame 1 print a,
>>>>>>>>  *a, vals.
>>>>>>>>>
>>>>>>>>>  At 5:59 PM -0700 10/25/05, Howard Chu wrote:
>>>>>>>>>>  Well, I see where, but I don't see why. Any chance you can run
>>>>>>>>>>  this under a debugger and print some of the local variables
>>>>>>>>>>  near the crash?
>>>>>>>>>>
>>>>>>>>>>  forest@apple.com wrote:
>>>>>>>>>>> Whoops. Yeah. Here's one from a build with symbols in it:
>>>>>>>>>>>
>>>>>>>>>>> Date/Time:      2005-10-25 17:13:01.109 -0700
>>>>>>>>>>> OS Version:     10.4.3 (Build 8F46)
>>>>>>>>>>> Report Version: 3
>>>>>>>>>>>
>>>>>>>>>>> Command: slapd
>>>>>>>>>>> Path:    ./slapd
>>>>>>>>>>> Parent:  sh [4799]
>>>>>>>>>>>
>>>>>>>>>>> Version: ??? (???)
>>>>>>>>>>>
>>>>>>>>>>> PID:    9690
>>>>>>>>>>> Thread: 0
>>>>>>>>>>>
>>>>>>>>>>> Exception:  EXC_BAD_ACCESS (0x0001)
>>>>>>>>>>> Codes:      KERN_PROTECTION_FAILURE (0x0002) at 0x00000004
>>>>>>>>>>>
>>>>>>>>>>> Thread 0 Crashed:
>>>>>>>>>>> 0   slapd    0x0002bc70 value_add + 276 (value.c:78)
>>>>>>>>>>> 1   slapd    0x0001bdcc attr_merge + 192 (attr.c:216)
>>>>>>>>>>> 2   slapd    0x0001bf28 attr_merge_normalize + 276 (attr.c:261)
>>>>>>>>>>> 3   slapd    0x0000abb4 config_build_attrs + 184 (bconfig.c:3871)
>>>>>>>>>>> 4   slapd    0x0000ade0 config_build_entry + 484 (bconfig.c:3940)
>>>>>>>>>>> 5   slapd    0x0000b1e8 config_back_db_open + 272 (bconfig.c:4086)
>>>>>>>>>>> 6   slapd    0x0001e504 backend_startup_one + 276 (backend.c:213)
>>>>>>>>>>> 7   slapd    0x0001e8cc backend_startup + 828 (backend.c:303)
>>>>>>>>>>> 8   slapd    0x00003ea0 main + 3184 (main.c:727)
>>>>>>>>>>> 9   slapd    0x00002944 _start + 348 (crt.c:272)
>>>>>>>>>>> 10  slapd    0x000027e4 start + 60
>
>
>--
>   -- Howard Chu
>   Chief Architect, Symas Corp.  http://www.symas.com
>   Director, Highland Sun        http://highlandsun.com/hyc
>   OpenLDAP Core Team            http://www.openldap.org/project/


-- 

Forest Hill
Apple Computer, Inc.
408-426-4141
forest@apple.com