
Re: Segfault in slapd 2.4.8 - syncrepl and glue (ITS#5430)



<quote who="Duncan.Gibb@SiriusIT.co.uk">
> Thanks, Gavin.  My GDB foo is not up to getting useful info out of your
> core dump, but I have placed one from one of our test rigs at
>
> http://pastebin.siriusit.co.uk/slapd-2.4.8-nullpointer-bin-and-core.tar.gz
>
> Note that this expects to be looked at with 64-bit GDB 6.7.1 from Debian
> Lenny/Sid (GDB 6.4.90 from Etch doesn't work).

OK, thanks. I take it 2.4.8 is a self-compile?

>
>
> Re your previous comments:
>
> The engineer who put together the test case did look for bug reporting
> guidelines, but their location in the FAQ is, IMHO, non-obvious; perhaps
> one day I'll fix that.  I reviewed and edited the report before sending
> it and I think it meets any reasonable person's definition of at least
> attempting to be helpful whilst not unduly verbose.

True, but at least you included a version number, platform and backtrace.
Many reports don't ;-) Those details are what matter.

I wrote up a useful page in the Admin guide and would appreciate any
feedback you have as it is obviously not easily found by our users:

http://www.openldap.org/doc/admin24/troubleshooting.html

>
> Posting to the wrong list was my fault and I apologise.  The moderator
> was correct to reject the original message.

;-)

>
> We know all kinds of weird stuff happens if you restart the crashed
> server(s) - that's how we come to be looking at this in the first place.

Of course.

>  We have, however, tried very hard to come up with a simple, confined
> case of this particular null pointer dereference, which is the first
> thing that goes wrong if you start from a clean sheet with empty
> databases each time and follow the recipe.
>
> In the four environments we can easily construct (RHEL vs Debian, i386
> vs amd64) the segfault of server1 happens immediately you submit any
> change via LDAP to server2.
>
> Can anyone else reproduce this, or do we need to work on more examples?

I'm going to try to reproduce it reliably shortly.

Lastly, without knowing your project requirements, I would say that this
deployment could be greatly simplified by having one Provider and two
Consumers. The second server (server2 in your example) could use the chain
overlay to push changes back. Again, without knowing the big picture, it's
hard to comment.

That discussion is best carried out on openldap-software or
openldap-technical, however, so let's concentrate on the bug.

Thanks,

Gavin.