OpenLDAP

Viewing Incoming/7655

From: hans.freitag@entiretec.com
Subject: segfault during initial mirror of multimaster delta replication

Date: Sun, 04 Aug 2013 16:27:19 +0000
From: hans.freitag@entiretec.com
To: openldap-its@OpenLDAP.org
Subject: segfault during initial mirror of multimaster delta replication
Full_Name: Hans Freitag
Version: 2.4.35 and 33
OS: SLES 11SP2
URL: ftp://ftp.openldap.org/incoming/
Submission from: (NULL) (193.200.138.3)


I have a multimaster delta replication setup here with bdb on an 18 GB
database.

After a crash caused by a full disk, I created a new, empty database on one
node and started over.

The empty node started to replicate from the full one, but after a while
(approx. 2 GB) it crashed with a segfault:

Aug  4 11:45:32 mhr-dd-lda-01 kernel: [52189.476209] slapd[10158]: segfault at
20 ip 00007ff97ebfabc0 sp 00007ff6e57e6b38 error 4 in
libc-2.11.1.so[7ff97eb79000+155000] 

So I thought that maybe it is not a good idea to install a package built for
SP2 on a machine running SP1, so my first attempt at a fix was an upgrade.
After the upgrade I got this:

Aug  4 12:46:29 mhr-dd-lda-01 kernel: [ 1414.757587] slapd[3704]: segfault at 20
ip 00007fc82eee6182 sp 00007fc592e0acf0 error 4 in slapd[7fc82ee7a000+1e6000]

So I built a brand-new OpenLDAP 2.4.35 RPM to find out whether the problem is
related to the 2.4.33 version I am running. That failed as well:

Aug  4 13:47:19 mhr-dd-lda-01 kernel: [ 5063.074410] slapd[8749]: segfault at 20
ip 00007fcbc1b537dc sp 00007fc92624fb88 error 4 in slapd[7fcbc1ac8000+1ea000]

For now I have deactivated access logging on the node, which seems to work. I
will know for sure in a few hours. ;-) I can try to reproduce the problem on a
backup node next week, when all the main nodes are up and running again. :)
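
Kernel lines like the three above carry a little more information than they
seem to. The following is a minimal Python sketch, not part of the original
report, that extracts the fields and computes the offset of the faulting
instruction inside the mapped object (ip minus the mapping base). The function
name decode_segfault is hypothetical, and the error-code bit meanings assume
the x86 page-fault convention (bit 0: page present, bit 1: write, bit 2: user
mode). The offset is only useful together with a binary that has debugging
symbols.

```python
import re

# Matches the x86 Linux "segfault at ..." kernel message format seen in the report.
SEGFAULT_RE = re.compile(
    r"(?P<proc>\S+)\[(?P<pid>\d+)\]:\s+segfault at (?P<addr>[0-9a-f]+)"
    r" ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+)"
    r" in (?P<obj>\S+?)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def decode_segfault(line):
    """Return the decoded fields of one kernel segfault line, or None."""
    # Mail clients wrap long log lines, so collapse all whitespace first.
    text = " ".join(line.split())
    m = SEGFAULT_RE.search(text)
    if m is None:
        return None
    ip = int(m["ip"], 16)
    base = int(m["base"], 16)
    err = int(m["err"], 16)
    return {
        "process": m["proc"],
        "pid": int(m["pid"]),
        "fault_address": int(m["addr"], 16),
        "object": m["obj"],
        # Offset of the faulting instruction within the mapped object.
        "offset": ip - base,
        # x86 page-fault error code bits (assumed convention, see lead-in).
        "page_present": bool(err & 1),
        "write": bool(err & 2),
        "user_mode": bool(err & 4),
    }


# Second log line from the report, wrapped the way it arrived by mail.
SAMPLE = """Aug  4 12:46:29 mhr-dd-lda-01 kernel: [ 1414.757587] slapd[3704]: segfault at 20
ip 00007fc82eee6182 sp 00007fc592e0acf0 error 4 in slapd[7fc82ee7a000+1e6000]"""

info = decode_segfault(SAMPLE)
print(info["object"], hex(info["offset"]), hex(info["fault_address"]))
```

For the sample line this gives offset 0x6c182 inside slapd. Under the stated
assumption, error 4 is a user-mode read of an unmapped page, and a fault
address of 0x20 would be consistent with reading a structure member through a
NULL pointer, though only a backtrace can confirm that.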

Followup 1

Date: Sun, 04 Aug 2013 20:11:05 -0700
From: Quanah Gibson-Mount <quanah@zimbra.com>
To: hans.freitag@entiretec.com, openldap-its@openldap.org
Subject: Re: (ITS#7655) segfault during initial mirror of multimaster delta
 replication

I would suggest you build with debugging symbols, enable core files, and 
provide a backtrace of the problem.  What you have provided does not give 
any useful information for debugging purposes.  You also fail to state the 
backend you are using (back-bdb or back-hdb).
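
As an illustration of the suggestion above, here is a minimal Python sketch
that raises the core-file limit and asks gdb for a backtrace of every thread
from a core file. It is not the procedure from the FAQ entry linked below. The
helper names are hypothetical, and the binary and core paths in the usage
comment are placeholders. It assumes gdb is installed and slapd was built with
debugging symbols. The limit must be raised in the environment that actually
starts slapd; setting it in an unrelated shell has no effect on a running
daemon.

```python
import resource
import subprocess


def allow_core_dumps():
    """Raise the soft core-size limit to the hard limit for this process
    and for any child it starts afterwards."""
    _soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
    resource.setrlimit(resource.RLIMIT_CORE, (hard, hard))


def gdb_backtrace_command(binary, core):
    """Build a non-interactive gdb command line that prints a full
    backtrace (including local variables) for all threads."""
    return ["gdb", "-batch", "-ex", "thread apply all bt full", binary, core]


def collect_backtrace(binary, core):
    """Run gdb and return whatever it printed on standard output."""
    result = subprocess.run(
        gdb_backtrace_command(binary, core),
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout


# Example (placeholder paths, adjust to the local installation):
# print(collect_backtrace("/path/to/slapd", "/path/to/core"))
```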

For information on how to provide a backtrace:

<http://www.openldap.org/faq/data/cache/59.html>

Regards,
Quanah

--

Quanah Gibson-Mount
Lead Engineer
Zimbra, Inc
--------------------
Zimbra ::  the leader in open source messaging and collaboration



Followup 2

From: Hans Freitag <hans.freitag@entiretec.com>
To: "quanah@zimbra.com" <quanah@zimbra.com>,
        "openldap-its@openldap.org"
	<openldap-its@openldap.org>
Subject: AW: (ITS#7655) segfault during initial mirror of multimaster delta
 replication
Date: Wed, 18 Sep 2013 13:13:32 +0000
Hi, 

Unfortunately I was not able to reproduce the exact problem with the segfault,
but after a few updates we still have the problem that, with replication
enabled, slapd freezes during a write operation.

SETUP DESCRIPTION: 

OpenLDAP version 2.4.36
back-mdb (we have had these issues for quite a while, even when we were running on bdb)
 

All write and read requests are directed to the active node, so the passive
node is only replicating.

So, unless I have misunderstood something, I have two threads: the main thread
and the one that is doing the replication.



Netstat of the TCP replication connections; the second one is initiated by the
passive system polling from the active one:

tcp        0     53 10.169.127.13:389       10.169.126.13:43340     ESTABLISHED
tcp   1905336      0 10.169.127.13:52384     10.169.126.13:389       ESTABLISHED
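
In netstat output the second column is Recv-Q, the number of bytes the kernel
has received on a connection that the application has not yet read. A minimal
Python sketch for picking out such connections follows. The function names
and the 64 KiB threshold are hypothetical choices for illustration, not
values from the report.

```python
def parse_netstat_tcp(lines):
    """Parse `netstat -tn` style lines into dictionaries.

    Expected columns: Proto Recv-Q Send-Q Local-Address Foreign-Address State
    """
    conns = []
    for line in lines:
        fields = line.split()
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue  # skip headers and anything that is not a TCP row
        conns.append({
            "recv_q": int(fields[1]),   # bytes received but not read by the app
            "send_q": int(fields[2]),   # bytes sent but not yet acknowledged
            "local": fields[3],
            "remote": fields[4],
            "state": fields[5],
        })
    return conns


def stalled_receivers(conns, threshold=65536):
    """Return connections whose unread receive backlog reaches the threshold."""
    return [c for c in conns if c["recv_q"] >= threshold]


NETSTAT = [
    "tcp        0     53 10.169.127.13:389       10.169.126.13:43340     ESTABLISHED",
    "tcp   1905336      0 10.169.127.13:52384     10.169.126.13:389       ESTABLISHED",
]

for conn in stalled_receivers(parse_netstat_tcp(NETSTAT)):
    print(conn["local"], "->", conn["remote"], "unread bytes:", conn["recv_q"])
```

Applied to the two lines above, only the outgoing connection to port 389 is
flagged, with about 1.9 MB waiting unread. That would fit a consumer that has
stopped reading from its replication connection, but it does not say why.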


top -H of the LDAP Processes: 

 7767 ldap      20   0 84.4g 7.1g 6.9g S      1 10.1   1:02.13 slapd
 7768 ldap      20   0 84.4g 7.1g 6.9g S      0 10.1   7:54.44 slapd
 8023 ldap      20   0 84.4g 7.1g 6.9g S      0 10.1   0:32.31 slapd
 7766 ldap      20   0 84.4g 7.1g 6.9g S      0 10.1   0:00.00 slapd
 7769 ldap      20   0 84.4g 7.1g 6.9g S      0 10.1   0:32.81 slapd
 7770 ldap      20   0 84.4g 7.1g 6.9g S      0 10.1   7:44.94 slapd
 8024 ldap      20   0 84.4g 7.1g 6.9g t      0 10.1   0:32.53 slapd
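
The per-thread listing can be summarized mechanically. The sketch below is a
hypothetical helper that assumes the default top column order (PID USER PR NI
VIRT RES SHR S %CPU %MEM TIME+ COMMAND) and a TIME+ value in
minutes:seconds.hundredths form. It extracts thread ID, state, and accumulated
CPU time.

```python
def parse_top_threads(lines):
    """Parse rows of `top -H` output into dictionaries, one per thread."""
    threads = []
    for line in lines:
        f = line.split()
        if len(f) < 12 or not f[0].isdigit():
            continue  # skip headers and blank lines
        minutes, seconds = f[10].split(":")  # assumes the m:ss.hh form
        threads.append({
            "tid": int(f[0]),
            "state": f[7],
            "cpu_percent": float(f[8]),
            "cpu_seconds": int(minutes) * 60 + float(seconds),
            "command": f[11],
        })
    return threads


TOP = [
    " 7767 ldap      20   0 84.4g 7.1g 6.9g S      1 10.1   1:02.13 slapd",
    " 7768 ldap      20   0 84.4g 7.1g 6.9g S      0 10.1   7:54.44 slapd",
    " 8023 ldap      20   0 84.4g 7.1g 6.9g S      0 10.1   0:32.31 slapd",
    " 7766 ldap      20   0 84.4g 7.1g 6.9g S      0 10.1   0:00.00 slapd",
    " 7769 ldap      20   0 84.4g 7.1g 6.9g S      0 10.1   0:32.81 slapd",
    " 7770 ldap      20   0 84.4g 7.1g 6.9g S      0 10.1   7:44.94 slapd",
    " 8024 ldap      20   0 84.4g 7.1g 6.9g t      0 10.1   0:32.53 slapd",
]

threads = parse_top_threads(TOP)
not_sleeping = [t["tid"] for t in threads if t["state"] != "S"]
busiest = max(threads, key=lambda t: t["cpu_seconds"])
print("threads:", len(threads), "not sleeping:", not_sleeping,
      "most CPU time:", busiest["tid"])
```

Every thread in the listing sits at 0 or 1 percent CPU, which looks more like
threads waiting on something than a busy loop. State t on thread 8024 is what
top shows for a thread stopped under a debugger, so it may only reflect gdb
being attached when the sample was taken.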

PASTEBIN: 

I have posted all the backtraces to Pastebin:

http://pastebin.com/vVGEqEUt


I hope this helps to track down the problem.


Kind regards

i.A. Hans Freitag
. Linux Administrator

ENTIRETEC AG . Pforzheimer Strasse 33 . 01189 Dresden . Germany
T: +49.351.41355.0 . M:  . F: +49.351.41355.99
E: hans.freitag@entiretec.com

ENTIRETEC | http://www.entiretec.com
Germany | Switzerland | United Arab Emirates | Malaysia | United States of
America

ENTIRETEC AG
Executive Board: Thomas Herrmann (Chairman), Thomas Wetzel, Carsten Klemm .
Chairwoman of the Supervisory Board: Dr. Jutta Horezky
Registered office: Dresden . Amtsgericht Dresden (District Court) HRB 24915 .
VAT ID DE227705033





