Issue 7705 - mdb back-end segfaults sporadically with paged searches
Status: VERIFIED FIXED
Alias: None
Product: OpenLDAP
Classification: Unclassified
Component: slapd
Version: 2.4.36
Hardware: All All
Importance: --- normal
Target Milestone: ---
Assignee: OpenLDAP project
Reported: 2013-09-23 22:09 UTC by openldap@semyon.org
Modified: 2014-10-23 07:30 UTC

Description openldap@semyon.org 2013-09-23 22:09:08 UTC
Full_Name: Semyon Chaichenets
Version: 2.4.36
OS: Linux 3.2.0-51-generic #77-Ubuntu SMP Wed Jul 24 20:18:19 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
URL: ftp://ftp.openldap.org/incoming/
Submission from: (NULL) (129.128.11.113)


We have an issue whereby a paged-search operation causes slapd to segfault. At
this point, I can only reproduce the issue on our production data, for which I
cannot share core dumps. I am working on isolating the issue but would
appreciate any help from the OpenLDAP developers.

The issue is triggered by the following search:

ldapsearch -H ldap://ldaphost -b 'ou=people,dc=our,dc=org' -x -E pr=126/noprompt '(uid=m*)'

We have ~59k entries matching uid=m*. The issue is unpredictable in the sense
that it may or may not occur for filters matching other sets of entries, but it
reproduces reliably for any given filter/page-size combination. On our setup, I
could reproduce it for page sizes 81, 84, 87, 96, 112, 114, 116, and 126.
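
A minimal sketch of how the page-size sweep described above could be scripted,
assuming the same host, base DN and filter as in the command shown. The
function name try_page_sizes and the LDAPSEARCH override variable are
hypothetical helpers, not part of OpenLDAP; only ldapsearch options already
used in this report appear. A non-zero exit only indicates that the search did
not complete cleanly; whether slapd actually crashed still has to be confirmed
from the server or kernel log.

```shell
#!/bin/sh
# Hypothetical helper: run the paged search from the report once per page
# size and print whether ldapsearch exited cleanly.
# The defaults mirror the report; override them via the environment.
LDAP_URI=${LDAP_URI:-ldap://ldaphost}
BASE_DN=${BASE_DN:-ou=people,dc=our,dc=org}
FILTER=${FILTER:-'(uid=m*)'}
LDAPSEARCH=${LDAPSEARCH:-ldapsearch}

try_page_sizes() {
    for size in "$@"; do
        status=0
        # Same options as the reported command; only the page size varies.
        "$LDAPSEARCH" -H "$LDAP_URI" -b "$BASE_DN" -x \
            -E "pr=$size/noprompt" "$FILTER" >/dev/null 2>&1 || status=$?
        if [ "$status" -eq 0 ]; then
            echo "page size $size: ok"
        else
            echo "page size $size: FAILED (exit $status)"
        fi
    done
}
```

For example, `try_page_sizes 81 84 87 96 112 114 116 126` would walk the page
sizes listed above, one search per size.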

The issue is specific to back-mdb; the backtrace points to dn2id.c:

#0  0x00007f6d6a5ea6a1 in mdb_idscopes (op=0x7f695c110df0, isc=0x7f69692ca670)
    at ../../../../../servers/slapd/back-mdb/dn2id.c:738
#1  0x00007f6d6a5e399b in mdb_search (op=0x7f695c110df0, rs=0x7f69692dba00)
    at ../../../../../servers/slapd/back-mdb/search.c:747

[..]

I would greatly appreciate any help you could give me with this problem.


Best,
Semyon
Comment 1 Quanah Gibson-Mount 2013-09-23 22:32:08 UTC
--On Monday, September 23, 2013 10:09 PM +0000 openldap@semyon.org wrote:

> Full_Name: Semyon Chaichenets
> Version: 2.4.36
> OS: Linux 3.2.0-51-generic #77-Ubuntu SMP Wed Jul 24 20:18:19 UTC 2013
> x86_64 x86_64 x86_64 GNU/Linux URL: ftp://ftp.openldap.org/incoming/
> Submission from: (NULL) (129.128.11.113)
>
>
> We have an issue whereby a paged-search operation causes slapd to
> segfault. At this point, I can only reproduce the issue on our production
> data - for which I cannot share coredumps. I am working on isolating the
> issue but would appreciate any help from openLDAP developers.

Was this database created with OpenLDAP 2.4.36 back-mdb, or did it 
originate with a previous release of OpenLDAP?

--Quanah


--

Quanah Gibson-Mount
Lead Engineer
Zimbra Software, LLC
--------------------
Zimbra ::  the leader in open source messaging and collaboration

Comment 2 Howard Chu 2013-09-24 04:16:44 UTC
openldap@semyon.org wrote:
> Full_Name: Semyon Chaichenets
> Version: 2.4.36
> OS: Linux 3.2.0-51-generic #77-Ubuntu SMP Wed Jul 24 20:18:19 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
> URL: ftp://ftp.openldap.org/incoming/
> Submission from: (NULL) (129.128.11.113)
>
>
> We have an issue whereby a paged-search operation causes slapd to segfault. At
> this point, I can only reproduce the issue on our production data - for which I
> cannot share coredumps. I am working on isolating the issue but would appreciate
> any help from openLDAP developers.
>
> The issue is triggered by the following search:
>
> ldapsearch -H ldap://ldaphost -b 'ou=people,dc=our,dc=org' -x -E pr=126/noprompt
> '(uid=m*)'
>
> We have ~59k entries starting matching uid=m*; the issue occurs unpredictably in
> the sense that it may or may not happen for filters matching other sets, but it
> can be reproduced reliably for any given filter/page size combination. On our
> setup, I could reproduce it for page-sizes 81,84,87,96,112,114,116,126.
>
> The issue is specific to back-mdb; backtrace points to dn2id.c:
>
> #0  0x00007f6d6a5ea6a1 in mdb_idscopes (op=0x7f695c110df0, isc=0x7f69692ca670)
>      at ../../../../../servers/slapd/back-mdb/dn2id.c:738
> #1  0x00007f6d6a5e399b in mdb_search (op=0x7f695c110df0, rs=0x7f69692dba00)
>      at ../../../../../servers/slapd/back-mdb/search.c:747
>
> [..]
>
> I would greatly appreciate any help you could give me with this problem.

If you back up the DB with slapcat and reload it on another server with
slapadd, can you still reproduce the fault on the copy?


-- 
   -- Howard Chu
   CTO, Symas Corp.           http://www.symas.com
   Director, Highland Sun     http://highlandsun.com/hyc/
   Chief Architect, OpenLDAP  http://www.openldap.org/project/

Comment 3 openldap@semyon.org 2013-09-24 15:51:43 UTC
I have recloned the database overnight to make sure - the problem persists.

Semyon

On Mon, Sep 23, 2013 at 4:32 PM, Quanah Gibson-Mount <quanah@zimbra.com> wrote:

> [..]
> Was this database created with OpenLDAP 2.4.36 back-mdb, or did it
> originate with a previous release of OpenLDAP?
>
Comment 4 openldap@semyon.org 2013-09-24 17:57:54 UTC
On 13-09-23 10:16 PM, Howard Chu wrote:

> If you backup the DB with slapcat and reload it on another server with
> slapadd, can you still reproduce the fault on the copy?

Unfortunately, yes. It breaks trying to retrieve the same record as
before, on the same instruction in back_mdb:

yesterday:
[126393.233615] slapd[19244]: segfault at 7f499f38d000 ip 00007f4da0e9b6a1 sp 00007f499f1fa540 error 6 in back_mdb-2.4.so.2.9.2[7f4da0e80000+34000]

today:
[517860.953779] slapd[24871]: segfault at 7fdeb50a0000 ip 00007fe2b63ad6a1 sp 00007fdeb4f0d540 error 6 in back_mdb-2.4.so.2.9.2[7fe2b6392000+34000]
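
The two kernel log lines can be compared by subtracting the module load base
(the first number in brackets) from the instruction pointer, which gives the
offset of the faulting instruction inside back_mdb. The sketch below does that
arithmetic in plain shell; module_offset is a hypothetical helper name, and the
hex values are copied from the log lines above.

```shell
#!/bin/sh
# Hypothetical helper: offset of the faulting instruction inside the shared
# object, i.e. instruction pointer minus the module's load base.
# Both arguments are hex strings without the 0x prefix, as printed by the kernel.
module_offset() {
    printf '0x%x\n' $(( 0x$1 - 0x$2 ))
}

module_offset 7f4da0e9b6a1 7f4da0e80000   # "yesterday" line -> 0x1b6a1
module_offset 7fe2b63ad6a1 7fe2b6392000   # "today" line     -> 0x1b6a1
```

Both lines yield the same offset, which is consistent with the fault occurring
on the same instruction in both runs. The x86 page-fault error code 6 (binary
110) denotes a user-mode write to a page that is not mapped. If the matching
back_mdb shared object with debug symbols is available, an offset like this can
be passed to addr2line -f -e <module> to look up the source line; the module
path depends on the installation.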

Semyon

Comment 5 Howard Chu 2013-09-30 23:16:09 UTC
changed notes
changed state Open to Suspended
Comment 6 Quanah Gibson-Mount 2013-10-18 20:52:44 UTC
--On Tuesday, September 24, 2013 5:58 PM +0000 openldap@semyon.org wrote:

> On 13-09-23 10:16 PM, Howard Chu wrote:
>
>> If you backup the DB with slapcat and reload it on another server with
>> slapadd, can you still reproduce the fault on the copy?
>
> Unfortunately, yes. It breaks trying to retrieve the same record as
> before, on the same instruction in back_mdb:
>
> yesterday:
> [126393.233615] slapd[19244]: segfault at 7f499f38d000 ip
> 00007f4da0e9b6a1 sp 00007f499f1fa540 error 6 in
> back_mdb-2.4.so.2.9.2[7f4da0e80000+34000]
>
> today:
> [517860.953779] slapd[24871]: segfault at 7fdeb50a0000 ip
> 00007fe2b63ad6a1 sp 00007fdeb4f0d540 error 6 in
> back_mdb-2.4.so.2.9.2[7fe2b6392000+34000]

Please provide a useful bug report.

I.e., get a stack trace from gdb, as noted in:

<http://www.openldap.org/faq/data/cache/59.html>

Sample data/testcase/configs always welcome as well.

--Quanah



--

Quanah Gibson-Mount
Architect - Server
Zimbra, Inc.
--------------------
Zimbra ::  the leader in open source messaging and collaboration

Comment 7 Quanah Gibson-Mount 2013-10-28 15:43:15 UTC
changed notes
changed state Suspended to Closed
Comment 8 Quanah Gibson-Mount 2013-10-28 15:44:34 UTC
changed notes
Comment 9 openldap@semyon.org 2013-10-28 22:41:54 UTC
The problem seems to be resolved in OPENLDAP_REL_ENG_2_4_37. Please
close this ticket.

Best
Semyon

Comment 10 Howard Chu 2014-05-15 21:07:47 UTC
changed notes
Comment 11 Howard Chu 2014-05-15 21:07:48 UTC
changed state Closed to Test
moved from Incoming to Software Bugs
Comment 12 Howard Chu 2014-05-16 04:08:52 UTC
openldap@semyon.org wrote:
> On 13-09-23 10:16 PM, Howard Chu wrote:
>
>> If you backup the DB with slapcat and reload it on another server with
>> slapadd, can you still reproduce the fault on the copy?
>
> Unfortunately, yes. It breaks trying to retrieve the same record as
> before, on the same instruction in back_mdb:
>
> yesterday:
> [126393.233615] slapd[19244]: segfault at 7f499f38d000 ip
> 00007f4da0e9b6a1 sp 00007f499f1fa540 error 6 in
> back_mdb-2.4.so.2.9.2[7f4da0e80000+34000]
>
> today:
> [517860.953779] slapd[24871]: segfault at 7fdeb50a0000 ip
> 00007fe2b63ad6a1 sp 00007fdeb4f0d540 error 6 in
> back_mdb-2.4.so.2.9.2[7fe2b6392000+34000]
>
> Semyon

A fix is now in git master.

Comment 13 Quanah Gibson-Mount 2014-05-29 10:53:10 UTC
changed notes
changed state Test to Release
Comment 14 OpenLDAP project 2014-10-23 07:30:55 UTC
fixed in master
fixed in RE25
fixed in RE24
Comment 15 Quanah Gibson-Mount 2014-10-23 07:30:55 UTC
changed notes
changed state Release to Closed