Issue 6856 - OpenLDAP 2.4.23 / db 4.7.25 lockup 100% CPU
Summary: OpenLDAP 2.4.23 / db 4.7.25 lockup 100% CPU
Status: VERIFIED FIXED
Alias: None
Product: OpenLDAP
Classification: Unclassified
Component: slapd
Version: 2.4.23
Hardware: All All
Importance: --- normal
Target Milestone: ---
Assignee: OpenLDAP project
Reported: 2011-03-07 14:42 UTC by requate@univention.de
Modified: 2014-08-01 21:04 UTC
Description requate@univention.de 2011-03-07 14:42:33 UTC
Full_Name: Arvid Requate
Version: 2.4.23
OS: Debian Lenny
URL: http://apt.univention.de/download/temp/openldap/trace_openldap_2.4.23_db_4.7.25.tar.gz
Submission from: (NULL) (82.198.197.8)


With OpenLDAP 2.4.23 and bdb 4.7.25 we seem to hit something like a race
condition that can be triggered by concurrent ldapdelete and search_s
operations.
Though a bit similar, this condition does not quite match the details of
ITS#5707. The URL provides a tar archive containing three gdb traces and the
corresponding slapd log output (loglevel: trace args stats) for three cases of
lockup. In each case slapd hangs, consuming 100% of CPU, after a couple of
modifications made with the shell script contained in the tar archive, and
remains unresponsive until restarted. The number of successful operations
varies between the test runs.
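
For readers who want to collect comparable traces from a spinning slapd, the following is a minimal sketch in Python. It is not the procedure used for the archive above; the helper names are hypothetical, and it assumes gdb is installed and allowed to attach to the process. It relies only on the standard gdb options -p, -batch and -ex.

```python
import subprocess
from typing import List


def gdb_backtrace_command(pid: int) -> List[str]:
    """Build a gdb command line that dumps a backtrace of every thread.

    -p attaches to the running process, -batch makes gdb exit once the
    commands have run, and -ex supplies the command to execute.
    """
    return ["gdb", "-p", str(pid), "-batch", "-ex", "thread apply all bt"]


def capture_backtrace(pid: int, timeout: float = 60.0) -> str:
    """Run gdb against `pid` and return whatever it printed on stdout.

    Hypothetical helper: requires gdb on PATH and permission to ptrace
    the target process (typically root, or the same user as slapd).
    """
    result = subprocess.run(
        gdb_backtrace_command(pid),
        capture_output=True,
        text=True,
        timeout=timeout,
        check=False,  # gdb may exit non-zero yet still print useful frames
    )
    return result.stdout
```

Taking two or three captures a few seconds apart helps distinguish a thread that is spinning in one place from one that is merely busy.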

Berkeley DB 4.7.25 (May 15, 2008) was built with Oracle patches for Bugs #16415
and #16541 and configure options "--enable-posixmutexes
--with-mutex=POSIX/pthreads".

The test machine is a single processor/single core 686 VM running Linux 2.6.32
686 bigmem. The concurrent searches are performed by a separate process that
gets informed about ldap modifications (via file) by an slapd overlay module
called 'translog'. To me the traces do not seem to indicate a problem in the
overlay code (i.e. there is no reference to the on_response function
"translog_response" in the traces).
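
The access pattern described above (deletes racing against searches for the same entries) can be approximated with a small client-side harness. This is a sketch under stated assumptions, not the shell script from the archive: the function name and parameters are hypothetical, the two operations are passed in as callables so the harness itself needs no LDAP library, and it uses two threads in one process where the original setup used separate processes.

```python
import threading
from typing import Callable, Iterable, List


def run_concurrently(
    dns: Iterable[str],
    delete_entry: Callable[[str], None],
    search_entry: Callable[[str], None],
    timeout: float = 30.0,
) -> List[str]:
    """Delete and search the same DNs from two threads at once.

    Returns the names of the workers that are still running after
    waiting up to `timeout` seconds for each of them; a hung server
    would typically show up as a non-empty list.
    """
    dn_list = list(dns)
    start = threading.Barrier(2)  # release both workers together

    def worker(op: Callable[[str], None]) -> None:
        start.wait()
        for dn in dn_list:
            try:
                op(dn)
            except Exception:
                # A search that loses the race may fail (entry already
                # deleted); that is expected and not the bug in question.
                pass

    threads = [
        threading.Thread(target=worker, args=(delete_entry,), name="delete", daemon=True),
        threading.Thread(target=worker, args=(search_entry,), name="search", daemon=True),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join(timeout)
    return [t.name for t in threads if t.is_alive()]
```

With python-ldap, for example, delete_entry could wrap conn.delete_s(dn) and search_entry could wrap conn.search_s(dn, ldap.SCOPE_BASE), each on its own connection. Whether this triggers the lockup depends on timing, which matches the observation that the number of successful operations varies between runs.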

Maybe there is some obvious point here we are missing? More debug details can be
provided if necessary.
Comment 1 Howard Chu 2011-03-07 22:17:04 UTC

Try again using 2.4.24. There was a bug with back-bdb delete fixed recently 
(ITS#6577) so the relevant code has changed since .23.

Also try a newer BerkeleyDB. We've had other deadlocks with 4.7 that no longer 
occur in 4.8.

-- 
   -- Howard Chu
   CTO, Symas Corp.           http://www.symas.com
   Director, Highland Sun     http://highlandsun.com/hyc/
   Chief Architect, OpenLDAP  http://www.openldap.org/project/

Comment 2 Howard Chu 2011-03-08 04:32:48 UTC
changed state Open to Feedback
Comment 3 requate@univention.de 2013-07-29 17:57:45 UTC
Looks like the patch for ITS#7222 fixed this.

-- 
Arvid Requate
Open Source Software Engineer

Univention GmbH
be open.
Mary-Somerville-Str.1
28359 Bremen
Tel. : +49 421 22232-52
Fax : +49 421 22232-99

requate@univention.de
http://www.univention.de

Geschäftsführer: Peter H. Ganten
HRB 20755 Amtsgericht Bremen
Steuer-Nr.: 71-597-02876

Comment 4 Howard Chu 2013-07-29 18:13:26 UTC
requate@univention.de wrote:
> Looks like the patch for ITS#7222 fixed this.
>
Thanks for the followup. Closing this ITS.


Comment 5 Howard Chu 2013-07-29 18:13:53 UTC
changed notes
changed state Feedback to Closed
Comment 6 OpenLDAP project 2014-08-01 21:04:00 UTC
fixed by #7222