Issue 4216 - slapd terminates unexpectedly
Summary: slapd terminates unexpectedly
Status: VERIFIED FIXED
Alias: None
Product: OpenLDAP
Classification: Unclassified
Component: slapd
Version: unspecified
Hardware: All All
Importance: --- normal
Target Milestone: ---
Assignee: OpenLDAP project
Reported: 2005-11-25 15:43 UTC by james.macnider@bridgewatersystems.com
Modified: 2014-08-01 21:05 UTC (History)

Description james.macnider@bridgewatersystems.com 2005-11-25 15:43:57 UTC
Full_Name: James MacNider
Version: 2.1.6 and 2.2.29
OS: Solaris 8
URL: 
Submission from: (NULL) (216.113.7.2)


I think this may be the same issue that was reported in ITS#3567, which has
since been closed.

I have seen the slapd server exit (gracefully, no core) under high load
when using the null backend that is included as part of the OpenLDAP
distribution.  I've been able to reproduce this almost at will with versions
2.1.6 and 2.2.29 of the OpenLDAP slapd process.  I'm using SLAMD (www.slamd.org)
to generate the load.

Tracing through the code, the problem again seems to be that the call to
select() returns ENOSYS, as in ITS#3567.  Because I'm using the null backend,
I can't see how this defect could be in the backend code, as suggested in
ITS#3567.  It is of course possible that this is a defect in Solaris and not
OpenLDAP, but that has yet to be proven one way or the other.
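
For readers unfamiliar with the failure mode described above, the following is a minimal Python sketch of how an event loop might separate a retryable select() error from one it treats as fatal. It is not slapd's code: the names classify_select_error and probe_select are hypothetical, and the retry-on-EINTR policy is an assumption made for illustration. The demo triggers EBADF with a closed descriptor only because that is easy to reproduce; it does not reproduce the ENOSYS case.

```python
import errno
import os
import select


def classify_select_error(err):
    """Return 'retry' for interruptions, 'fatal' for any other errno.

    Hypothetical policy for illustration; not taken from slapd.
    """
    if err in (errno.EINTR, errno.EAGAIN):
        return "retry"
    return "fatal"


def probe_select(fds, timeout=0.0):
    """Call select() once; return (ready_fds, None) or (None, errno)."""
    try:
        ready, _, _ = select.select(fds, [], [], timeout)
        return ready, None
    except OSError as exc:
        return None, exc.errno


if __name__ == "__main__":
    # Create a pipe, close both ends, then select on the stale descriptor.
    r, w = os.pipe()
    os.close(w)
    os.close(r)
    ready, err = probe_select([r])
    print(errno.errorcode.get(err), classify_select_error(err))
```

A loop built on such a policy would exit on any unexpected errno, which matches the "graceful exit, no core" symptom in general terms, but whether that is what happens here is not established.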

Comment 1 james.macnider@bridgewatersystems.com 2005-11-25 15:59:12 UTC
Note: www.slamd.org should read www.slamd.com 

-----Original Message-----
From: openldap-its@OpenLDAP.org [mailto:openldap-its@OpenLDAP.org] 
Sent: Friday, November 25, 2005 10:44 AM
To: James MacNider
Subject: Re: (ITS#4216) slapd terminates unexpectedly


*** THIS IS AN AUTOMATICALLY GENERATED REPLY ***

Thanks for your report to the OpenLDAP Issue Tracking System.  Your
report has been assigned the tracking number ITS#4216.

One of our support engineers will look at your report in due course.
Note that this may take some time because our support engineers are
volunteers.  They only work on OpenLDAP when they have spare time.

If you need to provide additional information in regards to your issue
report, you may do so by replying to this message.  Note that any mail
sent to openldap-its@openldap.org with (ITS#4216) in the subject will
automatically be attached to the issue report.

	mailto:openldap-its@openldap.org?subject=(ITS#4216)

You may follow the progress of this report by loading the following URL
in a web browser:
    http://www.OpenLDAP.org/its/index.cgi?findid=4216

Please remember to retain your issue tracking number (ITS#4216) on any
further messages you send to us regarding this report.  If you don't
then you'll just waste our time and yours because we won't be able to
properly track the report.

Please note that the Issue Tracking System is not intended to be used to
seek help in the proper use of OpenLDAP Software.
Such requests will be closed.

OpenLDAP Software is user supported.
	http://www.OpenLDAP.org/support/

--------------
Copyright 1998-2005 The OpenLDAP Foundation, All Rights Reserved.


Comment 2 james.macnider@bridgewatersystems.com 2005-11-25 16:22:30 UTC
It's also worth noting that the following OS settings were in place when
the exit issue occurred:

ndd -set /dev/tcp tcp_time_wait_interval 240000
ndd -set /dev/tcp tcp_conn_req_max_q 128
ndd -set /dev/tcp tcp_keepalive_interval 7200000
ndd -set /dev/tcp tcp_deferred_ack_interval 100
ndd -set /dev/tcp tcp_smallest_anon_port 32768
ulimit -n 256

However when the OS is tuned more aggressively (as below) I've been
unable to reproduce the problem:

ndd -set /dev/tcp tcp_time_wait_interval 30000
ndd -set /dev/tcp tcp_conn_req_max_q 1024
ndd -set /dev/tcp tcp_keepalive_interval 600000
ndd -set /dev/tcp tcp_deferred_ack_interval 5
ndd -set /dev/tcp tcp_smallest_anon_port 8192
ulimit -n 4096
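
The two profiles above differ in several settings at once, so this thread does not establish which one matters. As one hedged way to look at the descriptor limit alone, the sketch below reads the limit that "ulimit -n" controls from inside a process and checks whether a planned number of connections fits under it. The helper names and the reserved=16 allowance for non-socket descriptors are assumptions for illustration only.

```python
import resource


def nofile_limits():
    """Return the (soft, hard) open-descriptor limits for this process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def fits(connections, soft_limit, reserved=16):
    """True if `connections` sockets plus `reserved` other descriptors
    fit under soft_limit. `reserved` is a hypothetical allowance."""
    return connections + reserved <= soft_limit


if __name__ == "__main__":
    soft, hard = nofile_limits()
    print("soft limit:", soft, "hard limit:", hard)
    for limit in (256, 4096):
        print(limit, "fits 100 connections:", fits(100, limit))
```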

Comment 3 Howard Chu 2005-11-25 19:11:20 UTC
The bug reported in #3567 was clearly due to faulty (perl) application 
code. As such, the bug you're reporting here seems to have a different 
cause. Please try reproducing it in 2.3.12. I'll note that we at Symas 
have subjected 2.3 slapd to tremendous loads using slamd, and have not 
seen any such failure, but we were running under Linux.


-- 
  -- Howard Chu
  Chief Architect, Symas Corp.  http://www.symas.com
  Director, Highland Sun        http://highlandsun.com/hyc
  OpenLDAP Core Team            http://www.openldap.org/project/

Comment 4 james.macnider@bridgewatersystems.com 2005-11-28 16:32:06 UTC
We were able to reproduce the defect with version 2.3.12 on the first
try.  The server, as before, is configured to use the null backend.  The
queries were again driven by SLAMD using the SearchClientJobClass. Our
search filter simply contained: (objectclass=*).  My slapd.conf is
below:

-----------------------

ucdata-path /dev/src/ldap/openldap/null/openldap-2.3.12/libraries/liblunicode
 
backend null
database null
suffix   ""
rootdn   "cn=root"
bind     on
 
# Cleartext passwords, especially for the rootdn, should
# be avoid.  See slappasswd(8) and slapd.conf(5) for details.
# Use of strong authentication encouraged.
rootpw    secret

-----------------------



Comment 5 Howard Chu 2005-11-28 22:00:40 UTC
James MacNider wrote:
> We used 10 clients and 10 threads per client spread across 2 test driver
> machines (so a total of 50 threads on each machine).  The time it takes
> for the problem to occur varies...it has happened as quickly as 10
> seconds or as long as 20 minutes. 
>   
Just as an aside, in our experience using more than two threads per 
client doesn't actually produce any more useful server load.

By the way, is syslog active on slapd while running this test? Is slapd 
configured with reverse-lookups enabled? The point is to identify where 
any non-socket descriptors are coming from, if any. If you can run slapd 
under a debugger with a breakpoint at the point where the error is 
detected (daemon.c line 1862 would be a good choice), and then run lsof 
on the slapd process when the breakpoint is triggered, that may give 
some useful information. Also print slap_daemon, readfds, and writefds 
at that point.
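
As a rough illustration of the kind of information lsof would give here, the Python sketch below walks the descriptor numbers of its own process and labels each open one as a socket or something else. It is only an analogy: lsof inspects another process, and the helper names classify_fd and open_descriptors and the max_fd=256 default are hypothetical choices for this sketch.

```python
import os
import socket
import stat


def classify_fd(fd):
    """Return 'socket', 'other', or None if fd is not open."""
    try:
        mode = os.fstat(fd).st_mode
    except OSError:
        return None
    return "socket" if stat.S_ISSOCK(mode) else "other"


def open_descriptors(max_fd=256):
    """Map every open descriptor below max_fd to its kind."""
    kinds = {}
    for fd in range(max_fd):
        kind = classify_fd(fd)
        if kind is not None:
            kinds[fd] = kind
    return kinds


if __name__ == "__main__":
    # Open one socket so the listing contains both kinds of descriptor.
    s = socket.socket()
    for fd, kind in sorted(open_descriptors().items()):
        print(fd, kind)
    s.close()
```

Any descriptor reported as "other" beyond stdin, stdout, and stderr would be a candidate for the non-socket descriptors being asked about.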

-- 
  -- Howard Chu
  Chief Architect, Symas Corp.  http://www.symas.com
  Director, Highland Sun        http://highlandsun.com/hyc
  OpenLDAP Core Team            http://www.openldap.org/project/

Comment 6 Howard Chu 2005-11-29 03:44:44 UTC
James MacNider wrote:
> We used 10 clients and 10 threads per client spread across 2 test driver
> machines (so a total of 50 threads on each machine).  The time it takes
> for the problem to occur varies...it has happened as quickly as 10
> seconds or as long as 20 minutes. 
>   

We've been running the same test here and see no error.

-- 
  -- Howard Chu
  Chief Architect, Symas Corp.  http://www.symas.com
  Director, Highland Sun        http://highlandsun.com/hyc
  OpenLDAP Core Team            http://www.openldap.org/project/

Comment 7 james.macnider@bridgewatersystems.com 2005-11-29 13:33:29 UTC
The problem may well be specific to Solaris then (I recall you said that
your tests are running under Linux).  

My "configure" was as follows:

configure --enable-bdb=no --enable-null=yes --enable-hdb=no

Every other item was left as the default value.  I did capture the
configure/make/make depend/make test output to file.  If you're
interested in any of that just let me know and I'll send it on to you.

I did have one question for you.  You asked me if syslog was active on
slapd while running the tests.  I'm not quite sure what you meant by
that... syslogd is running on the box, but I don't think that's quite
what you were getting at.

I'll do my best to gather the other information you've requested. 


Comment 8 Quanah Gibson-Mount 2005-11-29 17:52:30 UTC

--On Tuesday, November 29, 2005 1:33 PM +0000 
james.macnider@bridgewatersystems.com wrote:

> The problem may well be specific to Solaris then (I recall you said that
> your tests are running under Linux).
>
> My "configure" was as follows:
>
> configure --enable-bdb=no --enable-null=yes --enable-hdb=no
>
> Every other item was left as the default value.  I did capture the
> configure/make/make depend/make test output to file.  If you're
> interested in any of that just let me know and I'll send it on to you.
>
> I did have one question for you.  You asked me if syslog was active on
> slapd while running the tests.  I'm not quite sure what you meant by
> that... syslogd is running on the box, but I don't think that's quite
> what you were getting at.
>
> I'll do my best to gather the other information you've requested.

Actually, we were running it on a Solaris 8 system, with 256 file 
descriptors, with only the back-null database loaded, and the ndd 
parameters you specified in your post.  I eventually ramped it up to where 
I had 100 clients connecting, and let that run for an hour, which completed 
without problem.

Regards,
Quanah


--
Quanah Gibson-Mount
Product Engineer
Symas Corporation
Packaged, certified, and supported LDAP solutions powered by OpenLDAP:
<http://www.symas.com>


Comment 9 Howard Chu 2005-11-30 21:35:47 UTC
James MacNider wrote:
> The problem may well be specific to Solaris then (I recall you said that
> your tests are running under Linux).  
>   

(As Quanah already said, we reran this test on a Solaris 8 system.)
> My "configure" was as follows:
>
> configure --enable-bdb=no --enable-null=yes --enable-hdb=no
>
> Every other item was left as the default value.  I did capture the
> configure/make/make depend/make test output to file.  If you're
> interested in any of that just let me know and I'll send it on to you.
>
> I did have one question for you.  You asked me if syslog was active on
> slapd while running the tests.  I'm not quite sure what you meant by
> that... syslogd is running on the box, but I don't think that's quite
> what you were getting at.
>   

The question is whether you have slapd set to log any messages to 
syslog, since that would introduce another file descriptor into the process.

-- 
  -- Howard Chu
  Chief Architect, Symas Corp.  http://www.symas.com
  Director, Highland Sun        http://highlandsun.com/hyc
  OpenLDAP Core Team            http://www.openldap.org/project/

Comment 10 Howard Chu 2005-12-01 00:08:25 UTC
changed state Open to Feedback
Comment 11 james.macnider@bridgewatersystems.com 2005-12-01 18:47:28 UTC
The only argument passed to slapd on the command line is the -f option
to specify the config file.  All other options were left at the default
values.  

Comment 12 Quanah Gibson-Mount 2005-12-02 18:22:41 UTC

--On Thursday, December 01, 2005 6:47 PM +0000 
james.macnider@bridgewatersystems.com wrote:

> The only argument passed to slapd on the command line is the -f option
> to specify the config file.  All other options were left at the default
> values.

I'm not sure how this relates to the question about whether or not you are
logging to syslog.

What is your loglevel setting in slapd.conf?  Did you explicitly disable 
syslog logging if you built the code from source?

--Quanah

--
Quanah Gibson-Mount
Product Engineer
Symas Corporation
Packaged, certified, and supported LDAP solutions powered by OpenLDAP:
<http://www.symas.com>


Comment 13 james.macnider@bridgewatersystems.com 2005-12-02 18:30:02 UTC
Hello Quanah,

As I mentioned earlier to Howard Chu, my slapd.conf simply contains the
following:

-------------
ucdata-path /dev/src/ldap/openldap/null/openldap-2.3.12/libraries/liblunicode
 
backend null
database null
suffix   ""
rootdn   "cn=root"
bind     on
 
# Cleartext passwords, especially for the rootdn, should
# be avoid.  See slappasswd(8) and slapd.conf(5) for details.
# Use of strong authentication encouraged.
rootpw    secret
-------------

So I never set a syslog level in slapd.conf.  What is that value by
default?

Also, the only options I set during the whole configure/make/make test
sequence were these configure flags:
--enable-bdb=no --enable-null=yes --enable-hdb=no
So syslog should not have been disabled at build time either.

Thanks for continuing to follow up with this issue.
 

Comment 14 Quanah Gibson-Mount 2005-12-02 18:37:27 UTC

--On Friday, December 02, 2005 1:30 PM -0500 James MacNider 
<james.macnider@bridgewatersystems.com> wrote:

> Hello Quanah,
>
> As I mentioned earlier to Howard Chu, my slapd.conf simply contains the
> following:
>
> -------------
> ucdata-path /dev/src/ldap/openldap/null/openldap-2.3.12/libraries/liblunicode
>
> backend null
> database null
> suffix   ""
> rootdn   "cn=root"
> bind     on
>
> # Cleartext passwords, especially for the rootdn, should
> # be avoid.  See slappasswd(8) and slapd.conf(5) for details.
> # Use of strong authentication encouraged.
> rootpw    secret
> -------------
>
> So I never set a syslog level in slapd.conf.  What is that value by
> default?

From the slapd.conf(5) man page:

    if no loglevel (or a 0 level) is defined, no logging occurs, so at
    least the none level is required to have high priority messages
    logged.



> Also the only options I set during the whole configure/make/make test
> sequence of the code was for the configure where I set these flags:
> --enable-bdb=no --enable-null=yes --enable-hdb=no ...So from this syslog
> should not have been disabled at build time either.

Given the above, I think whether or not the support for it exists doesn't 
really matter. ;)

> Thanks for continuing to follow up with this issue.

Certainly.  Still no luck reproducing it however.

--Quanah

--
Quanah Gibson-Mount
Product Engineer
Symas Corporation
Packaged, certified, and supported LDAP solutions powered by OpenLDAP:
<http://www.symas.com>


Comment 15 james.macnider@bridgewatersystems.com 2005-12-02 18:42:43 UTC
I'll try to capture this error in a debugger as Howard suggested.  I'm
surprised though that you were unable to reproduce it as yet.  We've
been able to do it here on several different machines ranging from a
dual CPU Ultra-60 with 1GB of RAM up to a 24 core E2900 with 48GB of
RAM.  Are you by any chance trying to reproduce this on a single CPU
(with a single core) machine?
 

Comment 16 Quanah Gibson-Mount 2005-12-02 18:50:31 UTC

--On Friday, December 02, 2005 1:42 PM -0500 James MacNider 
<james.macnider@bridgewatersystems.com> wrote:

> I'll try to capture this error in a debugger as Howard suggested.  I'm
> surprised though that you were unable to reproduce it as yet.  We've
> been able to do it here on several different machines ranging from a
> dual CPU Ultra-60 with 1GB of RAM up to a 24 core E2900 with 48GB of
> RAM.  Are you by any chance trying to reproduce this on a single CPU
> (with a single core) machine?

No, I was using a dual CPU SunFire V210 with 2GB of memory.

What patchlevel is your Solaris 8 system on?

Mine is at 5.8 Generic_117350-04

--Quanah

--
Quanah Gibson-Mount
Product Engineer
Symas Corporation
Packaged, certified, and supported LDAP solutions powered by OpenLDAP:
<http://www.symas.com>


Comment 17 james.macnider@bridgewatersystems.com 2005-12-02 19:04:37 UTC
Our patch level varies from machine to machine.  Our oldest version is
117350-20, the most recent is 117350-29.


Comment 18 Howard Chu 2005-12-16 05:35:46 UTC
changed notes
Comment 19 Kurt Zeilenga 2006-01-04 23:26:07 UTC
changed state Feedback to Closed
Comment 20 Howard Chu 2009-02-17 05:29:57 UTC
moved from Incoming to Archive.Incoming
Comment 21 OpenLDAP project 2014-08-01 21:05:57 UTC
cannot reproduce