Full_Name: James MacNider
Version: 2.1.6 and 2.2.29
OS: Solaris 8
URL:
Submission from: (NULL) (216.113.7.2)

I think this may be the same issue as was previously reported in ITS# 3567, which was closed.

I have seen the slapd server exit (gracefully, no core) under high load when using the NULL backend that is included as part of the OpenLDAP distribution. I've been able to reproduce this almost at will with versions 2.1.6 and 2.2.29 of the OpenLDAP slapd process. I'm using SLAMD (www.slamd.org) to generate the load.

Tracing through the code, the problem again seems to be an ENOSYS code coming back from the call to select(), as in ITS# 3567. Because I'm using the null backend I can't see how this defect could be in the backend code as suggested in ITS# 3567. It is of course possible that this is a defect in Solaris and not OpenLDAP, but that has yet to be proven one way or the other.
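For readers following the select() discussion, here is a minimal sketch of how an error code from a failed select() call can be captured and named. It is written in Python for brevity and is not the code in slapd's daemon.c; the helper name `select_error_name` is hypothetical, and the sketch assumes a POSIX system where select() accepts pipe descriptors. It provokes EBADF with a closed descriptor simply because that failure is easy to trigger; it does not reproduce the ENOSYS result described in the report.

```python
import errno
import os
import select


def select_error_name(read_fds, timeout=0.0):
    """Call select() once; return None on success, else the errno name."""
    try:
        select.select(read_fds, [], [], timeout)
    except OSError as exc:
        # errno.errorcode maps numeric codes to names such as "EBADF" or "ENOSYS".
        return errno.errorcode.get(exc.errno, str(exc.errno))
    return None


if __name__ == "__main__":
    r, w = os.pipe()
    print("open descriptor:  ", select_error_name([r]))
    os.close(r)
    os.close(w)
    # The descriptor number is now stale, so select() fails.
    print("closed descriptor:", select_error_name([r]))
```

Logging the symbolic name next to the descriptor sets at the point of failure is one way to tell a bad descriptor apart from a genuinely unsupported call.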
Note: www.slamd.org should read www.slamd.com

-----Original Message-----
From: openldap-its@OpenLDAP.org [mailto:openldap-its@OpenLDAP.org]
Sent: Friday, November 25, 2005 10:44 AM
To: James MacNider
Subject: Re: (ITS#4216) slapd terminates unexpectedly

*** THIS IS AN AUTOMATICALLY GENERATED REPLY ***
Thanks for your report to the OpenLDAP Issue Tracking System. Your report has been assigned the tracking number ITS#4216.
It's also worth noting that the following OS settings were in place when the exit issue occurs:

ndd -set /dev/tcp tcp_time_wait_interval 240000
ndd -set /dev/tcp tcp_conn_req_max_q 128
ndd -set /dev/tcp tcp_keepalive_interval 7200000
ndd -set /dev/tcp tcp_deferred_ack_interval 100
ndd -set /dev/tcp tcp_smallest_anon_port 32768
ulimit -n 256

However, when the OS is tuned more aggressively (as below) I've been unable to reproduce the problem:

ndd -set /dev/tcp tcp_time_wait_interval 30000
ndd -set /dev/tcp tcp_conn_req_max_q 1024
ndd -set /dev/tcp tcp_keepalive_interval 600000
ndd -set /dev/tcp tcp_deferred_ack_interval 5
ndd -set /dev/tcp tcp_smallest_anon_port 8192
ulimit -n 4096
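The two profiles differ in every setting, which makes it hard to say which change hides the problem. As a rough aid, the sketch below parses both lists and prints each setting side by side. The values are copied from the message above; the function names `parse_profile` and `changed_settings` are hypothetical helpers, not part of any OpenLDAP or Solaris tool.

```python
FAILING = """
ndd -set /dev/tcp tcp_time_wait_interval 240000
ndd -set /dev/tcp tcp_conn_req_max_q 128
ndd -set /dev/tcp tcp_keepalive_interval 7200000
ndd -set /dev/tcp tcp_deferred_ack_interval 100
ndd -set /dev/tcp tcp_smallest_anon_port 32768
ulimit -n 256
"""

TUNED = """
ndd -set /dev/tcp tcp_time_wait_interval 30000
ndd -set /dev/tcp tcp_conn_req_max_q 1024
ndd -set /dev/tcp tcp_keepalive_interval 600000
ndd -set /dev/tcp tcp_deferred_ack_interval 5
ndd -set /dev/tcp tcp_smallest_anon_port 8192
ulimit -n 4096
"""


def parse_profile(text):
    """Map setting name -> integer value for 'ndd -set' and 'ulimit -n' lines."""
    settings = {}
    for line in text.strip().splitlines():
        parts = line.split()
        if parts[:2] == ["ndd", "-set"] and len(parts) == 5:
            settings[parts[3]] = int(parts[4])
        elif parts[:2] == ["ulimit", "-n"] and len(parts) == 3:
            settings["ulimit -n"] = int(parts[2])
    return settings


def changed_settings(before, after):
    """Return {name: (before, after)} for settings whose values differ."""
    return {
        name: (before[name], after[name])
        for name in before
        if name in after and before[name] != after[name]
    }


if __name__ == "__main__":
    diff = changed_settings(parse_profile(FAILING), parse_profile(TUNED))
    for name, (old, new) in sorted(diff.items()):
        print(f"{name:28} {old:>8} -> {new:>8}")
```

Changing one setting at a time between the two profiles would be a natural way to narrow down which of them matters, if any single one does.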
The bug reported in #3567 was clearly due to faulty (perl) application code. As such, the bug you're reporting here seems to have a different cause. Please try reproducing it in 2.3.12. I'll note that we at Symas have subjected 2.3 slapd to tremendous loads using slamd, and have not seen any such failure, but we were running under Linux.

-- Howard Chu Chief Architect, Symas Corp. http://www.symas.com Director, Highland Sun http://highlandsun.com/hyc OpenLDAP Core Team http://www.openldap.org/project/
We were able to reproduce the defect with version 2.3.12 on the first try. The server, as before, is configured to use the null backend. The queries were again driven by SLAMD using the SearchClientJobClass. Our search filter simply contained: (objectclass=*). My slapd.conf is below:

-----------------------
ucdata-path /dev/src/ldap/openldap/null/openldap-2.3.12/libraries/liblunicode

backend null
database null
suffix ""
rootdn "cn=root"
bind on

# Cleartext passwords, especially for the rootdn, should
# be avoided. See slappasswd(8) and slapd.conf(5) for details.
# Use of strong authentication encouraged.
rootpw secret
-----------------------
James MacNider wrote:
> We used 10 clients and 10 threads per client spread across 2 test driver
> machines (so a total of 50 threads on each machine). The time it takes
> for the problem to occur varies...it has happened as quickly as 10
> seconds or as long as 20 minutes.

Just as an aside, in our experience using more than two threads per client doesn't actually produce any more useful server load.

By the way, is syslog active on slapd while running this test? Is slapd configured with reverse-lookups enabled? The point is to identify where any non-socket descriptors are coming from, if any.

If you can run slapd under a debugger with a breakpoint at the point where the error is detected (daemon.c line 1862 would be a good choice), and then run lsof on the slapd process when the breakpoint is triggered, that may give some useful information. Also print slap_daemon, readfds, and writefds at that point.
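The idea behind the lsof request, finding descriptors that are not sockets, can be illustrated with a short sketch. This is not a substitute for running lsof against slapd at the breakpoint; it only shows, in Python and for the current process, how open descriptors can be classified with fstat(). The names `fd_kind` and `non_socket_fds` are hypothetical, and the default upper bound of 256 is taken from the `ulimit -n 256` setting mentioned earlier in the thread.

```python
import os
import socket
import stat


def fd_kind(fd):
    """Return a short label for an open descriptor, or None if it is not open."""
    try:
        mode = os.fstat(fd).st_mode
    except OSError:
        return None
    if stat.S_ISSOCK(mode):
        return "socket"
    if stat.S_ISFIFO(mode):
        return "fifo"
    if stat.S_ISCHR(mode):
        return "char device"
    if stat.S_ISREG(mode):
        return "regular file"
    if stat.S_ISDIR(mode):
        return "directory"
    return "other"


def non_socket_fds(max_fd=256):
    """Return {fd: kind} for every open descriptor below max_fd that is not a socket."""
    found = {}
    for fd in range(max_fd):
        kind = fd_kind(fd)
        if kind is not None and kind != "socket":
            found[fd] = kind
    return found


if __name__ == "__main__":
    listener = socket.socket()      # counted as a socket, so not reported
    read_end, write_end = os.pipe() # reported as two fifo descriptors
    for fd, kind in sorted(non_socket_fds().items()):
        print(fd, kind)
    listener.close()
    os.close(read_end)
    os.close(write_end)
```

In a server that is expected to hold only client sockets plus a handful of known descriptors, anything unexpected in this kind of listing (a log file, a syslog connection, a resolver socket) is a candidate for the extra descriptor being asked about.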
We've been running the same test here and see no error.
The problem may well be specific to Solaris then (I recall you said that your tests are running under Linux).

My "configure" was as follows:

configure --enable-bdb=no --enable-null=yes --enable-hdb=no

Every other item was left as the default value. I did capture the configure/make/make depend/make test output to file. If you're interested in any of that just let me know and I'll send it on to you.

I did have one question for you. You asked me whether syslog was active on slapd while running the tests. I'm not quite sure what you meant by that... syslogd is running on the box, but I don't think that's quite what you were getting at.

I'll do my best to gather the other information you've requested.
Actually, we were running it on a Solaris 8 system, with 256 file descriptors, with only the back-null database loaded, and the ndd parameters you specified in your post. I eventually ramped it up to where I had 100 clients connecting, and let that run for an hour, which completed without problem.

Regards, Quanah

-- Quanah Gibson-Mount Product Engineer Symas Corporation Packaged, certified, and supported LDAP solutions powered by OpenLDAP: <http://www.symas.com>
James MacNider wrote:
> The problem may well be specific to Solaris then (I recall you said that
> your tests are running under Linux).

(As Quanah already said, we reran this test on a Solaris 8 system.)

> I did have one question for you. You asked me if was syslog active on
> slapd while running the tests. I'm not quite sure what you meant by
> that... syslogd is running on the box, but I don't think that's quite
> what you were getting at.

The question is whether you have slapd set to log any messages to syslog, since that would introduce another file descriptor into the process.
changed state Open to Feedback
The only argument passed to slapd on the command line is the -f option to specify the config file. All other options were left at the default values.
--On Thursday, December 01, 2005 6:47 PM +0000 james.macnider@bridgewatersystems.com wrote:
> The only argument passed to slapd on the command line is the -f option
> to specify the config file. All other options were left at the default
> values.

I'm not sure how this relates to the question about whether or not you are logging to syslog. What is your loglevel setting in slapd.conf? Did you explicitly disable syslog logging if you built the code from source?

--Quanah
Hello Quanah,

As I mentioned earlier to Howard Chu, my slapd.conf simply contains the following:

-------------
ucdata-path /dev/src/ldap/openldap/null/openldap-2.3.12/libraries/liblunicode

backend null
database null
suffix ""
rootdn "cn=root"
bind on

# Cleartext passwords, especially for the rootdn, should
# be avoided. See slappasswd(8) and slapd.conf(5) for details.
# Use of strong authentication encouraged.
rootpw secret
-------------

So I never set a syslog level in slapd.conf. What is that value by default?

Also, the only options I set during the whole configure/make/make test sequence were these configure flags: --enable-bdb=no --enable-null=yes --enable-hdb=no. So syslog should not have been disabled at build time either.

Thanks for continuing to follow up with this issue.
--On Friday, December 02, 2005 1:30 PM -0500 James MacNider <james.macnider@bridgewatersystems.com> wrote:
> So I never set a syslog level in slapd.conf. What is that value by
> default?

From the slapd.conf(5) man page: if no loglevel (or a 0 level) is defined, no logging occurs, so at least the none level is required to have high priority messages logged.

> Also the only options I set during the whole configure/make/make test
> sequence of the code was for the configure where I set these flags:
> --enable-bdb=no --enable-null=yes --enable-hdb=no ...So from this syslog
> should not have been disabled at build time either.

Given the above, I think whether or not the support for it exists doesn't really matter. ;)

> Thanks for continuing to follow up with this issue.

Certainly. Still no luck reproducing it however.

--Quanah
I'll try to capture this error in a debugger as Howard suggested. I'm surprised though that you were unable to reproduce it as yet. We've been able to do it here on several different machines ranging from a dual CPU Ultra-60 with 1GB of RAM up to a 24 core E2900 with 48GB of RAM. Are you by any chance trying to reproduce this on a single CPU (with a single core) machine?
--On Friday, December 02, 2005 1:42 PM -0500 James MacNider <james.macnider@bridgewatersystems.com> wrote:
> Are you by any chance trying to reproduce this on a single CPU
> (with a single core) machine?

No, I was using a dual CPU SunFire V210 with 2GB of memory. What patchlevel is your Solaris 8 system on? Mine is at 5.8 Generic_117350-04

--Quanah
Our patch level varies from machine to machine. Our oldest version is 117350-20, the most recent is 117350-29.
changed notes
changed state Feedback to Closed
moved from Incoming to Archive.Incoming
cannot reproduce