
Re: Solaris 32bit File Descriptor Limit (ITS#2636)



Joseph,

We're using a 32-bit build with BDB on Solaris 8, compiled with gcc 3.2.2.
Our database has only 600000 records, but it's not going fast at all. On a
4x300MHz E450, restoring the database with slapadd takes about 430 minutes.

The 64 bit version you mention performs well for 3.2 million records?
Wow... I'll try a 64-bit build too.

Note: The CPU appears to be the bottleneck. The disks are only doing 1MB/sec
during slapadd, and this is on a SAN that can sustain 40MB/sec line speed.

cheers,
maarten

On Sun, 20 Jul 2003 joseph.tingiris@sdc.cox.net wrote:

> Ok, it's not a bug with OpenLDAP.  Sorry, I just used this thread to track
> the problem ... maybe someone else will find this useful in the archives
> some day.  The problem I ran into was because I was launching the 64bit
> slapd from a 32bit bash shell.  The workaround was pretty simple: I just
> changed the shell from Solaris /bin/bash to a 64bit compiled
> /usr/local/bin/bash (GNU) and slapd started working as expected when I
> started it from there.  Subsequently, I tested it with Solaris' /bin/sh and
> that worked, too.  That's where I left it, using sh.
>
> I assume it's a bug in Solaris' /bin/bash and Solaris was enforcing the
> 32bit limit of 256 NOFILES descriptors to all children.   I still don't
> really understand why plimit said the file descriptors were unlimited, when
> they weren't, but this seems like an issue for Sun to deal with, now.
>
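
The inheritance described above can be observed directly: a process starts
with whatever RLIMIT_NOFILE soft limit its parent had in effect when it
launched it. What follows is a minimal sketch, not taken from this thread,
written in Python with the standard resource and subprocess modules. The
helper names nofile_limits and child_soft_limit are made up for illustration,
and the sketch does not reproduce the Solaris /bin/bash behaviour itself.

```python
import resource
import subprocess
import sys


def nofile_limits():
    """Return the (soft, hard) RLIMIT_NOFILE pair of the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def child_soft_limit(parent_soft):
    """Launch a child with its soft limit set to parent_soft.

    Returns the soft limit that the child itself reports, which shows
    that the value set just before exec is what the new program sees.
    """
    def lower_limit():
        # Runs in the forked child just before exec, standing in for a
        # shell that applies its own limit to everything it starts.
        _, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
        resource.setrlimit(resource.RLIMIT_NOFILE, (parent_soft, hard))

    probe = "import resource; print(resource.getrlimit(resource.RLIMIT_NOFILE)[0])"
    result = subprocess.run(
        [sys.executable, "-c", probe],
        preexec_fn=lower_limit,
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout.strip())


if __name__ == "__main__":
    print("this process:", nofile_limits())
    print("child started under a 256 limit:", child_soft_limit(256))
```

Running the same probe from each candidate shell before starting slapd would
show which limit the daemon is about to inherit.
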
> Oh, and by the way, all of my tests with Solaris 64bit OpenLDAP compiled
> with gcc 3.2.2 (--enable-bdb --enable-ldbm) have been exceptional.  The
> database backends both performed excellently with over 3.2 million DNs loaded
> in many configurations, including subordinates.  I will be exposing a few
> test servers to a large (real) user environment next week and expect it to
> behave much better than the 32bit version under the same connection load.
>
> ----- Original Message -----
> From: "Joseph Tingiris" <joseph.tingiris@sdc.cox.net>
> To: <joseph.tingiris@sdc.cox.net>; <openldap-its@OpenLDAP.org>
> Sent: Saturday, July 19, 2003 5:16 PM
> Subject: Re: Solaris 32bit File Descriptor Limit (ITS#2636)
>
>
> > Ok, I've been doing some testing and have a question I hope someone can
> > answer.  I've compiled 2.1.22 64bit for Solaris expecting to solve this
> > 256 simultaneous connection issue I'm having, but something's still not
> > right.  After loading the databases, I re-run the same test on the 64bit
> > binary and when I get 256 descriptors open, I get almost the exact same
> > log message, but now it logs at 266 instead of 256:
> >
> > slapd[28075]: [ID 759906 local6.debug] daemon: 266 beyond descriptor table
> > size 256
> >
> > The ldapsearch client produces this:
> >
> > ldap_simple_bind_s: Can't contact LDAP server
> >
> > It seems that the other threads block all connections over 256 until there
> > is a free descriptor <256 at which point the query is returned.  Is this
> > right?  Why?  Is there a limit in the code or something?
> >
> > I've done a lot of testing and checked the process with plimit and can see
> > it is not an OS barrier (unlimited).  I don't really see a need to debug,
> > yet, because it's not coring or anything like that.  But, I'm sure the
> > search will eventually abort and/or time out (tested) if I don't free a
> > descriptor <256 so I think this is a bug or something I'm just completely
> > missing.  Threads are set to 32 and cachesize, dbcachesize, etc are all
> > default.  I'm using back-ldbm and back-glue because my tree is really
> > large and I needed to split the db up (and these are replicas).  Here is
> > the pmap output and, for reference, none of the libraries are 32bit:
> >
> > 28075:  /usr/local/openldap64/libexec/slapd -f /openldap/backup/replica64/etc/
> > 0000000100000000   2608K read/exec         /usr/local/openldap64/libexec/slapd
> > 000000010038A000     32K read/write/exec   /usr/local/openldap64/libexec/slapd
> > 0000000100392000  22760K read/write/exec     [ heap ]
> > FFFFFFFF7B3FC000     16K read/write          [ anon ]
> > FFFFFFFF7BBFC000     16K read/write          [ anon ]
> > FFFFFFFF7D300000     16K read/write          [ anon ]
> > FFFFFFFF7D7F8000     32K read/write          [ anon ]
> > FFFFFFFF7D900000    112K read/write          [ anon ]
> > FFFFFFFF7DA00000     16K read/write          [ anon ]
> > FFFFFFFF7DB00000     16K read/write          [ anon ]
> > FFFFFFFF7DC00000      8K read/write/exec     [ anon ]
> > FFFFFFFF7DD00000     96K read/exec         /usr/lib/lwp/sparcv9/libthread.so.1
> > FFFFFFFF7DE18000     16K read/write/exec   /usr/lib/lwp/sparcv9/libthread.so.1
> > FFFFFFFF7DF00000      8K read/write/exec/shared   [ anon ]
> > FFFFFFFF7E000000     16K read/exec         /usr/lib/sparcv9/libmp.so.2
> > FFFFFFFF7E104000      8K read/write/exec   /usr/lib/sparcv9/libmp.so.2
> > FFFFFFFF7E200000      8K read/write/exec     [ anon ]
> > FFFFFFFF7E300000    720K read/exec         /usr/lib/sparcv9/libc.so.1
> > FFFFFFFF7E4B4000     56K read/write/exec   /usr/lib/sparcv9/libc.so.1
> > FFFFFFFF7E4C2000      8K read/write/exec   /usr/lib/sparcv9/libc.so.1
> > FFFFFFFF7E500000     24K read/exec         /usr/lib/sparcv9/libpthread.so.1
> > FFFFFFFF7E606000      8K read/write/exec   /usr/lib/sparcv9/libpthread.so.1
> > FFFFFFFF7E700000      8K read/exec         /usr/lib/sparcv9/libmtmalloc.so.1
> > FFFFFFFF7E802000      8K read/write/exec   /usr/lib/sparcv9/libmtmalloc.so.1
> > FFFFFFFF7E900000      8K read/exec         /usr/platform/sun4u-us3/lib/sparcv9/libc_psr.so.1
> > FFFFFFFF7EA00000     56K read/exec         /usr/lib/sparcv9/libsocket.so.1
> > FFFFFFFF7EB0E000     16K read/write/exec   /usr/lib/sparcv9/libsocket.so.1
> > FFFFFFFF7EC00000      8K read/write/exec     [ anon ]
> > FFFFFFFF7ED00000    664K read/exec         /usr/lib/sparcv9/libnsl.so.1
> > FFFFFFFF7EEA6000     56K read/write/exec   /usr/lib/sparcv9/libnsl.so.1
> > FFFFFFFF7EEB4000     40K read/write/exec   /usr/lib/sparcv9/libnsl.so.1
> > FFFFFFFF7EF00000     32K read/exec         /usr/lib/sparcv9/libgen.so.1
> > FFFFFFFF7F008000      8K read/write/exec   /usr/lib/sparcv9/libgen.so.1
> > FFFFFFFF7F100000    232K read/exec         /usr/lib/sparcv9/libresolv.so.2
> > FFFFFFFF7F23A000     32K read/write/exec   /usr/lib/sparcv9/libresolv.so.2
> > FFFFFFFF7F300000      8K read/exec         /usr/lib/sparcv9/libdl.so.1
> > FFFFFFFF7F400000      8K read/shared       dev:238,6 ino:493389
> > FFFFFFFF7F500000      8K read/write/exec     [ anon ]
> > FFFFFFFF7F600000    144K read/exec         /usr/lib/sparcv9/ld.so.1
> > FFFFFFFF7F722000     16K read/write/exec   /usr/lib/sparcv9/ld.so.1
> > FFFFFFFF7FFFA000     24K read/write          [ stack ]
> >          total    27976K
> >
> > ----- Original Message -----
> > From: <joseph.tingiris@sdc.cox.net>
> > To: <openldap-its@OpenLDAP.org>
> > Sent: Tuesday, July 08, 2003 2:13 PM
> > Subject: Re: Solaris 32bit File Descriptor Limit (ITS#2636)
> >
> >
> > > Some more information.
> > >
> > > I can reproduce the problem fairly easily in my lab.  I just telnet to
> > > the port (255 times), background the processes, and then try a real ldap
> > > operation (ldapsearch, modify, etc) and I get LDAP_SERVER_DOWN returned.
> > > During this time, lsof or pfiles shows 256 descriptors in use.  I've
> > > tried ulimit, plimit, and all of the /etc/system settings to no avail.
> > >
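
The telnet reproduction above can be approximated with a short script that
opens a number of idle TCP connections and keeps them open. This is a sketch
only, assuming Python and its standard socket module. hold_connections is a
hypothetical helper, and the host name and port in the usage lines are
placeholders rather than values from this thread.

```python
import socket


def hold_connections(host, port, count, timeout=5.0):
    """Open `count` idle TCP connections to (host, port) and return them.

    The sockets stay open until the caller closes them, so each one
    keeps a descriptor busy on the server for as long as it is held.
    """
    conns = []
    try:
        for _ in range(count):
            conns.append(socket.create_connection((host, port), timeout=timeout))
    except OSError:
        # Do not leak the sockets that were opened before the failure.
        for conn in conns:
            conn.close()
        raise
    return conns


if __name__ == "__main__":
    # Placeholder target: point this at a lab server, never at production.
    held = hold_connections("ldap.example.com", 389, 255)
    input("255 connections held; run ldapsearch now, then press Enter ")
    for conn in held:
        conn.close()
```

While the connections are held, an ordinary ldapsearch against the same
server is the real test, as in the description above.
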
> > > I found this thread
> > > (http://www.openldap.org/lists/openldap-software/200111/msg00324.html)
> > > regarding OpenLDAP 2.0.11 and it appears to be the same issue.  I found
> > > the post about fsio intriguing and I may attempt to compile the 32bit
> > > version with these libraries instead of Sun's.  Has anyone tried this,
> > > since?
> > >
> > > Sun's Response:
> > >
> > > "For a 32-bit application, a stdio library FILE structure represents the
> > > underlying file descriptor as an unsigned char, limiting the range of
> > > fds which can be opened as FILE's to 0-255 inclusive. A common, known
> > > problem is that when the 32-bit stdio is used for a large server
> > > application, the 255 limit is frequently exceeded. Although this
> > > limitation does not exist for 64-bit applications, this problem will
> > > always remain for 32-bit applications."
> > >
> > > Sun also suggests using open() instead of fopen() ... would a fix of
> > > this nature be possible for OpenLDAP?
> > >
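
The arithmetic behind Sun's statement is easy to check: an 8-bit unsigned
field can only hold 0 through 255, so any larger descriptor number cannot be
stored intact. The sketch below is an illustration in Python using
ctypes.c_ubyte to model such a field. It is not Solaris stdio code, and the
helper names stored_fd and fits_in_8bit_field are invented for this example.

```python
import ctypes


def stored_fd(fd):
    """Return the value left after fd is stored in an 8-bit unsigned field."""
    # ctypes integer types do no overflow checking; the value wraps
    # modulo 256, as an unsigned char assignment does in C.
    return ctypes.c_ubyte(fd).value


def fits_in_8bit_field(fd):
    """True when fd survives the round trip through the 8-bit field."""
    return stored_fd(fd) == fd


if __name__ == "__main__":
    for fd in (3, 255, 256, 266):
        print(fd, "->", stored_fd(fd), "fits" if fits_in_8bit_field(fd) else "does not fit")
```

Descriptors obtained with open() are plain ints and are not subject to this
field width, which is consistent with Sun's suggestion above.
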
> > > I've got a valid OpenLDAP 2.1.16 64bit build working and I'm testing it,
> > > now.  I expect this will be my final resolution, but until then I will
> > > reserve further comments and settle for yet another question.  Who else
> > > is using 64bit OpenLDAP on Solaris 8 and what are your experiences,
> > > good/bad?
> > >
> > > Thanks!
> > >
> > > Joseph
> > >
> > > ----- Original Message -----
> > > From: <joseph.tingiris@cox.net>
> > > To: <openldap-its@OpenLDAP.org>
> > > Sent: Tuesday, July 08, 2003 11:32 AM
> > > Subject: Solaris 32bit File Descriptor Limit (ITS#2636)
> > >
> > >
> > > > Full_Name: Joseph Tingiris
> > > > Version: 2.1.16
> > > > OS: Solaris 8
> > > > URL: ftp://ftp.openldap.org/incoming/
> > > > Submission from: (NULL) (206.157.230.254)
> > > >
> > > >
> > > > Under high load, I get the following log message:
> > > >
> > > > Jul  6 23:49:34 lakeldap01 slapd[10327]: [ID 759906 local5.debug]
> > > > daemon: 256 beyond descriptor table size 256
> > > >
> > > > I've experienced this with Apache and understand that it is a 32bit
> > > > limit on descriptors.  Again, 32bit Apache has a workaround for this
> > > > that allows the connection table to use non-file type descriptors.
> > > >
> > > > Does anyone know of a simple workaround for this, other than going to
> > > > a 64bit compiled version?
> > > >
> > > > Thanks!
> > > >
> > > >
> >
>