
Client aborts on assertions during high server load (ITS#3222)



Full_Name: Christian Hollstein
Version: 2.2.13
OS: LINUX SuSE 8.2, HPUX 11.11
URL: ftp://ftp.openldap.org/incoming/
Submission from: (NULL) (217.229.126.92)


Hi!

We use OpenLDAP for a large client-server project with approximately
20000 online users. Authentication data for the users is kept in LDAP.
When we put the application into operation for the first time, our
middle-tier components, which use the OpenLDAP client library,
died with SIGABRT triggered by assertions.
We were able to reproduce the behaviour with a stripped-down client.
As long as we don't run it in parallel, everything is fine. When several
clients access the LDAP server at the same time, some of them die. From
then on the remaining clients can complete their transactions only at a
much reduced rate, while the LDAP server runs at 100% CPU with
user and sys time each at approximately 50%. Finally the server terminates
if it was compiled with debug support. The non-debug version recovers
after the clients are gone.
The behaviour is the same with server and client both running under HPUX 11.11
(our production system) or both running under SuSE LINUX 8.2 (development).
So far we have not done any cross-platform client-server tests.
Under LINUX the first assertions show up when the server CPU reaches 100%.
On an HPUX RP5450 with two processors they can appear before that level is
reached. We saw the same behaviour with version 2.1.22 / Berkeley DB
4.1.
If I compile with --enable-debug=no in order to get rid of the assertions,
the clients receive improper data structures and die with SIGSEGV.
The assertions occur very often in ber_free() (LBER_VALID not fulfilled).
Here are some examples:

zl_client: io.c:171: ber_free_buf: Assertion `((ber)->ber_opts.lbo_valid==0x2)' failed.
zl_client: sockbuf.c:74: ber_sockbuf_free: Assertion `( (sb)->sb_opts.lbo_valid == 0x3 )' failed.
zl_client: unbind.c:49: ldap_unbind_ext: Assertion `( (ld)->ld_options.ldo_valid == 0x2 )' failed.
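
For readers unfamiliar with this kind of check: each message above shows a
"valid" field being compared against a fixed constant. The sketch below
illustrates that general validity-tag pattern only. It is not the OpenLDAP
source; the names handle, HANDLE_VALID_TAG, handle_init, handle_is_valid and
handle_release are hypothetical, and the tag value is merely modelled on the
0x2 in the output. It shows what such an assertion detects (an object whose
tag no longer matches), not why the tag is wrong in our case.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tag value, modelled on the 0x2 in the assertion output. */
#define HANDLE_VALID_TAG 0x2

/* Hypothetical object carrying a validity tag next to its payload. */
typedef struct handle {
    short valid; /* HANDLE_VALID_TAG while live, 0 after release */
    int   fd;    /* placeholder payload */
} handle;

/* Mark the object as live. */
static void handle_init(handle *h, int fd)
{
    h->valid = HANDLE_VALID_TAG;
    h->fd = fd;
}

/* Non-aborting form of the check, usable in tests and error paths. */
static int handle_is_valid(const handle *h)
{
    return h != NULL && h->valid == HANDLE_VALID_TAG;
}

/* Release the object. With assertions enabled, a second release or a
 * release of a corrupted object aborts with SIGABRT. With NDEBUG defined,
 * the check disappears and the bad object is used anyway. */
static void handle_release(handle *h)
{
    assert(handle_is_valid(h));
    h->valid = 0;
    h->fd = -1;
}
```

With such a scheme, calling handle_release() twice on the same object trips
the assertion in a debug build, while a build without assertions carries on
with the invalid object, which would be consistent with the SIGSEGV we see
when compiling with --enable-debug=no.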

We would very much appreciate your help!

Christian Hollstein at Deutsche Telekom