slapd with both a bdb/ldbm backend and a perl backend crash (ITS#2842)



Full_Name: Giulio Carabetta
Version: 2.1.22
OS: Linux SuSE 8.1
URL: ftp://ftp.openldap.org/incoming/
Submission from: (NULL) (213.92.96.27)


Hi people! 

I'm having a problem. An LDAP server with both a perl backend and a bdb backend
crashes. I've also tried ldbm instead of bdb, but it crashes much more
frequently. (For now we prefer to stay on ldbm because when it crashes it
doesn't corrupt the database... :)) )

As a workaround, I've split the backends across 2 slapd servers (on the same
hardware, obviously), and everything works fine. The bdb backend listens on 389,
and the perl backend listens on 390.

The situation is the same on 5 servers, 1 master and 4 slaves (slurpd docet).

The crash is not so... immediate, so with a high debug level I don't have
enough space to trace the error, and with a low debug level... there is no
info...

First of all, the environment.

OpenLDAP 2.1.22 compiled as:
./configure  --prefix=/usr/ --enable-debug --enable-syslog --enable-monitor
--with-cyrus-sasl --enable-crypt --without-kerberos --with-tls --enable-ldbm
--with-ldbm-api=berkeley --enable-perl --disable-ipv6 --disable-slurpd

Other packages involved are:
Authen-SASL-2.03 (perl module)
Convert-ASN1-0.15 (perl module)
IO-Socket-SSL-0.92 (perl module)
Net_SSLeay.pm-1.22 (perl module)
TimeDate-1.16 (perl module)
URI-1.23 (perl module)
courier-imap-1.7.1.20030319
cyrus-sasl-2.1.13
db-4.1.24
openldap-2.1.22
pam_ldap-161
perl-5.8.0
perl-ldap-0.2701 (perl module)
sendmail-8.12.9 (final client that looks into both backends)

and under ou=people,dc=abi,dc=it we have 600 entries.

All this is for our mail system. Sendmail on the master server receives all
mail and forwards it to the correct server by looking up information in the
user's attributes or in the sendmail schema attributes.
The slaves do the same job, but they also hold the users' mail directories
locally, so IMAP or POP authentication lookups go to LDAP (through pam_ldap).
If a mail comes from one slave and goes to another slave, the master is not
involved.

So the LDAP server on the master is not as heavily loaded as those on the
slaves. And, in fact, the master crashes less often than the slaves.

Someone told me that it may make a difference whether the first query goes to
one backend rather than the other, but I don't know: in production I cannot
tell which backend gets the first query, and I haven't run any tests in that
direction (maybe with logging...).


Just to explain the perl module: I need to explode some dynamic lists of
users. So I define a DN that contains a list of DNs (static lists) or an LDAP
URI (dynamic lists: these can return lists of DNs that are exploded in turn).
Something must explode this information, and the application (sendmail) cannot
do that by itself.
So I wrote a Perl backend that does nothing more than an "intelligent" search
on the bdb backend. No inserts, updates or deletes. Just queries.
It is a backend for slapd, and also a client of the "traditional" data
container backend, in the same process.
If you need to explode a list, use ldapsearch -b dc=sendmailbe; if you don't
need it, use the normal backend: ldapsearch -b dc=myorg,dc=it (speed
improvement :) ). But if you do a normal search, the results are the same. I've
tried to keep it as transparent as possible.
I know, it's not easy to follow, but look at the examples I've included :) I've
also included data to reproduce my situation... I uploaded a file named
carabetta-031127.tar.gz to ftp.openldap.org/incoming/
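
To make the idea of "exploding" a list concrete, here is a minimal sketch in
Python (not the Perl module from the tarball). It works on an in-memory
dictionary instead of a real directory. The attribute names "member" and
"memberURL", the sample entries, and the functions search_url and expand are
all assumptions made for illustration, and the LDAP URL handling is reduced to
a single (attr=value) filter with subtree scope.

```python
from typing import Dict, List, Optional, Set

Entry = Dict[str, List[str]]

# Hypothetical sample data standing in for the data backend.
# "member" (static list) and "memberURL" (dynamic list) are assumed names.
DIRECTORY: Dict[str, Entry] = {
    "cn=all,ou=lists,dc=abi,dc=it": {
        "member": [
            "cn=staff,ou=lists,dc=abi,dc=it",
            "uid=carla,ou=people,dc=abi,dc=it",
        ],
    },
    "cn=staff,ou=lists,dc=abi,dc=it": {
        "memberURL": ["ldap:///ou=people,dc=abi,dc=it??sub?(dept=it)"],
        "member": ["cn=all,ou=lists,dc=abi,dc=it"],  # deliberate loop
    },
    "uid=anna,ou=people,dc=abi,dc=it": {"dept": ["it"]},
    "uid=bruno,ou=people,dc=abi,dc=it": {"dept": ["it"]},
    "uid=carla,ou=people,dc=abi,dc=it": {"dept": ["sales"]},
}


def search_url(url: str) -> List[str]:
    """Evaluate a simplified LDAP URL: ldap:///<base>??sub?(<attr>=<value>)."""
    base, _attrs, _scope, flt = url[len("ldap:///"):].split("?")
    attr, value = flt.strip("()").split("=", 1)
    # Naive subtree test: the entry DN must end with the search base.
    return [
        dn
        for dn, entry in DIRECTORY.items()
        if dn.lower().endswith(base.lower()) and value in entry.get(attr, [])
    ]


def expand(dn: str, seen: Optional[Set[str]] = None) -> List[str]:
    """Return the non-list DNs reachable from dn, visiting each DN once."""
    seen = set() if seen is None else seen
    if dn in seen:
        return []  # already handled: avoids loops and duplicates
    seen.add(dn)
    entry = DIRECTORY.get(dn, {})
    members = list(entry.get("member", []))
    for url in entry.get("memberURL", []):
        members.extend(search_url(url))
    if not members:
        return [dn]  # a plain entry: nothing left to explode
    result: List[str] = []
    for member in members:
        result.extend(expand(member, seen))
    return result


if __name__ == "__main__":
    for user in expand("cn=all,ou=lists,dc=abi,dc=it"):
        print(user)
```

The "seen" set is what keeps a list that refers back to itself from recursing
forever; whether the real module needs the same guard depends on the data.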

We did a lot of debugging on the perl script, and we now consider that module
rock solid....

Let me know if there is a different way to do this work, but we have a
production system that uses this method, and it's OK (apart from the crash..).

To include some logs I reproduced the error, and I noticed that after a RAM
upgrade the crashes are less frequent.
We started out with 256 MB of RAM, and with ldbm it crashed every 5 minutes.
Now, with 1 GB of RAM, it crashed just once in about 6 hours (after that I
split the 2 backends again).

The tail of 20 GB of "minimal" log reports that the perl ldap->bind cannot
contact the LDAP server, i.e. there is nothing on port 389 (but usually both
backends are there):

################################################# LOG
relay:/tmp # tail -20 1069064743-osc.log
=> id2entry_r( 2387 )
====> cache_find_entry_id( 2387 )
"sendmailMTAKey=dm,ou=SendmailMap,dc=abi,dc=it" (found) (1 tries)
<= id2entry_r( 2387 ) 0x8705848 (cache)
is_object_subclass(1.3.6.1.4.1.6152.10.3.2.12,2.5.6.0) 0
is_object_subclass(1.3.6.1.4.1.6152.10.3.2.12,1.3.6.1.4.1.6152.10.3.2.10) 0
is_object_subclass(1.3.6.1.4.1.6152.10.3.2.12,2.5.6.0) 0
is_object_subclass(1.3.6.1.4.1.6152.10.3.2.12,1.3.6.1.4.1.6152.10.3.2.11) 0
is_object_subclass(1.3.6.1.4.1.6152.10.3.2.12,1.3.6.1.4.1.6152.10.3.2.10) 0
is_object_subclass(1.3.6.1.4.1.6152.10.3.2.12,2.5.6.0) 0
is_object_subclass(1.3.6.1.4.1.6152.10.3.2.12,1.3.6.1.4.1.6152.10.3.2.12) 1
ldbm_search: candidate entry 2387 does not match filter
====> cache_return_entry_r( 2387 ): returned (0)
Can't call method "bind" on an undefined value at /usr/etc/openldap/perl/SendmailBE1.pm line 110.
connection_get(28): got connid=7074
connection_read(28): checking for input on id=7074
ber_get_next
ber_get_next on fd 28 failed errno=0 (Success)
connection_read(28): input error=-2 id=7074, closing.
connection_closing: readying conn=7074 sd=28 for close
connection_close: conn=7074 sd=28
################################################# END LOG
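
Since the Perl error above suggests that the connection to port 389 could not
be opened, a small probe run next to slapd could record when the listener
stops answering without needing a high debug level. The following is only a
sketch in Python; the function name listener_is_up, the host, the port and the
10 second interval are assumptions, not something taken from the setup
described here.

```python
import socket
import time


def listener_is_up(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timeout, unreachable host, and so on.
        return False


if __name__ == "__main__":
    # Assumed values: the data backend listening on localhost:389.
    while True:
        state = "up" if listener_is_up("localhost", 389) else "DOWN"
        print(time.strftime("%Y-%m-%d %H:%M:%S"), "port 389 is", state)
        time.sleep(10)
```

The timestamps from such a probe could then be matched against the slapd log
to find the few lines just before the listener went away.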


Let me know if you need more log details: I can try to downgrade the server's
RAM back to 256 MB to get the crash within a few lines of log...

Bye!

Giulio