
Repeated Search+Modify fails after 1024 cycles, slapd seg faults. (ITS#1447)



Full_Name: Simon Annetts
Version: 2.0.18
OS: RedHat 7.1
URL: ftp://ftp.openldap.org/incoming/
Submission from: (NULL) (195.92.168.169)


There seems to be a problem with slapd when doing repeated search and modify
operations. After 1024 search-and-replace cycles, where each search returns a dn
that is then used as the key to a modify operation, further searches fail to
produce any results until about the 2046th search.
We have managed to encapsulate the problem in two test scripts, one written in
bash using ldapsearch and friends, the other in Perl using Net::LDAP.
After creating a parent record, both scripts populate an empty database with
2500 records using the posixAccount object class required fields like this:

dn: uid=test${c},dc=worcs,dc=sch,dc=uk
objectclass: posixAccount
cn: test${c}
uid: test${c}
uidnumber: $((c+500))
gidnumber: 100
homedirectory: /home/test${c}

where c is the record number. No problems here.
Then the scripts attempt to modify each record by first doing a lookup for the
dn using a search filter like this: (&(objectclass=posixaccount)(uid=test${c}))
and then a modify on the gidnumber attribute using the returned dn as the key.
After 1023 successful cycles the searches start to fail, i.e. at the 1024th
record. (Is this number significant somehow??). Searches continue to return no
results (and so we can't modify) until the 2045th record, whereupon they start to
work again and the modifies continue up to record 2500. Finally we re-read the
record dns using 2500 queries like the one above. Here we see the
searches fail at 1024 again, and finally (but not always) slapd core dumps when
searching on record 2045 (sometimes the searches succeed from 2045-2500, but
mostly we get a core dump).
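
To pin down the failing window without reading through 2500 lines of output, the per-record outcomes can be collapsed into contiguous ranges. The following is a minimal sketch and not part of either test script; the function name `failure_ranges` and its input format (one `<record> <ok|fail>` pair per line) are assumptions made for illustration.

```shell
#!/bin/bash
# Hypothetical helper: collapse per-record results into failing ranges.
# Input (stdin): lines of "<record-number> <ok|fail>", in ascending order.
# Output: one "first-last" line per contiguous run of failures.
failure_ranges() {
  awk '
    $2 == "fail" {
      if (!in_run) { start = $1; in_run = 1 }   # a new run of failures begins
      last = $1
      next
    }
    in_run { print start "-" last; in_run = 0 } # first success after a run
    END { if (in_run) print start "-" last }    # run reaching the last record
  '
}
```

The modify loop could emit `echo "$c fail"` when `${dn}` is empty and `echo "$c ok"` otherwise, with the loop's output piped through `failure_ranges`. Given the behaviour described above, one would expect a single range beginning at record 1024.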

The only difference in the scripts is that the Perl script (script 2) uses its
own ldap library and one connection only, whereas the bash script uses the
command line tools and so uses the openldap client library and many connections (one
connection for each operation). Both scripts produce the same results. I think
therefore that the problem is in slapd, not in the client library or anything to
do with the number of concurrent connections to the server.
(BTW is it normal to see >2000 connections stack up for a few minutes using
netstat when the bash script runs??)
The bash Script 1 has an extra bit in the top which you may want to remove... it
removes any old database in /etc/openldap/gdbm before starting the test.
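
On the netstat question above: each command-line tool invocation opens and closes its own TCP connection, and a closed connection normally lingers in the TIME_WAIT state for a while on the side that closed it first, so a burst of several thousand short-lived connections would plausibly show up this way. A minimal sketch for tallying connections by state follows. It assumes slapd listens on the default LDAP port 389 and that netstat prints the Linux `-tan` column layout (local address in column 4, foreign address in column 5, state in column 6); the function name `count_ldap_states` is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: tally TCP connections involving a port, grouped by state.
# Reads `netstat -tan` style output on stdin; $1 is the port (default 389).
count_ldap_states() {
  local port="${1:-389}"
  awk -v port="$port" '
    # Keep tcp lines whose local or foreign address ends in ":<port>".
    $1 ~ /^tcp/ && ($4 ~ (":" port "$") || $5 ~ (":" port "$")) { count[$6]++ }
    END { for (s in count) print s, count[s] }
  ' | sort
}

# Usage (not executed here): netstat -tan | count_ldap_states 389
```

If nearly all of the entries are in TIME_WAIT rather than ESTABLISHED, they are already-closed connections waiting to expire, not connections the server is still holding open.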

I have tried this test with and without caching enabled and the results are the
same.
We have tried the test on three machines using RH7.1 on Intel and Athlon and on
openldap 2.0.17 and 2.0.18.
Two machines had builds of openldap which were done with a configure like this:

 --enable-ldbm --enable-passwd --with-ldbm-api=gdbm --enable-shell
--enable-local --enable-cldap --disable-rlookups --with-tls
 --with-cyrus-sasl --enable-wrappers --enable-cleartext --enable-crypt
--enable-spasswd --libexecdir=%{_sbindir} --localstatedir=/%{_var}/run'
 --with-slapd --without-slurpd --without-ldapd --without-threads --enable-shared
--enable-static
 
The client was built using the same options except for --with-threads, but as I
said I don't suspect the client.
The third machine had a threaded 2.0.17 slapd running, with the same results.

Here is the slapd.conf we are using:

# $OpenLDAP: pkg/ldap/servers/slapd/slapd.conf,v 1.8.8.4 2000/08/26 17:06:18 kurt Exp $
#
# See slapd.conf(5) for details on configuration options.
# This file should NOT be world readable.

include         /etc/openldap/schema/core.schema
include         /etc/openldap/schema/cosine.schema
include         /etc/openldap/schema/inetorgperson.schema
include         /etc/openldap/schema/nis.schema
include         /etc/openldap/schema/redhat/rfc822-MailMember.schema
include         /etc/openldap/schema/redhat/autofs.schema
include         /etc/openldap/schema/redhat/kerberosobject.schema

# Define global ACLs to disable default read access.

# Do not enable referrals until AFTER you have a working directory
# service AND an understanding of referrals.
#referral       ldap://root.openldap.org

#pidfile        /var/run/slapd.pid
#argsfile       /var/run/slapd.args

# Load dynamic backend modules:
# modulepath    /usr/sbin/openldap
# moduleload    back_ldap.la
# moduleload    back_ldbm.la
# moduleload    back_passwd.la
# moduleload    back_shell.la

# To allow TLS-enabled connections, create /usr/share/ssl/certs/slapd.pem
# and uncomment the following lines.
# TLSCertificateFile /usr/share/ssl/certs/slapd.pem
# TLSCertificateKeyFile /usr/share/ssl/certs/slapd.pem

#######################################################################
# ldbm database definitions
#######################################################################


defaultaccess read
defaultsearchbase dc=worcs,dc=sch,dc=uk

database        ldbm
suffix          "dc=worcs,dc=sch,dc=uk"

rootdn          "cn=root,dc=worcs,dc=sch,dc=uk"

# Use of strong authentication encouraged.
rootpw secret

# The database directory MUST exist prior to running slapd AND
# should only be accessable by the slapd/tools. Mode 700 recommended.
directory       /etc/openldap/gdbm

# Indices to maintain
index   objectClass,uid,uidNumber,gidNumber     eq
index   cn,mail,surname,givenname               eq,subinitial

# access control added 12/09/01 simon@ateb.co.uk

# order *is* important here and works from top to bottom, i.e. top rules are
# matched first

access to attrs=entry,ou,cn,description,gid,uid,givenName,initials,sn,display-name,mail,role-description,multilineDescription,title,department,manager,reports,c,o,postalAddress,street,l,st,co,postalCode,URL,telephoneNumber,facsimileTelephoneNumber,physicalDeliveryOfficeName,OfficeFax
        by self write
        by * read

access to attrs=userPassword,lmpassword,ntpassword,atebmailalias,atebmailforward,atebmailvacationmsg,atebmailvacation
        by self write
        by anonymous auth

access to attrs=shadowlastchange,shadowmax,shadowwarning,loginshell,uidNumber,gidNumber,ntuid,rid,grouprid,homedirectory,pwdlastset,pwdcanchange,pwdmustchange,smbhome,homedrive,script,profile,acctflags,logontime,logofftime,kickofftime,atebS0,atebS1,atebS2,atebS3,atebS4,atebS5,atebS6,atebS7,atebS8,atebS9,atebS10,atebS11,atebS12,atebS13,atebS14,atebS15,atebS16,atebS17,atebS18,atebS19,atebDall
        by self read

cachesize 5000
dbcachesize 1000000

#
# End of slapd.conf

Here are the two scripts which break the server :-)

#!/bin/bash

#we'll work with this object class just for ease
#objectclass ( 1.3.6.1.1.1.2.0 NAME 'posixAccount' SUP top AUXILIARY
#        DESC 'Abstraction of an account with POSIX attributes'
#        MUST ( cn $ uid $ uidNumber $ gidNumber $ homeDirectory )
#        MAY ( userPassword $ loginShell $ gecos $ description ) )

# bind dn and password please

BINDDN="cn=root,dc=worcs,dc=sch,dc=uk"
BINDPW="secret"
SEARCHBASE="dc=worcs,dc=sch,dc=uk"

service ldap stop
mkdir /etc/openldap/gdbm 2>/dev/null
rm -f /etc/openldap/gdbm/*.gdbm
slapd -u ldap &

# create the parent

cat <<EOF | ldapmodify -a -x -D $BINDDN -w $BINDPW
dn: dc=worcs,dc=sch,dc=uk
objectclass: top

EOF

# now create 2500 records

c=1
while [ $c -lt 2501 ]; do

  cat <<EOF | ldapmodify -a -x -D $BINDDN -w $BINDPW
dn: uid=test${c},dc=worcs,dc=sch,dc=uk
objectclass: posixAccount
cn: test${c}
uid: test${c}
uidnumber: $((c+500))
gidnumber: 100
homedirectory: /home/test${c}

EOF
  c=$((c+1))
done

# now lookup each record one by one and modify one item of data
# and we'll see how far we get....
broke=""
c=1
while [ $c -lt 2501 ]; do

  dn=`ldapsearch -LLL -x -b $SEARCHBASE -D $BINDDN -w $BINDPW \
    "(&(objectclass=posixAccount)(uid=test${c}))" dn`
  if [ -z "${dn}" ] && [ -z "$broke" ]; then broke="${c}"; fi
  cat <<EOF | ldapmodify -a -x -D $BINDDN -w $BINDPW
${dn}
changetype: modify
replace: gidnumber
gidnumber: 101

EOF
  c=$((c+1))
done

# now we'll seg fault the server :-)

c=1
while [ $c -lt 2501 ]; do

  dn=`ldapsearch -LLL -x -D $BINDDN -w $BINDPW \
    "(&(objectclass=posixAccount)(uid=test${c}))" dn`
  if [ $? -gt 0 ]; then
    echo "Server died reading record ${c}."
    echo "Modifications broke at record: ${broke}"
    exit 0
  else
    echo "Re-reading record ${c}=${dn}"
  fi
  c=$((c+1))
done

echo "Modifications broke at record: ${broke}"

# end of script 1
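
Script 1 treats an empty result in the modify loop and a non-zero exit status in the re-read loop as two separate checks. A minimal sketch of a helper that distinguishes the two cases in one place is shown below. It is not part of the original script: the name `classify_search` is hypothetical, and it assumes ldapsearch exits with status 0 when a search completes without matching any entry.

```shell
#!/bin/bash
# Hypothetical helper: classify the outcome of one search.
# $1 = exit status of ldapsearch, $2 = its captured output ("dn: ..." or empty).
classify_search() {
  local status="$1" output="$2"
  if [ "$status" -ne 0 ]; then
    echo "error"   # the client reported a failure, e.g. server unreachable
  elif [ -z "$output" ]; then
    echo "empty"   # the search completed but returned no entry
  else
    echo "found"
  fi
}
```

In either loop this could be called as `classify_search $? "${dn}"` on the line immediately after the `dn=` assignment, so that "empty" marks the silent search failures and "error" marks the point where slapd has died.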

and the perl script....

#!/usr/bin/perl

use Net::LDAP;

$binddn = "cn=root,dc=worcs,dc=sch,dc=uk";
$bindpasswd = "secret";
$base = "dc=worcs,dc=sch,dc=uk";

$ldap = Net::LDAP->new("localhost");

$ldap->bind(dn => $binddn,
                password => $bindpasswd);

$result = $ldap->add(dn => "dc=worcs,dc=sch,dc=uk",
                attrs =>[objectclass => ['top']]);
if ($result->code()){ die $result->error();}

# now create 2500 records

for ($i = 1; $i <= 2500; $i++) {

        $result = $ldap->add(dn => "uid=test$i,dc=worcs,dc=sch,dc=uk",
                attrs => [ cn => "test$i",
                        uid => "test$i",
                        uidnumber => ($i+500),
                        gidnumber => 100,
                        homedirectory => "/home/test$i",
                        objectclass => ['posixAccount']
                        ]);
        if($result->code()){
                print "failed to add user test$i: ". $result->error()."\n";
        } else {
                print "added user test$i\n";
        }
}

# now lookup each record one by one and modify one item of data
# and we'll see how far we get....

$broke = "";

for($i = 1; $i <= 2500; $i++) {
        $result = $ldap->search(base => $base,
                                filter => "(&(objectclass=posixAccount)(uid=test$i))",
                                attrs => ['dn']);
        if ($result->count()){
                my $entry = $result->pop_entry();
                $r = $ldap->modify(dn => $entry->dn(),
                        replace => {gidnumber => 101});
                if($r->code) {
                        print "failed to modify test$i: ". $r->error(). "\n";
                } else {
                        print "modified ". $entry->dn() . "\n";
                }
        } else {
                print "Search failed for test$i\n";
                $broke = "broke at test$i: ".$result->error()."\n" unless $broke;
        }
}

# now we'll seg fault the server :-)
for($i = 1; $i <= 2500; $i++) {
        $result = $ldap->search(base => $base,
                                filter => "(&(objectclass=posixAccount)(uid=test$i))",
                                attrs => ['dn']);
        if ($result->code()) {
                print "Server died reading record $i\n";
                print "Modifications $broke\n";
                exit 0;
        }
        if ($result->count()) {
                $entry = $result->pop_entry();
                print "Re-reading record $i = ". $entry->dn() ."\n";
        } else {
                print "Couldn't re-read record $i\n";
        }
}

print "Modifications $broke\n";

# end of script 2