
Database corruption problem



We are intermittently seeing a strange database corruption problem when we
try to add new objects.  Any advice would be appreciated.

Here are the syslog messages from a run that made the server hang:

May 27 10:54:24 directory1 slapd[20630]: [ID 848112 local4.debug] conn=76889 fd=37 ACCEPT from IP=127.0.0.1:42545 (IP=0.0.0.0:389)
May 27 10:54:24 directory1 slapd[20630]: [ID 347666 local4.debug] conn=76889 op=0 BIND dn="cn=Manager,o=sonoma,o=edu" method=128
May 27 10:54:24 directory1 slapd[20630]: [ID 378636 local4.debug] conn=76889 op=0 AUTHZ dn="cn=Manager,o=sonoma,o=edu" mech=simple ssf=0
May 27 10:54:24 directory1 slapd[20630]: [ID 217296 local4.debug] conn=76889 op=0 RESULT tag=97 err=0 text=
May 27 10:54:24 directory1 slapd[20630]: [ID 513112 local4.debug] conn=76889 op=1 ADD dn="uid=faulknet,ou=people,o=sonoma,o=edu"
May 27 10:54:24 directory1 slapd[20630]: [ID 446079 local4.debug] bdb(o=sonoma,o=edu): page 1091: illegal page type or format
May 27 10:54:24 directory1 slapd[20630]: [ID 446079 local4.debug] bdb(o=sonoma,o=edu): PANIC: Invalid argument
May 27 10:54:24 directory1 slapd[20630]: [ID 446079 local4.debug] bdb(o=sonoma,o=edu): givenName.bdb: pgin failed for page 1091
May 27 10:54:24 directory1 slapd[20630]: [ID 446079 local4.debug] bdb(o=sonoma,o=edu): fatal region error detected; run recovery
May 27 10:54:24 directory1 last message repeated 3 times
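
To separate the Berkeley DB errors from the ordinary connection logging, the bdb lines can be pulled out of the syslog output with a short filter. This is only a sketch and not part of our setup: the function name bdb_messages and the regular expression are illustrative, and they assume the Solaris-style syslog layout shown above.

```python
import re

# Matches the Berkeley DB messages that slapd forwards to syslog, e.g.
# "slapd[20630]: [ID ...] bdb(o=sonoma,o=edu): PANIC: Invalid argument".
# The pattern is an assumption based on the log layout in this post.
BDB_LINE = re.compile(
    r"slapd\[(?P<pid>\d+)\]: .*? bdb\((?P<suffix>[^)]*)\): (?P<msg>.*)$"
)


def bdb_messages(lines):
    """Yield (pid, suffix, message) for every bdb line in an iterable of syslog lines."""
    for line in lines:
        match = BDB_LINE.search(line)
        if match:
            yield (
                int(match.group("pid")),
                match.group("suffix"),
                match.group("msg").rstrip(),
            )


if __name__ == "__main__":
    sample = [
        "May 27 10:54:24 directory1 slapd[20630]: [ID 446079 local4.debug] "
        "bdb(o=sonoma,o=edu): givenName.bdb: pgin failed for page 1091",
    ]
    for pid, suffix, msg in bdb_messages(sample):
        print(pid, suffix, msg)
```

Run against the excerpt above, this would keep only the four bdb lines (page 1091, PANIC, pgin failed, fatal region error) and drop the ACCEPT/BIND/ADD lines.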

The LDIF we were trying to load is:
    dn: uid=faulknet,ou=people,o=sonoma,o=edu
    objectclass: top
    objectclass: person
    objectclass: organizationalPerson
    objectclass: inetOrgPerson
    objectclass: eduPerson
    objectclass: calstateEduPerson
    objectclass: sonomaEduPerson
    sonomaedupersonauthoritativesource: PeopleSoft
    calstateedupersonemplid: XXXXXXXXX
    sonomaedupersoncmsid: XXXXXXXX
    sn: Faulkner
    givenname: Timothy
    cn: Timothy Faulkner
    userpassword: {SSHA}NsonB/zHPH14RuPONXcaSsayYZ+y3KJMCJYc2A==
    sonomaedupersonencpw:: IIVQqKjxWH05WeL7Ttl677VNpd7FcZ9+
    sonomaedupersonpasswordexpiredate: 1074880420
    edupersonaffiliation: Student
    calstateedupersonferpaflag: true
    o: Sonoma State University
    uid: faulknet


Here is our DB_CONFIG:
    # 512 MB cache
    set_cachesize 0 536870912 1

    # 512 MB max log file
    set_lg_max 536870912

    # 128 MB log buffer size
    set_lg_bsize 134217728

    # 256 MB log region
    set_lg_regionmax 268435456

    set_flags DB_TXN_NOSYNC
    set_flags DB_TXN_WRITE_NOSYNC
    set_flags DB_DIRECT_DB
    set_flags DB_DIRECT_LOG

    # Slows things down.
    # set_flags DB_NOMMAP

    # Dangerous
    # set_flags DB_NOLOCKING
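
As a sanity check on the comments in the DB_CONFIG above, the byte values can be converted back to megabytes. This is a minimal sketch: the dictionary just restates the values from the file, and the helper name to_mib is illustrative.

```python
MIB = 1024 * 1024

# Byte values copied from the DB_CONFIG above.
settings = {
    "set_cachesize": 536870912,
    "set_lg_max": 536870912,
    "set_lg_bsize": 134217728,
    "set_lg_regionmax": 268435456,
}


def to_mib(nbytes):
    """Convert a byte count to whole mebibytes, rejecting values that are not exact multiples."""
    if nbytes % MIB:
        raise ValueError(f"{nbytes} is not a whole number of MiB")
    return nbytes // MIB


sizes_mib = {name: to_mib(value) for name, value in settings.items()}

for name, value in sizes_mib.items():
    print(f"{name}: {value} MB")
```

The results agree with the comments: 512 MB cache, 512 MB maximum log file, 128 MB log buffer, and 256 MB log region.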


Here is the database definition from our slapd.conf (minus the password):
    database        bdb

    # readonly on

    # Performance stuff

    # This allows dirty data to be returned on a read.
    dirtyread

    # This turns off syncing after every write.
    dbnosync

    # checkpoint the database every 256 megabytes (half the cache) or 10 minutes,
    # whichever comes first
    checkpoint 262144 10

    cachesize 10000

    include         /opt/openldap/etc/openldap/replication.conf
    suffix          "o=sonoma,o=edu"
    rootdn          "cn=Manager,o=sonoma,o=edu"
    rootpw          XXXXXX
    directory       /opt/openldap/var/openldap-data

    index   objectClass     eq
    index   uid     eq,sub
    index   cn      eq,sub
    index   sn      eq,sub
    index   givenname       eq,sub
    index   o       eq,sub

    index   calstateEduPersonEmplid eq

    index   sonomaEduPersonAuthoritativeSource      eq
    index   sonomaEduPersonCmsId    eq
    index   sonomaEduPersonAccountDisabledReason    eq

    database bdb
    suffix          "dc=sonoma,dc=edu"
    rootdn          "cn=Manager,o=sonoma,o=edu"
    directory       /opt/openldap/var/dc-data
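
The checkpoint directive in the first database definition takes its size argument in kilobytes, so the "256 megabytes (half the cache)" comment can be checked with a few lines of arithmetic. This is just a sketch; the constant names are illustrative and the values are copied from the slapd.conf and DB_CONFIG above.

```python
CACHE_BYTES = 536870912      # set_cachesize value from DB_CONFIG
CHECKPOINT_KBYTES = 262144   # first argument of "checkpoint" in slapd.conf
CHECKPOINT_MINUTES = 10      # second argument of "checkpoint"

# Convert the kilobyte threshold to bytes and compare it with the cache size.
checkpoint_bytes = CHECKPOINT_KBYTES * 1024
checkpoint_mb = checkpoint_bytes // (1024 * 1024)
cache_fraction = checkpoint_bytes / CACHE_BYTES

print(f"checkpoint every {checkpoint_mb} MB or {CHECKPOINT_MINUTES} minutes")
print(f"fraction of cache: {cache_fraction}")
```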