More glue / syncrepl issues (related ITS#4323?)



Full_Name: Kevin Spicer
Version: 2.3.24
OS: Solaris 9 sparc
URL: ftp://ftp.openldap.org/incoming/
Submission from: (NULL) (198.178.236.140)


Hi, I thought all the issues related to ITS#4323 were solved, but this afternoon
part of my distributed directory got wiped and I had to restore from backup.
This looks very much like some of the original issues that caused me to log
ITS#4323.

The circumstances of this afternoon's problem were that one of my servers (which
is master for one of my subordinate databases and a slave for the superior and
its other subordinates) became detached from the network [someone unpatched it
by mistake]. It was offline for about an hour until someone patched it back in.
At this point some recently created entries in the superior db were missing,
but restarting slapd made these reappear.

A little later in the afternoon I got a call from clients of that machine saying
that data they expected wasn't there. I found that the superior (slave) data was
intact, but the subordinate master was empty (well, actually a couple of entries
created since the network problem were there), and so were two of the three
subordinate slaves. The empty master had replicated its empty self to the other
slaves. In order to recover the subordinate master I had to restore from last
night's slapcat. The subordinate slaves were recovered by deleting the database
files and letting syncrepl resync.
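
For anyone needing to repeat the recovery, the two paths above can be sketched roughly as follows. This is only an illustration, not the exact commands that were run: the helper names (run, stop_slapd, wipe_db, restore_master, resync_slave) are made up for the example, the list of BDB file patterns is an assumption about a typical back-bdb directory, and emptying the directory before slapadd is likewise an assumption. The script only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch of the two recovery paths described above. By default it only
# prints the commands (DRY_RUN=1); nothing is executed unless DRY_RUN=0.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Stop slapd via its pidfile, if one exists. Wait for the process to exit
# before touching any database files.
stop_slapd() {
    pidfile="$1"
    if [ -f "$pidfile" ]; then
        run kill -INT "$(cat "$pidfile")"
    fi
}

# Remove the BDB files of one database directory. The patterns are an
# assumption about a typical back-bdb directory; DB_CONFIG is left alone.
wipe_db() {
    dir="$1"
    run rm -f "$dir"/*.bdb "$dir"/__db.* "$dir"/log.* "$dir"/alock
}

# Subordinate master: empty the directory, then reload the slapcat backup.
restore_master() {
    suffix="$1"; dir="$2"; ldif="$3"
    wipe_db "$dir"
    run slapadd -b "$suffix" -l "$ldif"
}

# Subordinate slave: empty the directory only; syncrepl refills it from the
# provider once slapd has been started again.
resync_slave() {
    wipe_db "$1"
}
```

With slapd stopped (for example stop_slapd /var/run/slapd/slapd.pid), the master on server2 would be restored with something like restore_master "ou=server2,ou=machines,dc=domain,dc=com" /var/db/ldap/server2 BACKUP.ldif, where BACKUP.ldif stands for wherever the nightly slapcat output is kept. On each slave, resync_slave is called with that slave's directory for the affected suffix, and slapd is then started again as usual.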

Here's a brief summary of my setup:

dc=domain,dc=com      # master on server1, slave on others
glued subordinates....
ou=server2,ou=machines,dc=domain,dc=com  # master on server2, slave on others
ou=server3,ou=machines,dc=domain,dc=com  # master on server3, slave on others
ou=server4,ou=machines,dc=domain,dc=com  # master on server4, slave on others

This is OpenLDAP 2.3.24 compiled with:
./configure --prefix=/usr/local --enable-bdb --enable-crypt --with-threads \
    --with-tls --without-kerberos --enable-wrappers --enable-modules \
    --enable-ppolicy=mod --enable-syncprov=mod

Here's a slapd.conf from server2 (the one with the problem)

# [ VARIOUS SCHEMA DIRECTIVES ]
allow bind_v2
sizelimit 10000
loglevel 256
pidfile         /var/run/slapd/slapd.pid
argsfile        /usr/local/var/slapd.args
defaultsearchbase dc=domain,dc=com
threads 8
password-hash {MD5}
modulepath      /usr/local/libexec/openldap
moduleload      ppolicy.la
moduleload      syncprov.la
TLSCipherSuite HIGH:+TLSv1:+SSLv2:+SSLv3
TLSCACertificateFile /usr/local/etc/openldap/certs/cacert.pem
TLSCertificateFile /usr/local/etc/openldap/certs/server2.slapd-cert.pem
TLSCertificateKeyFile /usr/local/etc/openldap/certs/server2.slapd-key.pem
security ssf=0 tls=0 update_ssf=128 simple_bind=128 update_tls=128
# [ VARIOUS ACL ENTRIES ]
# This is the database definition for the portion of the DIT which
# relates directly to this machine
database        bdb
suffix          "ou=server2,ou=machines,dc=domain,dc=com"
rootdn          "cn=Manager,dc=domain,dc=com"
directory       /var/db/ldap/server2
mode            0600
subordinate
overlay         syncprov
syncprov-checkpoint     100 10
syncprov-sessionlog     100
# [ VARIOUS INDICES ]
cachesize 5000
checkpoint 512 720
## This is a database definition for a replica of one of the subsidiary DITS
database        bdb
suffix          "ou=server3,ou=machines,dc=domain,dc=com"
rootdn          "cn=Manager,dc=domain,dc=com"
syncrepl rid=122
        provider=ldaps://server3.domain.com
        type=refreshAndPersist
        retry=30,10 120,30 300,+
        binddn=cn=syncuser,dc=domain,dc=com
        bindmethod=simple
        credentials=secret
        searchbase="ou=server3,ou=machines,dc=domain,dc=com"
updateref       ldaps://server3.domain.com
directory       /var/db/ldap/server3
mode            0600
subordinate
# [ VARIOUS INDICES ]
cachesize 5000
checkpoint 512 720
## This is a database definition for a replica of one of the subsidiary DITS
database        bdb
suffix          "ou=server4,ou=machines,dc=domain,dc=com"
rootdn          "cn=Manager,dc=domain,dc=com"
syncrepl rid=123
        provider=ldaps://server4.domain.com
        type=refreshAndPersist
        retry=30,10 120,30 300,+
        binddn=cn=syncuser,dc=domain,dc=com
        bindmethod=simple
        credentials=secret
        searchbase="ou=server4,ou=machines,dc=domain,dc=com"
updateref       ldaps://server4.domain.com
directory       /var/db/ldap/server4
mode            0600
subordinate
# [ VARIOUS INDICES ]
cachesize 5000
checkpoint 512 720
# This portion deals with the main database on REPLICA servers
database        bdb
suffix          "dc=domain,dc=com"
rootdn          "cn=Manager,dc=domain,dc=com"
rootpw          {MD5}XXXXXXXXXXX
syncrepl rid=121
        provider=ldaps://server1.group.local
        type=refreshAndPersist
        retry=30,10 120,30 300,+
        binddn=cn=syncuser,dc=domain,dc=com
        bindmethod=simple
        credentials=secret
        searchbase="dc=domain,dc=com"
updateref       ldaps://server1.group.local
directory       /var/db/ldap/server1
mode            0600
overlay         glue
overlay         ppolicy
ppolicy_default "cn=systemusers,ou=policy,dc=domain,dc=com"
ppolicy_use_lockout
# [ VARIOUS INDICES ]
cachesize 5000
checkpoint 512 720