
Inconsistent syncrepl behaviour with glue records (ITS#3133)



Full_Name: Luke Howard
Version: 2.2.10
OS: Linux
URL: ftp://ftp.openldap.org/incoming/
Submission from: (NULL) (165.228.130.12)


I am using OpenLDAP 2.2.10 with the patch in ITS#3131.

The master DSA has three databases, each immediately subordinate to the next.
The relevant sections of the configuration file on the master follow:

---CUT HERE---
defaultsearchbase "DC=foo,DC=bar"

database hdb
suffix "CN=A,CN=B,DC=foo,DC=bar"
<snip>

database hdb
suffix "CN=B,DC=foo,DC=bar"
<snip>

database hdb
suffix "DC=foo,DC=bar"
<snip>

database dnssrv
suffix ""
---CUT HERE---
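
As a quick sanity check of the layout described above (each suffix immediately
subordinate to the next), here is a minimal Python sketch. The helper names are
hypothetical and are not part of OpenLDAP. The DN handling is deliberately
naive: it assumes no escaped commas, which holds for the suffixes shown here.

```python
def split_dn(dn):
    """Naively split a DN into lowercased RDNs (assumes no escaped commas)."""
    return [rdn.strip().lower() for rdn in dn.split(",") if rdn.strip()]


def is_immediate_subordinate(child, parent):
    """True if child is exactly one RDN below parent."""
    c, p = split_dn(child), split_dn(parent)
    return len(c) == len(p) + 1 and c[1:] == p


# Suffixes in the order they appear in the master configuration above.
suffixes = [
    "CN=A,CN=B,DC=foo,DC=bar",
    "CN=B,DC=foo,DC=bar",
    "DC=foo,DC=bar",
]

# Each database should sit directly under the one that follows it.
chain_ok = all(
    is_immediate_subordinate(suffixes[i], suffixes[i + 1])
    for i in range(len(suffixes) - 1)
)
print(chain_ok)
```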

The replica configuration is similar: it has three databases (each shadowing the
corresponding one on the master), and both a default search base and the DNS SRV
backend are enabled. The relevant sections of the replica configuration file
follow:

---CUT HERE---
database hdb
suffix "CN=A,CN=B,DC=foo,DC=bar"
lastmod on
<snip>
syncrepl rid=1
  provider=ldap://master
  updatedn="CN=manager,DC=foo,DC=bar"
  binddn="CN=replicaDN,DC=foo,DC=bar"
  bindmethod=sasl
  saslmech=GSSAPI
  secprops=minssf=56
  searchbase="CN=A,CN=B,DC=foo,DC=bar"
  filter="(objectClass=*)"
  attrs="*"
  schemachecking=off
  scope=sub
  type=refreshOnly
  interval=00:00:00:10

updateref ldap://master
---CUT HERE---

The above stanza is repeated twice more, with the suffix and searchbase changed
to "CN=B,DC=foo,DC=bar" and "DC=foo,DC=bar" respectively.

Now, to explain the problem:

After provisioning the replica, sometimes the entry at "DC=foo,DC=bar" is synced
correctly from the master, and at other times a glue record is stored. I suspect
that this is a race condition dependent on which database finishes syncing
first. Because the entry at "DC=foo,DC=bar" on the master contains useful
information on which our application depends, the glue record breaks our
application when it uses a replica. As it happens, I can reproduce this problem
on one replica DSA, running on Fedora Core 1 under VMware, but not on another,
running on SuSE 9.0 on real hardware. However, I have seen the problem manifest
on the latter system in the past.

The glue record appears as follows (DN has been changed for privacy):

---CUT HERE---
# slapcat -n 3
dn: dc=foo,dc=bar
structuralObjectClass: glue
objectClass: top
objectClass: glue
---CUT HERE---

whereas it should really be an instance of object class "domain".
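
To spot this condition after provisioning a replica, the LDIF dumped by slapcat
can be scanned for entries whose structuralObjectClass is "glue". The following
is a rough Python sketch, not an OpenLDAP tool; the function name is
hypothetical, and the LDIF handling is simplified (no folded lines, no base64
decoding), which is enough for output like the sample above.

```python
import re


def find_glue_entries(ldif_text):
    """Return the DNs of entries whose structuralObjectClass is 'glue'.

    Simplified LDIF handling: entries are separated by blank lines;
    comment lines are skipped; folded lines and base64 values are
    not decoded.
    """
    glue_dns = []
    for block in re.split(r"\n\s*\n", ldif_text.strip()):
        dn = None
        is_glue = False
        for line in block.splitlines():
            if line.startswith("#") or ":" not in line:
                continue
            attr, _, value = line.partition(":")
            attr = attr.strip().lower()
            value = value.strip()
            if attr == "dn":
                dn = value
            elif attr == "structuralobjectclass" and value.lower() == "glue":
                is_glue = True
        if is_glue and dn is not None:
            glue_dns.append(dn)
    return glue_dns


# The glue record from this report, as dumped from the replica.
sample = """\
dn: dc=foo,dc=bar
structuralObjectClass: glue
objectClass: top
objectClass: glue
"""

print(find_glue_entries(sample))
```

An empty list for the database holding "DC=foo,DC=bar" would indicate that the
real entry, rather than a glue record, was stored.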