Re: (ITS#6472) Syncrepl : loop problem with moddn on a new node
>> Full_Name: Julien COMBES
>> Version: 2.4.21
>> OS: Debian 5.0.4
>> URL: ftp://ftp.openldap.org/incoming/its-syncrepl-loop-moddn.tar.bz2
>> Submission from: (NULL) (220.127.116.11)
>> I think I have found a loop problem with syncrepl replication with
>> 2.4.21, BDB 4.7.25 with all patches and hdb database. The problem
>> sometimes occurs when an entry is moved with "modrdn -s" into a node
>> which has just been created. I have reproduced the problem with the
>> creation of a node and a modrdn while the consumer was stopped and
>> then restarted afterwards.
>> The problem follows these steps :
>> - When it starts, the consumer does a request objectClass=* on the
>> provider :
>> Feb 12 09:09:19 ldapma24-ida01 slapd: conn=1007 op=1 SRCH
>> base="dc=my,dc=domain" scope=2 deref=0 filter="(objectClass=*)"
>> - The consumer finds the modrdn and tries to do this :
>> Feb 12 09:09:19 ldapra24-ida01 slapd:
>> - The consumer fails with these errors :
>> Feb 12 09:09:19 ldapra24-ida01 slapd: =>
>> Feb 12 09:09:19 ldapra24-ida01 slapd: <= hdb_dn2id: get failed:
>> DB_NOTFOUND: No matching key/data pair found (-30988)
>> Feb 12 09:09:19 ldapra24-ida01 slapd: hdb_modrdn:
>> newSup(ndn=ou=x,dc=my,dc=domain) not here!
>> Feb 12 09:09:19 ldapra24-ida01 slapd: send_ldap_result: conn=-1
>> op=0 p=0
>> Feb 12 09:09:19 ldapra24-ida01 slapd: send_ldap_result: err=32
>> text="new superior not found"
>> - The consumer retries the request objectClass=* on the provider and
>> loops on the problem. The replication doesn't work anymore.
>> To reproduce the problem, I have used these steps :
>> - start an empty provider
>> - ldapadd the entries in mydomain.ldif
>> ldapadd -x -h 127.0.0.1 -D "dc=my,dc=domain" -W -f mydomain.ldif
>> - start the consumer.
>> - stop the consumer when replication is finished
>> - ldapadd the new node
>> ldapadd -x -h 127.0.0.1 -D "dc=my,dc=domain" -W -f add.ldif
>> - modrdn -s
>> ldapmodrdn -x -h 127.0.0.1 -D "dc=my,dc=domain" -W -r -s
>> "cn=user1,ou=A,dc=my,dc=domain" "cn=user1"
>> - start the consumer
>> I have attached in its-syncrepl-loop-moddn.tar.bz2 :
>> - slapd.conf of provider and consumer
>> - log files of provider and consumer
>> - mydomain.ldif and add.ldif
> Thanks for the detailed report. The bug is confirmed, and it's not
> related to back-hdb, but seems to be syncrepl-related in general.
It's not clear to me where the issue is. What is the "right" sequence in
which the add of the new superior and the modrdn should be transmitted?
Should the provider operate differently, or should the consumer check all
syncrepl messages and try to rebuild the final state, instead of giving up
when the internal lookup for the new superior fails? Probably, a workaround
could be to perform the modrdn by creating the new superior as a glue
object, which eventually will be replaced by the actual add.
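
To make the glue-object idea concrete, here is a minimal toy sketch in Python. It is not slapd code and does not reflect how syncrepl is actually implemented: the ToyConsumer class, its methods and the create_glue flag are all hypothetical names, and the directory is just an in-memory dict. It only illustrates the ordering problem, assuming the consumer sees the modrdn before the add of the new superior.

```python
class ToyConsumer:
    """Hypothetical in-memory model of a consumer's DIT (not slapd code)."""

    def __init__(self):
        # Maps DN -> {"glue": bool}; a glue entry is only a placeholder.
        self.entries = {}

    def add(self, dn):
        # A real add creates the entry, or replaces an existing glue placeholder.
        self.entries[dn] = {"glue": False}

    def modrdn(self, dn, new_rdn, new_superior, create_glue=False):
        # Check the new superior first, as in the failing lookup from the logs.
        if new_superior not in self.entries:
            if not create_glue:
                raise LookupError("new superior not found")
            # Assumed workaround: create the missing superior as glue.
            self.entries[new_superior] = {"glue": True}
        entry = self.entries.pop(dn)
        self.entries[new_rdn + "," + new_superior] = entry


def replay(create_glue):
    """Replay the modrdn before the add of ou=x, as in the reported scenario."""
    consumer = ToyConsumer()
    for dn in ("dc=my,dc=domain", "ou=A,dc=my,dc=domain",
               "cn=user1,ou=A,dc=my,dc=domain"):
        consumer.add(dn)
    consumer.modrdn("cn=user1,ou=A,dc=my,dc=domain", "cn=user1",
                    "ou=x,dc=my,dc=domain", create_glue=create_glue)
    consumer.add("ou=x,dc=my,dc=domain")  # the actual add arrives later
    return consumer
```

In this toy model, replay(False) raises LookupError at the modrdn, which corresponds to the err=32 "new superior not found" result in the consumer log, while replay(True) completes and the later add replaces the glue placeholder. Whether a real consumer can safely do the same is exactly the open question.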