
Re: RE2.4



Howard Chu wrote:
Raphaël Ouazana-Sustowski wrote:
Hi,

Starting test050-syncrepl-multimaster ...
running defines.sh
Initializing server configurations...
Starting producer slapd on TCP/IP port 9011...
Using ldapsearch to check that producer slapd is running...
Inserting syncprov overlay on producer...
Starting consumer slapd on TCP/IP port 9012...
Using ldapsearch to check that consumer slapd is running...
Configuring syncrepl on consumer...
Starting consumer2 slapd on TCP/IP port 9013...
Using ldapsearch to check that consumer2 slapd is running...
Configuring syncrepl on consumer2...
Adding schema and databases on producer...
Using ldapadd to populate producer...
Waiting 20 seconds for syncrepl to receive changes...
Using ldapadd to populate consumer...
Waiting 20 seconds for syncrepl to receive changes...
Using ldapsearch to check that syncrepl received database changes...
Waiting 5 seconds for syncrepl to receive changes...
Waiting 5 seconds for syncrepl to receive changes...
Waiting 5 seconds for syncrepl to receive changes...
Waiting 5 seconds for syncrepl to receive changes...
Waiting 5 seconds for syncrepl to receive changes...
Waiting 5 seconds for syncrepl to receive changes...
ldapsearch failed (32)!
./scripts/test050-syncrepl-multimaster failed (exit 32)
make[2]: *** [hdb-yes] Error 32
make[2]: Leaving directory '/tmp/openldap/tests'
make[1]: *** [test] Error 2
make[1]: Leaving directory '/tmp/openldap/tests'
make: *** [test] Error 2

OK. Here's the apparent sequence of events:

server1 starts up

database is defined, syncrepl consumer starts, fails, retries

dc=example entries start getting added

serverX consumer connects to server1 and starts receiving entries

server1 consumer starts up and connects to serverX

dc=example entries continue to be added on server1

The problem is that server1's consumer has taken its snapshot of the ctxcsn while adds are still in progress, so serverX actually ends up with a newer ctxcsn. E.g.:

entry1 is added on server1
server1 consumer gets ctxcsn for entry1 (ctxcsn1)
entry2 is added on server1
serverX consumer connects, gets entry1 and entry2, resulting in ctxcsn2
server1 consumer connects to serverX, sends ctxcsn1
entry3 through entryN are added on server1
serverX sends server1 a refreshResult with ctxcsn2, and a presentlist of just entry1
server1 performs a delete_nonpresent based on ctxcsn1, ctxcsn2, and the presentlist, even though it has newer data. All the newer entries are deleted...
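To make the race easier to follow, here is a toy model of the sequence above in Python. It is only a sketch: CSNs are plain integers, each server is a dict, and the `delete_nonpresent` function here is a simplified stand-in written for illustration, not the actual syncrepl code.

```python
def delete_nonpresent(local, present, received):
    """Toy stand-in: keep only entries the provider named in this refresh."""
    keep = set(present) | set(received)
    return {dn: csn for dn, csn in local.items() if dn in keep}

# entry1 is added on server1; its consumer snapshots the ctxcsn (ctxcsn1).
server1 = {"entry1": 1}
cookie = max(server1.values())

# entry2 is added on server1, then serverX's consumer pulls both entries.
server1["entry2"] = 2
serverX = dict(server1)
serverX_ctxcsn = max(serverX.values())  # ctxcsn2

# entry3..entry5 are added on server1 while its consumer talks to serverX.
for i in range(3, 6):
    server1[f"entry{i}"] = i

# serverX answers the refresh that carried the stale cookie (ctxcsn1):
# entries newer than the cookie are sent, the rest go in the present list.
received = [dn for dn, csn in serverX.items() if csn > cookie]
present = [dn for dn, csn in serverX.items() if csn <= cookie]

after = delete_nonpresent(server1, present, received)
lost = sorted(set(server1) - set(after))
print("deleted on server1:", lost)
```

In this model the entries added after the snapshot are the ones that disappear, which matches the missing entries that ldapsearch reports as error 32.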


The fix seems to be to have the consumer re-fetch the current ctxcsn before deciding whether to do a delete_nonpresent pass. The previous patch to syncprov.c was irrelevant and will be reverted.
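A rough sketch of that idea, in Python rather than the real C code: before the delete pass, the consumer reads its current ctxcsn again instead of trusting the value it sent in the cookie. The function name, the callback, and the exact comparison rule (skip the pass when the local ctxcsn is newer than the provider's) are assumptions made for illustration.

```python
def should_delete_nonpresent(read_local_ctxcsn, provider_ctxcsn):
    """Decide on the delete pass using a freshly read local ctxcsn.

    read_local_ctxcsn is a hypothetical callback returning the current
    local ctxcsn; CSNs are modelled as plain integers.
    """
    current = read_local_ctxcsn()
    # Assumed rule: if we already hold newer data than the provider,
    # its present list cannot describe our newest entries, so skip.
    return current <= provider_ctxcsn

# Stale snapshot (ctxcsn1 = 1) versus the provider's ctxcsn2 = 2:
stale_decision = should_delete_nonpresent(lambda: 1, 2)

# Re-fetched value after entry3..entry5 were added locally (ctxcsn = 5):
fresh_decision = should_delete_nonpresent(lambda: 5, 2)

print(stale_decision, fresh_decision)
```

With the stale value the delete pass would run; with the re-fetched value it is skipped, so the locally added entries survive.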
--
-- Howard Chu
CTO, Symas Corp. http://www.symas.com
Director, Highland Sun http://highlandsun.com/hyc/
Chief Architect, OpenLDAP http://www.openldap.org/project/