Re: Replication problem with slurpd

Raphaël Ouazana-Sustowski wrote:

I have a problem with slurpd. I have one master which replicates to 2
slaves, one of which replicates in turn to one other slave.
M -> S1
 -> S/M -> S2
They are all OpenLDAP 2.2.11 (I tested once with 2.2.16 and had the same
problem).

Each night I delete all the entries of the 4 directories and re-populate
the master (M) with ldapadd. So slurpd on the master populates S1 and
the slave/master (S/M), and slurpd on S/M populates S2.
The problem is that when I check if all entries are really in the 4
directories, I can see that some are missing (about 0 to 4, rarely
more)! For example I can have:
M : 118821 entries
S1 : 118819
S/M : 118819
S2 : 118818
Of course I have nothing in .rej files, and see nothing in slurpd logs
when I activate them.

Moreover the problem seems to be pretty hard to reproduce. I tried to
reproduce it with test data but did not succeed. So I'm looking for
methods to analyse the problem. For example, how do I get proper logs
from slurpd?

See my comment in ITS#3421. As soon as you can identify which DNs didn't make it into the slaves, you should first look for the reason for the failure at the slapd level; for this purpose, a loglevel of 256 should suffice. All you need to do is track down which conn/op corresponds to their operation; they should show up as something like

conn=1 op=1 MOD dn="cn=James A Jones 1,ou=Alumni Association,ou=People,o=University of Michigan,c=US"

(ADD/MOD/DEL, depending on the type of operation they were subjected to at the time of the failure). Similar lines should appear on the master and on the slave; if they don't show up on the slave, replication didn't occur at all, and we then need to see whether the master says anything about it. Later on, you should see lines like

conn=1 op=1 RESULT tag=107 err=0 text=

This can tell you more about the reason for the failure, if any. The above, for instance, indicates success (err=0). The "conn=X op=Y" pair lets you match the operation description with its result. I'm not that familiar with slurpd's log; I see it tends to be very verbose (as slapd's is, but there I'm more accustomed to selecting the values I need), so I'm afraid you'll have to ask someone else, or experiment yourself. It's my understanding that you can get quite far by looking at the above first, to narrow down the problem.
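
If it helps, the conn/op matching described above could be scripted along these lines. This is only a rough sketch: the two regular expressions are assumptions derived from the two sample lines quoted above (real syslog lines carry a timestamp/host prefix, which the search simply skips), and the names match_operations and Outcome are made up for illustration.

```python
import re
from collections import namedtuple

# Patterns assumed from the two sample log lines quoted above.
OP_RE = re.compile(r'conn=(\d+) op=(\d+) (ADD|MOD|DEL) dn="([^"]*)"')
RESULT_RE = re.compile(r'conn=(\d+) op=(\d+) RESULT tag=(\d+) err=(\d+) text=(.*)')

# Hypothetical record type: one write operation together with its result.
Outcome = namedtuple("Outcome", "conn op kind dn err text")


def match_operations(lines):
    """Pair each ADD/MOD/DEL line with the RESULT line sharing its conn/op.

    Returns (outcomes, pending): outcomes is a list of Outcome records,
    pending maps (conn, op) to (kind, dn) for operations that never got
    a RESULT line in the given input.
    """
    pending = {}
    outcomes = []
    for line in lines:
        m = OP_RE.search(line)
        if m:
            conn, op, kind, dn = m.groups()
            pending[(int(conn), int(op))] = (kind, dn)
            continue
        m = RESULT_RE.search(line)
        if m:
            conn, op, _tag, err, text = m.groups()
            key = (int(conn), int(op))
            # RESULT lines for other operations (BIND, SRCH, ...) are ignored.
            if key in pending:
                kind, dn = pending.pop(key)
                outcomes.append(
                    Outcome(key[0], key[1], kind, dn, int(err), text.strip())
                )
    return outcomes, pending
```

Run it over the slapd log of the master and of each slave (for example match_operations(open(path)) with your own log path). Any DN whose Outcome has a nonzero err on a slave, or that shows up on the master but never on the slave, is a candidate for closer inspection.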


