Re: replication factored out of slapd



On Thu, 2005-12-15 at 10:38 +0100, Pierangelo Masarati wrote:

> I think it's a great idea; it would also solve the issue with syncrepl
> that we can't use it when the master is behind a firewall that doesn't
> allow LDAP connections inwards.
> 
> I'm setting up a 3 slapd test (test045) that checks this.  I note that
> it works just fine with the "consumer" overlay on the back-ldap and the
> "slurprov" overlay on the slave...

Now I think I was a bit too enthusiastic.  As a baseline it works, but
it's not resilient:
1) internal operations do not set o_version, so it stays 0 and results
in back-ldap using LDAPv2+
2) when the replica is down, syncrepl operations via back-ldap fail;
these failures seem to be handled incorrectly, since no replication
occurs even after the replica comes back up; replication starts again
only when the proxy is restarted, and then it involves only new
modifications; those made in between are "lost".  I guess a full
refresh should occur whenever syncrepl recovers from a transient
internal malfunction.
3) at startup, I see scary logs like

conn=1 fd=11 ACCEPT from IP=127.0.0.1:35943 (IP=127.0.0.1:9012)
conn=1 op=0 BIND dn="" method=128
conn=1 op=0 RESULT tag=97 err=0 text=
conn=1 op=1 SRCH base="dc=example,dc=com" scope=0 deref=0 filter="(objectClass=*)"
conn=1 op=1 SRCH attr=contextCSN
conn=1 op=1 SEARCH RESULT tag=101 err=32 nentries=0 text=
conn=2 fd=12 ACCEPT from IP=127.0.0.1:35945 (IP=127.0.0.1:9012)
conn=2 op=0 BIND dn="" method=128
conn=2 op=0 RESULT tag=97 err=0 text=
conn=2 op=1 SRCH base="dc=example,dc=com" scope=0 deref=0 filter="(objectClass=*)"
conn=2 op=1 SRCH attr=contextCSN
conn=2 op=1 SEARCH RESULT tag=101 err=32 nentries=0 text=
conn=2 op=2 DISCONNECT tag=120 err=2 text=unexpected data in PDU
do_search: get_ctrls failed
conn=2 fd=12 closed (operations error)
conn=3 fd=12 ACCEPT from IP=127.0.0.1:35946 (IP=127.0.0.1:9012)
conn=3 op=0 BIND dn="cn=Replica,dc=example,dc=com" method=128
conn=3 op=0 BIND dn="cn=Replica,dc=example,dc=com" mech=SIMPLE ssf=0
conn=3 op=0 RESULT tag=97 err=0 text=
conn=3 op=1 DISCONNECT tag=120 err=2 text=unexpected data in PDU
do_search: get_ctrls failed
conn=3 fd=12 closed (operations error)

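on the replica.  The first connection is just fine: the replica is
empty, so the contextCSN search returns noSuchObject (err=32).  I don't
quite understand the second: why re-run that search if the first one
already gave noSuchObject?  And the "unexpected data in PDU" error that
follows is even more puzzling.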
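
To make the failing operations in a log like the one above easier to
spot, here is a minimal sketch of a filter that keeps only the result
lines with a non-zero err code.  It is not part of OpenLDAP or of the
test; the names failed_results, RESULT_RE and LDAP_RESULT_NAMES are
made up for illustration, and the regular expression only assumes the
line layout shown in this message.

```python
import re

# Result codes that appear in the log above, named as in RFC 4511.
LDAP_RESULT_NAMES = {0: "success", 2: "protocolError", 32: "noSuchObject"}

# Assumed layout, taken from the log lines quoted in this message, e.g.
#   conn=2 op=2 DISCONNECT tag=120 err=2 text=unexpected data in PDU
#   conn=1 op=1 SEARCH RESULT tag=101 err=32 nentries=0 text=
RESULT_RE = re.compile(
    r"^conn=(?P<conn>\d+) op=(?P<op>\d+) (?P<kind>[A-Z ]+?) "
    r"tag=(?P<tag>\d+) err=(?P<err>\d+) .*?text=(?P<text>.*)$"
)


def failed_results(lines):
    """Return (conn, op, kind, err, name, text) for each non-zero err."""
    failures = []
    for line in lines:
        match = RESULT_RE.match(line.strip())
        if match is None:
            continue  # not a result line (ACCEPT, BIND, SRCH, ...)
        err = int(match.group("err"))
        if err == 0:
            continue  # successful operation
        failures.append((
            int(match.group("conn")),
            int(match.group("op")),
            match.group("kind"),
            err,
            LDAP_RESULT_NAMES.get(err, "other"),
            match.group("text"),
        ))
    return failures
```

Fed the log above, this would leave the two noSuchObject search
results and the two protocolError disconnects, which are the lines
discussed here.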

I'll commit the test anyway, disabled by default, so the willing
can have a look at it and help make it work.

p.




Ing. Pierangelo Masarati
Responsabile Open Solution

SysNet s.n.c.
Via Dossi, 8 - 27100 Pavia - ITALIA
http://www.sys-net.it
------------------------------------------
Office:   +39.02.23998309          
Mobile:   +39.333.4963172
Email:    pierangelo.masarati@sys-net.it
------------------------------------------