Re: OpenLDAP 2.4.15 : CSN too old when using 4-way multimaster
- To: email@example.com
- Subject: Re: OpenLDAP 2.4.15 : CSN too old when using 4-way multimaster
- From: Howard Chu <firstname.lastname@example.org>
- Date: Mon, 02 Mar 2009 03:18:23 -0800
- Cc: email@example.com
- In-reply-to: <firstname.lastname@example.org>
- References: <email@example.com> <5C6FF28BD1055F1890ECA777@[192.168.1.199]> <firstname.lastname@example.org>
- User-agent: Mozilla/5.0 (X11; U; Linux x86_64; rv:1.9.1b3pre) Gecko/20090227 SeaMonkey/2.0a1pre Firefox/3.0.3
Adrien Futschik wrote:
Considering that M1 & M3 are on the same server and therefore have exactly the
same time, if this were a time-related problem, I shouldn't get any "CSN too
old" messages between M1 & M3 and M2 & M4, should I?
I have also noticed that when M1 gets a new entry and passes it to M2 & M3 & M4,
when M2 & M3 & M4 receive it, they also pass it to M2 & M3 & M4! I don't
understand why this happens, but it looks very much like this is what's
happening, because sometimes M2 will have passed it to M4 before M4 has
actually received the add order from M1.
I have therefore noticed that sometimes entries sent from M1 are
received in the wrong order by other masters, and therefore some entries may
be skipped!
Yes, that makes sense. The CSN check assumes changes will always be received
in the same order they were sent from the provider. Obviously in this case
this assumption is wrong. You should submit an ITS for this.
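To make the ordering assumption concrete, here is a deliberately simplified model of such a check. It is a sketch only, not slapd's actual code: CSNs are reduced to hypothetical (timestamp, server_id) pairs, and the Consumer class and its receive method are invented for illustration.

```python
# Simplified model of a "newest CSN seen per provider" check.
# Not slapd's implementation: real CSNs carry more fields than this.

class Consumer:
    def __init__(self, name):
        self.name = name
        self.newest = {}    # server_id -> newest timestamp accepted so far
        self.entries = []   # entries that were actually applied

    def receive(self, entry, csn):
        timestamp, sid = csn
        # A change is accepted only if it is newer than anything
        # already accepted from the same originating server.
        if timestamp <= self.newest.get(sid, -1):
            return "CSN too old"    # the change is skipped
        self.newest[sid] = timestamp
        self.entries.append(entry)
        return "applied"


m3 = Consumer("M3")
# M1 (server_id 1) adds M1client1 at time 1, then M1client2 at time 2.
# Here M3 receives the second add first, e.g. relayed by another master.
results = [
    m3.receive("cn=M1client2", (2, 1)),
    m3.receive("cn=M1client1", (1, 1)),
]
print(results, m3.entries)
```

In this model the late-arriving older change is dropped even though it was never applied, which is the kind of skipped entry described in this thread.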
This problem was discussed on the -devel list back in 2007; the code ought to
be using a spanning tree/routing algorithm to ensure that when multiple routes
exist for propagating a change, the change is delivered exactly once.
Unfortunately no one has spent any further time on this issue since then.
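As a rough illustration of the idea (not the design discussed on -devel, and not anything that exists in slapd), the sketch below builds a breadth-first spanning tree over a full mesh of four masters and compares it with every master forwarding to every peer. The function names and the mesh layout are assumptions made for the example.

```python
from collections import deque

def spanning_tree(graph, root):
    """Return {node: parent} for a breadth-first spanning tree rooted at root."""
    parent = {root: None}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for peer in graph[node]:
            if peer not in parent:
                parent[peer] = node
                queue.append(peer)
    return parent

def tree_deliveries(graph, origin):
    """Copies received per master when forwarding only along tree edges."""
    counts = {node: 0 for node in graph}
    for node, via in spanning_tree(graph, origin).items():
        if via is not None:
            counts[node] += 1
    return counts

def flood_deliveries(graph, origin):
    """Copies received per master when every master forwards to all its peers
    (assumes a connected graph; copies sent back to the origin are ignored)."""
    counts = {node: 0 for node in graph}
    for sender in graph:
        for peer in graph[sender]:
            if peer != origin:
                counts[peer] += 1
    return counts

masters = ["M1", "M2", "M3", "M4"]
mesh = {m: [p for p in masters if p != m] for m in masters}

tree = tree_deliveries(mesh, "M1")
flood = flood_deliveries(mesh, "M1")
print("tree :", tree)
print("flood:", flood)
```

With the tree, each of M2, M3 and M4 receives the change from M1 once; with flooding, each receives three copies, which can arrive in any order.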
Here is an example:
I add cn=M1client1 & cn=M1client2 on M1.
M1client1 & M1client2 are successfully replicated on M2 & M4, but on M3 only
M1client2 is inserted and I am getting a "CSN too old" message for M1client1.
I don't have the logfile here; I'll send extracts this Monday.
I am also getting these messages from time to time:
=> bdb_idl_insert_key: c_put id failed: DB_LOCK_DEADLOCK: Locker killed to
resolve a deadlock (-30994)
=> bdb_dn2id_add 0x1e40: parent (ou=clients,o=edf,c=fr) insert failed: -30994
I guess this is because all 4 masters receive entries that have the same
parent: ou=clients,o=edf,c=fr, and that happens if two entries are "inserted"
DB_LOCK_DEADLOCK messages can always be ignored; back-bdb always retries when
it hits a deadlock.
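The general retry-on-deadlock pattern looks roughly like the sketch below. This is not back-bdb's code (which is C against the Berkeley DB API); DeadlockError, run_with_retry, add_entry and the attempt cap are all hypothetical names and choices made for illustration.

```python
class DeadlockError(Exception):
    """Stand-in for Berkeley DB's DB_LOCK_DEADLOCK (-30994) result."""

def run_with_retry(txn_body, max_attempts=10):
    """Run txn_body, starting over whenever it is chosen as a deadlock victim.
    The cap on attempts is an assumption of this sketch."""
    for attempt in range(1, max_attempts + 1):
        try:
            return txn_body(attempt)
        except DeadlockError:
            # The transaction was aborted to break the deadlock;
            # nothing was committed, so it is safe to try again.
            continue
    raise RuntimeError("gave up after repeated deadlocks")

def add_entry(attempt):
    # Simulate losing the lock race on the first two attempts.
    if attempt < 3:
        raise DeadlockError()
    return f"added on attempt {attempt}"

outcome = run_with_retry(add_entry)
print(outcome)
```

The deadlock is reported each time it occurs, but the operation still completes once a retry gets through, which is why the log message alone does not indicate a lost write.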
-- Howard Chu
CTO, Symas Corp. http://www.symas.com
Director, Highland Sun http://highlandsun.com/hyc/
Chief Architect, OpenLDAP http://www.openldap.org/project/