Re: null_callbacks after initial sync

Howard Chu wrote:
> Nick Geron wrote:
>> We're now thinking some of our issues may be attributable to time
>> granularity issues.  We're seeing missing information on the consumer if
>> multiple successive writes are attempted via a script.  If we slow down
>> to human speed or insert sleeps in our test code, this gets a little
>> better.  I see that A.2.4 N-Way MultiMaster Replication notes that
>> entryCSNs now record with microseconds, but does this apply to mirrors
>> as well?
> CSNs were extended to microsecond resolution only for the benefit of
> conflict resolution. For all other purposes, the changecount field
> ensures sufficient granularity.
In that case, why do we see any difference in propagation between
scripted (quick) updates and hand/command line (slow) modifications?  Or
are you simply saying time is not the issue?

For example - manipulating one particular entry:

1) update server 1 adding 1 attribute = propagates to second server
* wait a few seconds
2) update server 1 adding 4 attributes = only the first of the four
propagates to second server

After waiting a second or so, another successful operation on the
'write' server will propagate all modifications over to the second
server as expected.  This behavior is why we suspected a time
granularity issue.  Note that relying on this doesn't work for us
(or, I would expect, for others), as there is no guarantee that another
operation on the 'write' server will occur to push the pending
changes for the current entry across.
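
One way to check whether time granularity is involved at all is to
compare the entryCSN of the entry on both servers after a scripted
burst.  A rough Python sketch follows, assuming the 2.4 CSN layout of
timestamp#count#sid#mod with hexadecimal count, SID and mod fields.
parse_csn and is_newer are hypothetical helpers, not part of any
OpenLDAP tooling, and the sample CSN values are made up:

```python
from typing import NamedTuple


class CSN(NamedTuple):
    """Fields of a CSN, in comparison order (assumed 2.4 layout)."""
    timestamp: str  # e.g. "20080108201530.123456Z", fixed width
    count: int      # change counter within one timestamp
    sid: int        # server ID
    mod: int        # modification number


def parse_csn(text: str) -> CSN:
    """Split 'timestamp#count#sid#mod' into its four fields."""
    timestamp, count, sid, mod = text.strip().split("#")
    return CSN(timestamp, int(count, 16), int(sid, 16), int(mod, 16))


def is_newer(a: str, b: str) -> bool:
    """True if CSN a sorts after CSN b (timestamp first, then count)."""
    return parse_csn(a) > parse_csn(b)


if __name__ == "__main__":
    # Made-up values: same timestamp, different change counter.
    provider = "20080108201530.000000Z#000003#000#000000"
    consumer = "20080108201530.000000Z#000000#000#000000"
    print("consumer is behind:", is_newer(provider, consumer))
```

If the two values differ only in the count field, the writes landed
within the same timestamp and were told apart by the change counter,
which would fit the explanation that time resolution is not the issue.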
>> Can I set up a two-node N-Way?
> "2" is certainly a valid value of "N".
Well, there's that developer 'charm' I've been reading throughout years
of archives.  Since the admin doc makes a distinction between the 'hybrid
configuration' of MirrorMode and N-Way Multi-Master, I was looking for
clarification of the difference between the two implementations.  Not a
jackass comment.

> Syncrepl doesn't write session logs. Read RFC4533.
I'll look into it.  Thanks.

Switching gears, what would the devs say 2.4.7 is capable of in
operations per second?  I'm seeing a number of aborts when
testing under high load.  The latest came from running scripted
ldapsearches and ldapmodifies, which resulted in a mutex error (or so I
am told by one of our developers).  The test did the following:


1) adding about 100 attributes to an entry
2) diffing the output of ldapsearch between the two nodes in loop
3) once synced, grabbing the attributes, shoving them in a temp file
with delete instructions and using that with ldapmodify.
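
For anyone wanting to reproduce this, steps 1 and 3 can be approximated
with something like the Python sketch below.  It only builds the LDIF
text; it does not talk to a server.  The DN, the attribute name and the
helper names (build_modify_ldif, missing_values) are placeholders I made
up for illustration, and the values are assumed to be plain ASCII so no
base64 encoding is needed:

```python
def build_modify_ldif(dn, op, attr, values):
    """Return an LDIF modify record that adds or deletes values of attr."""
    if op not in ("add", "delete", "replace"):
        raise ValueError("unsupported modify operation: %s" % op)
    lines = ["dn: %s" % dn, "changetype: modify", "%s: %s" % (op, attr)]
    lines += ["%s: %s" % (attr, value) for value in values]
    lines.append("-")  # terminates this modification
    return "\n".join(lines) + "\n"


def missing_values(provider_values, consumer_values):
    """Values present on the provider but not yet on the consumer."""
    seen = set(consumer_values)
    return [value for value in provider_values if value not in seen]


if __name__ == "__main__":
    dn = "cn=test,dc=example,dc=com"          # placeholder DN
    values = ["test value %d" % i for i in range(100)]
    # Step 1: LDIF that adds about 100 values to the entry.
    print(build_modify_ldif(dn, "add", "description", values))
    # Step 3: LDIF with delete instructions for the same values.
    print(build_modify_ldif(dn, "delete", "description", values))
```

Each record can be written to a temp file and fed to ldapmodify with -f.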

I compiled with debugging on, which results in an abort with
"connection.c: 676: connection_state_closing: Assertion 'c_struct_state
== 0x02' failed" logged.

When tracing is employed (using strace), I have yet to see this behavior.