
Re: null_callbacks after initial sync



Howard Chu wrote:
> Nick Geron wrote:
>> We're now thinking some of our issues may be attributable to time
>> granularity issues.  We're seeing missing information on the consumer if
>> multiple successive writes are attempted via a script.  If we slow down
>> to human speed or insert sleeps in our test code, this gets a little
>> better.  I see that A.2.4 N-Way MultiMaster Replication notes that
>> entryCSNs now record with microseconds, but does this apply to mirrors
>> as well?
>
> CSNs were extended to microsecond resolution only for the benefit of
> conflict resolution. For all other purposes, the changecount field
> ensures sufficient granularity.
In that case, why do we see any difference in propagation between
scripted (quick) updates and hand/command line (slow) modifications?  Or
are you simply saying time is not the issue?
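
For reference, a minimal sketch of why the changecount field would keep
CSNs ordered even when two writes land in the same microsecond.  It
assumes the 2.4-style CSN layout time#count#sid#mod with hex-encoded
count, sid and mod fields; the two sample CSN values are made up for
illustration and are not taken from our servers.

```python
from typing import NamedTuple


class CSN(NamedTuple):
    timestamp: str  # e.g. "20080102153045.123456Z" (fixed width, sorts as text)
    count: int      # changecount, hex in the string form
    sid: int        # server ID, hex in the string form
    mod: int        # modification number, hex in the string form


def parse_csn(text: str) -> CSN:
    """Split a CSN string of the assumed form time#count#sid#mod."""
    timestamp, count, sid, mod = text.split("#")
    return CSN(timestamp, int(count, 16), int(sid, 16), int(mod, 16))


# Two made-up CSNs generated within the same microsecond on one server.
first = parse_csn("20080102153045.123456Z#000000#000#000000")
second = parse_csn("20080102153045.123456Z#000001#000#000000")

# Tuple comparison checks the timestamp first, then the changecount,
# so the second write still sorts after the first.
print(first < second)
```

If that layout is right, identical timestamps alone should not cause
updates to be skipped, which is consistent with time not being the issue.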

For example - manipulating one particular entry:

1) update server 1 adding 1 attribute = propagates to second server
* wait a few seconds
2) update server 1 adding 4 attributes = only the first of the four
propagates to second server

After waiting a second or so, another successful operation on the
'write' server will propagate all of the outstanding modifications to
the second server as expected.  This behavior is why we suspected a time
granularity issue.  Relying on it doesn't work for us (or, I would
expect, for others), since there is no guarantee that another operation
will ever occur on the 'write' server to push the pending changes for
the current entry.
>
>> Can I setup a two node N-Way?
>
> "2" is certainly a valid value of "N".
Well, there's that developer 'charm' I've been reading throughout years
of archives.  Since the admin doc makes a distinction between the 'hybrid
configuration' of MirrorMode and N-Way Multi-Master, I was looking more
for clarification of the difference between the two implementations.
Not a jackass comment.

> Syncrepl doesn't write session logs. Read RFC4533.
>
I'll look into it.  Thanks.


Switching gears, what would the devs say 2.4.7 is capable of in
operations per second?  I'm seeing a number of aborts when
testing under high load.  The latest came from running scripted
ldapsearches and ldapmodifies, which resulted in a mutex error (or so I
am told by one of our developers).

Specifically:

1) adding about 100 attributes to an entry
2) diffing the output of ldapsearch between the two nodes in a loop
3) once synced, grabbing the attributes, shoving them in a temp file
with delete instructions and using that with ldapmodify.
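
The steps above can be sketched roughly as follows.  This is only an
illustration of the test, not our actual script: the DN, the attribute
name and the generated values are hypothetical, it treats the "100
attributes" as 100 values of one multi-valued attribute, and it assumes
the values are plain ASCII that needs no LDIF escaping.  The generated
LDIF would be written to a temp file and passed to ldapmodify with -f.

```python
from typing import List


def modify_ldif(dn: str, op: str, attr: str, values: List[str]) -> str:
    """Build one LDIF modify record that adds or deletes values of attr."""
    if op not in ("add", "delete"):
        raise ValueError("op must be 'add' or 'delete'")
    lines = [f"dn: {dn}", "changetype: modify", f"{op}: {attr}"]
    lines += [f"{attr}: {value}" for value in values]
    lines.append("-")  # terminates the modification within the record
    return "\n".join(lines) + "\n"


def in_sync(provider_out: str, consumer_out: str) -> bool:
    """Compare two ldapsearch outputs, ignoring line order and blank lines."""
    def normalize(text: str) -> List[str]:
        return sorted(line for line in text.splitlines() if line.strip())
    return normalize(provider_out) == normalize(consumer_out)


# Hypothetical test entry and values.
DN = "uid=test,dc=example,dc=com"
VALUES = [f"test-value-{i}" for i in range(100)]

add_ldif = modify_ldif(DN, "add", "description", VALUES)        # step 1
delete_ldif = modify_ldif(DN, "delete", "description", VALUES)  # step 3
```

Step 2 would call in_sync() on the ldapsearch output captured from each
node, repeating until it returns True before sending delete_ldif.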

I compiled with debugging on, which results in an abort with
"connection.c: 676: connection_state_closing: Assertion 'c_struct_state
== 0x02' failed" logged.

When tracing is employed (using strace) I have yet to see this behavior.