
Re: Syncrepl-Consumer deletes entries



Joachim Hergeth wrote:
The initial synchronization of the consumer works as expected. All LDAP
entries are copied to the consumer directory. But after some time,
usually when users log in to the Samba running with the provider LDAP,
nearly 50% of all LDAP entries on the consumer are deleted. This happens
without any change on the provider LDAP!

I am experiencing a similar problem on a number of replicas: database entries are being deleted when they shouldn't be. This is disrupting users and we'd really like to get it cleared up.


Here's the environment:
OpenLDAP 2.3.32 running on Debian 3.1 (Sarge)
compiled with sync logging patch discussed about 4 months ago
loglevel config sync on all servers
BDB 4.2 backend
Syncrepl replication all round
A "master" server (com)
      - holds the master copy of the database
A number of servers that replicate directly from com
An "intermediate" server (wwsv04) that
      - is on the same LAN and subnet as com
      - replicates from com
      - acts as provider for all other servers
88 servers/replicas in total
Approx 9000 records
All replicas are supposed to be complete copies
Nothing particularly fancy or clever going on

I have spent today dissecting the logs from two incidents this week in which entries were erroneously deleted. Although the circumstances of the two incidents are quite different, from examining the logs I believe it is the same thing happening in each case.

Incident 1:
A test server ran out of space in /var. After cleaning up, rebooting and doing db_recover, 244 entries were erroneously deleted. This server is on the same building network as com and wwsv04 (different subnet), and replicates from wwsv04.


Incident 2:
A production server on the other side of the world lost contact with its provider, and after reconnecting, erroneously deleted 249 entries. This server is connected to the main network via an OpenVPN tunnel across the internet, and replicates from com. (The really odd thing about this is that the network link came back up over an hour before the first timestamp of this incident, but that is most likely a separate issue.)
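
For anyone wanting to check a replica for the same symptom, here is a rough sketch of how the missing entries can be listed by comparing two LDIF dumps (for example from slapcat), one taken on the provider and one on the consumer. It is not the method used for the attached analysis; the function names are made up for illustration, and lower-casing the DN is only a crude stand-in for proper DN normalisation.

```python
import base64


def read_dns(ldif_text):
    """Return the set of DNs found in LDIF text, lower-cased for comparison."""
    # Unfold LDIF continuation lines: a line that starts with a single
    # space continues the previous line.
    unfolded = []
    for line in ldif_text.splitlines():
        if line.startswith(" ") and unfolded:
            unfolded[-1] += line[1:]
        else:
            unfolded.append(line)

    dns = set()
    for line in unfolded:
        if line.startswith("dn:: "):
            # "dn::" marks a base64-encoded DN.
            dns.add(base64.b64decode(line[5:]).decode("utf-8").lower())
        elif line.startswith("dn: "):
            dns.add(line[4:].strip().lower())
    return dns


def missing_on_consumer(provider_ldif, consumer_ldif):
    """Return the sorted DNs present on the provider but absent on the consumer."""
    return sorted(read_dns(provider_ldif) - read_dns(consumer_ldif))
```

Typical use would be to read both dump files into strings and print the result of missing_on_consumer(); an empty list means the consumer holds every DN the provider does.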


I have attached three files (I hope they're not too big):
   - log_analysis.ods - an OpenOffice spreadsheet containing
        correlated log entries and my comments for both incidents
   - incident_1.tgz - syslogs relating to incident 1
                    - and a CSV version of my analysis
                      of incident 1
   - incident_2.tgz - likewise for incident 2



Replication configs are as follows:
---------------------------------
com
---
# Syncrepl provider

overlay syncprov
syncprov-checkpoint 10 5
syncprov-sessionlog 100
syncprov-reloadhint TRUE
-------------------------------------
wwsv04
------
# Syncrepl provider

overlay syncprov
syncprov-checkpoint 10 5
syncprov-sessionlog 100
syncprov-reloadhint TRUE

# syncrepl consumer

syncrepl rid=123
    provider=ldap://com.example.co.nz
    type=refreshAndPersist
    searchbase="dc=example,dc=co,dc=nz"
    scope=sub
    schemachecking=off
    bindmethod=simple
    binddn="cn=root,dc=example,dc=co,dc=nz"
    credentials=secret
    retry=5,5,30,5,60,5,300,+
-------------------------------------
zzsv01
------
syncrepl rid=123
    provider=ldap://wwsv04.example.co.nz
    type=refreshAndPersist
    searchbase="dc=example,dc=co,dc=nz"
    scope=sub
    schemachecking=off
    bindmethod=simple
    binddn="cn=root,dc=example,dc=co,dc=nz"
    credentials=secret
    retry=5,5,30,5,60,5,300,+
-------------------------------------
nosv01
------
syncrepl rid=123
    provider=ldap://com.example.co.nz
    type=refreshAndPersist
    searchbase="dc=example,dc=co,dc=nz"
    scope=sub
    schemachecking=off
    bindmethod=simple
    binddn="cn=root,dc=example,dc=co,dc=nz"
    credentials=secret
    retry=5,5,30,5,60,5,300,+
-------------------------------------
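
To make the retry setting in the consumer configs above easier to read, here is a small sketch that expands it into the sequence of waits it describes. It assumes the value is a list of (interval in seconds, number of retries) pairs with "+" meaning retry indefinitely, as I read slapd.conf(5), and that commas are accepted as separators as the configs above rely on. The function names are my own and are not part of OpenLDAP.

```python
from itertools import islice


def parse_retry(spec):
    """Split a syncrepl retry value into (interval_seconds, count) pairs.

    count is None where the config says '+' (retry indefinitely).
    """
    tokens = spec.replace(",", " ").split()
    if len(tokens) % 2 != 0:
        raise ValueError("retry value must be interval/count pairs")
    pairs = []
    for interval, count in zip(tokens[0::2], tokens[1::2]):
        pairs.append((int(interval), None if count == "+" else int(count)))
    return pairs


def retry_delays(spec):
    """Yield the successive waits, in seconds, between reconnect attempts."""
    for interval, count in parse_retry(spec):
        if count is None:
            while True:
                yield interval
        for _ in range(count):
            yield interval


if __name__ == "__main__":
    spec = "5,5,30,5,60,5,300,+"
    print(parse_retry(spec))
    # Total time spent in the finite stages (first 15 attempts).
    print(sum(islice(retry_delays(spec), 15)))
```

Under that reading, the finite stages add up to 475 seconds (5x5 + 5x30 + 5x60), after which a consumer that still cannot reach its provider tries again every 300 seconds.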



--
Lesley Walker
Linux Systems Administrator
Opus International Consultants Ltd
Email lesley.walker@opus.co.nz
Tel +64 4 471 7002, Fax +64 4 473 3017
http://www.opus.co.nz
Level 9  Majestic Centre, 100 Willis Street, PO Box 12 343
Wellington, New Zealand

Attachment: log_analysis.ods
Description: Binary data

Attachment: incident_1.tgz
Description: application/compressed

Attachment: incident_2.tgz
Description: application/compressed