Issue 8672 - syncrepl with openldap 2.4.{40,42} and mdb backend
Summary: syncrepl with openldap 2.4.{40,42} and mdb backend
Status: VERIFIED SUSPENDED
Alias: None
Product: OpenLDAP
Classification: Unclassified
Component: slapd
Version: unspecified
Hardware: All All
: --- normal
Target Milestone: ---
Assignee: OpenLDAP project
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2017-06-08 13:06 UTC by remy.dernat@umontpellier.fr
Modified: 2020-03-23 15:06 UTC

Description remy.dernat@umontpellier.fr 2017-06-08 13:06:21 UTC
Full_Name: Dernat Rémy
Version: 2.4.40+dfsg-1+deb8u3 and 2.4.42+dfsg-2ubuntu3.2
OS: Debian and Ubuntu
URL: ftp://ftp.openldap.org/incoming/
Submission from: (NULL) (162.38.181.76)


Hi,

Since I moved my OpenLDAP installation to another server, replication between
my 2 LDAP servers through syncrepl no longer works. I tested many things.
Finally, I backed up the database and restored it on another server (so I now
have 3 LDAP servers), and it worked.

After many more tests, I was able to narrow down the source of this issue: with
an HDB backend on the provider, replication works; with an MDB backend on the
provider, it does not.

The provider (with MDB) logs messages like these:
=============================================================================
Jun  8 12:22:03 ldap2 slapd[15083]: send_search_entry: conn 20855  ber write failed.
Jun  8 12:24:03 ldap2 slapd[15083]: send_search_entry: conn 20888  ber write failed.
...
=============================================================================
Meanwhile, on the slave, I get:
=============================================================================
Jun  8 09:33:32 ldap3-bis slapd[88560]: do_syncrepl: rid=010 rc -1 retrying
Jun  8 09:38:32 ldap3-bis slapd[88560]: do_syncrep2: rid=010 got search entry without Sync State control (dc=my,dc=domain,dc=com)
Jun  8 09:38:32 ldap3-bis slapd[88560]: do_syncrepl: rid=010 rc -1 retrying
Jun  8 09:43:32 ldap3-bis slapd[88560]: do_syncrep2: rid=010 got search entry without Sync State control (dc=my,dc=domain,dc=com)
...
=============================================================================

I am able to reproduce the bug quite easily.


I added only two schemas, autofs and quota, with:
=============================================================================
ldapadd -Q -Y EXTERNAL -H ldapi:/// -f autofs.ldif
ldapadd -Q -Y EXTERNAL -H ldapi:/// -f quota.ldif
=============================================================================

I also loaded the accesslog module on the provider (creating a dedicated
directory for accesslog(*)) and the syncprov module on both sides.

=============================================================================
(*)
mkdir /var/lib/ldap/accesslog
chown openldap:openldap /var/lib/ldap/accesslog
=============================================================================

Here is what I am doing to set up syncrepl:

ldapadd -Q -Y EXTERNAL -H ldapi:/// -f file.ldif


On the provider side, file.ldif (replication.ldif) is as follows; I replace HDB
with MDB to test with an MDB backend:

=============================================================================
#Load the syncprov and accesslog modules.
dn: cn=module{0},cn=config
changetype: modify
add: olcModuleLoad
olcModuleLoad: syncprov
-
add: olcModuleLoad
olcModuleLoad: accesslog

# Accesslog database definitions
dn: olcDatabase={2}hdb,cn=config
objectClass: olcDatabaseConfig
objectClass: olcHdbConfig
olcDatabase: {2}hdb
olcDbDirectory: /var/lib/ldap/accesslog
olcSuffix: cn=accesslog
olcRootDN: cn=XXXX,dc=YYYY,dc=ZZ
olcDbIndex: default eq
olcDbIndex: entryCSN,objectClass,reqEnd,reqResult,reqStart

# Accesslog db syncprov.
dn: olcOverlay=syncprov,olcDatabase={2}hdb,cn=config
changetype: add
objectClass: olcOverlayConfig
objectClass: olcSyncProvConfig
olcOverlay: syncprov
olcSpNoPresent: TRUE
olcSpReloadHint: TRUE

# syncrepl Provider for primary db
dn: olcOverlay=syncprov,olcDatabase={1}hdb,cn=config
changetype: add
objectClass: olcOverlayConfig
objectClass: olcSyncProvConfig
olcOverlay: syncprov
olcSpNoPresent: TRUE

# accesslog overlay definitions for primary db
dn: olcOverlay=accesslog,olcDatabase={1}hdb,cn=config
objectClass: olcOverlayConfig
objectClass: olcAccessLogConfig
olcOverlay: accesslog
olcAccessLogDB: cn=accesslog
olcAccessLogOps: writes
olcAccessLogSuccess: TRUE
# scan the accesslog DB every day, and purge entries older than 7 days
olcAccessLogPurge: 07+00:00 01+00:00
=============================================================================


On the consumer (with a unique rid, and again replacing HDB with MDB to test
with an MDB backend), file.ldif looks like:

=============================================================================
#Load the syncprov module.
dn: cn=module{0},cn=config
changetype: modify
add: olcModuleLoad
olcModuleLoad: syncprov

# syncrepl specific indices
dn: olcDatabase={1}hdb,cn=config
changetype: modify
add: olcDbIndex
olcDbIndex: entryUUID eq
-
add: olcSyncRepl
olcSyncRepl: rid=1
    provider=ldaps://consumer.mydomain.fr
    bindmethod=simple
    binddn="cn=XXXXX,dc=mydomain,dc=fr"
    credentials=XXXX
    searchbase="dc=mydomain,dc=fr" logbase="cn=accesslog"
    logfilter="(&(objectClass=auditWriteObject)(reqResult=0))"
    schemachecking=off
    type=refreshAndPersist retry="60 +"
    syncdata=accesslog
-
add: olcUpdateRef
olcUpdateRef: ldaps://consumer.mydomain.fr

=============================================================================
On the provider I am using:

=============================================================================
lsb_release -a
No LSB modules are available.
Distributor ID: Debian
Description:    Debian GNU/Linux 8.8 (jessie)
Release:        8.8
Codename:       jessie

dpkg -l slapd
ii  slapd                   2.4.40+dfsg-1+de amd64            OpenLDAP server (slapd)

=============================================================================

One slave has the same configuration; on the other slave, I am using:

=============================================================================
Distributor ID: Ubuntu
Description:    Ubuntu 16.04.2 LTS
Release:        16.04
Codename:       xenial

dpkg -l slapd
ii  slapd                   2.4.42+dfsg-2ubu amd64            OpenLDAP server (slapd)

=============================================================================


Best regards,
Rémy

Comment 1 Quanah Gibson-Mount 2017-06-08 13:35:56 UTC
--On Thursday, June 08, 2017 2:06 PM +0000 remy.dernat@umontpellier.fr 
wrote:


> Best regards,
> Rémy

Hi,

Thanks for the report.  Please confirm you can reproduce the issue with 
OpenLDAP 2.4.45.  If you can, then please provide your /initial/ cn=config 
database prior to doing the modifications as well.

Thanks!

--Quanah

--

Quanah Gibson-Mount
Product Architect
Symas Corporation
Packaged, certified, and supported LDAP solutions powered by OpenLDAP:
<http://www.symas.com>


Comment 2 Ryan Tandy 2017-06-11 01:57:35 UTC
On Thu, Jun 08, 2017 at 01:06:21PM +0000, remy.dernat@umontpellier.fr wrote:
>I am able to reproduce the bug quite easily.

I'm afraid I have not been able to. I followed the steps you posted 
(with s/hdb/mdb/) with both servers running slapd 2.4.40+dfsg-1+deb8u3 
and syncrepl seems to work fine.

https://bugs.debian.org/798964 seems quite similar. Rémy, do you think 
this problem started for you right after you upgraded to +deb8u3?

Comment 3 remy.dernat@umontpellier.fr 2017-06-12 13:38:38 UTC
Hi Ryan, Quanah,


On 11/06/2017 at 03:57, Ryan Tandy wrote:
> On Thu, Jun 08, 2017 at 01:06:21PM +0000, remy.dernat@umontpellier.fr 
> wrote:
>> I am able to reproduce the bug quite easily.
>
> I'm afraid I have not been able to. I followed the steps you posted 
> (with s/hdb/mdb/) with both servers running slapd 2.4.40+dfsg-1+deb8u3 
> and syncrepl seems to work fine.
>
> https://bugs.debian.org/798964 seems quite similar. Rémy, do you think 
> this problem started for you right after you upgraded to +deb8u3?

Indeed, I am afraid that this upgrade caused the issue, or perhaps rather 
the previous one: `2.4.40+dfsg-1+deb8u2` was installed on my system in 
February, and syncrepl has not worked since then.

As for testing with the latest source, I cannot reproduce the bug there 
because I get a strange error, apparently caused by leftovers from my 
previous install:
```
LDAP vendor version mismatch: library 20440, header 20445
```
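
As a side note on the message above: assuming the two numbers are OpenLDAP `LDAP_VENDOR_VERSION` values encoded as major*10000 + minor*100 + patch, they can be decoded with a small shell helper. The function name `decode_vendor_version` is made up for this illustration and is not part of any OpenLDAP tool.

```shell
#!/bin/sh
# Sketch only: assumes the value is encoded as major*10000 + minor*100 + patch.
# decode_vendor_version is a hypothetical helper, not an OpenLDAP command.
decode_vendor_version() {
    v=$1
    echo "$((v / 10000)).$((v / 100 % 100)).$((v % 100))"
}

decode_vendor_version 20440   # value reported for the library
decode_vendor_version 20445   # value reported for the header
```

If that assumption holds, the message would mean that the client was built against 2.4.45 headers but loaded a 2.4.40 libldap at run time.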
Google leads me to these threads:
https://bbs.archlinux.org/viewtopic.php?id=56872
http://www.openldap.org/lists/openldap-software/200307/msg00399.html


This happens even though I removed all the previous packages.


Using ldd:
```
ldd `which ldapadd`
         linux-vdso.so.1 (0x00007ffe19765000)
         libldap-2.4.so.2 => /usr/lib/x86_64-linux-gnu/libldap-2.4.so.2 (0x00007fef25b0d000)
         liblber-2.4.so.2 => /usr/lib/x86_64-linux-gnu/liblber-2.4.so.2 (0x00007fef258fe000)
         libsasl2.so.2 => /usr/lib/x86_64-linux-gnu/libsasl2.so.2 (0x00007fef256e1000)
         libcrypt.so.1 => /lib/x86_64-linux-gnu/libcrypt.so.1 (0x00007fef254aa000)
         libresolv.so.2 => /lib/x86_64-linux-gnu/libresolv.so.2 (0x00007fef25293000)
         libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fef24ee7000)
         libgnutls-deb0.so.28 => /usr/lib/x86_64-linux-gnu/libgnutls-deb0.so.28 (0x00007fef24bc7000)
         libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007fef249aa000)
         libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007fef247a5000)
         /lib64/ld-linux-x86-64.so.2 (0x000055776c6be000)
         libz.so.1 => /lib/x86_64-linux-gnu/libz.so.1 (0x00007fef2458a000)
         libp11-kit.so.0 => /usr/lib/x86_64-linux-gnu/libp11-kit.so.0 (0x00007fef24344000)
         libtasn1.so.6 => /usr/lib/x86_64-linux-gnu/libtasn1.so.6 (0x00007fef2412f000)
         libnettle.so.4 => /usr/lib/x86_64-linux-gnu/libnettle.so.4 (0x00007fef23efd000)
         libhogweed.so.2 => /usr/lib/x86_64-linux-gnu/libhogweed.so.2 (0x00007fef23cce000)
         libgmp.so.10 => /usr/lib/x86_64-linux-gnu/libgmp.so.10 (0x00007fef23a4a000)
         libffi.so.6 => /usr/lib/x86_64-linux-gnu/libffi.so.6 (0x00007fef23842000)
```

With strace: http://paste.debian.net/971168/

And no result with:
```
for file in `ldd /usr/local/bin/ldapadd |awk '{print $3}'`; do echo $file && objdump -T $file |grep 2044; done
```
(and nothing in /var/log/slapd.log)

I should try again from a fresh install; that would be easier. 
Anyway, as you said, if a fresh install did not give you this bug, 
I think you can close it...

In the meantime, I am using two other servers for production. Since 
it works for me now (with hdb), I think I will remove the 2 others. This 
bug has already taken too much of my time.

Best regards,
Rémy

-- 
Rémy Dernat
Ingénieur d'Etudes
MBB/ISE-M