Re: syncrepl losing connection





--On October 7, 2009 10:53:27 AM +0200 Peter Mogensen <apm@mutex.dk> wrote:

Hi,

I found the reason that slapd was hanging at startup. It turned out to be
a schema that hadn't been properly replicated after being dynamically
added.
So now replication is actually moving entries. However, it seems to
constantly lose its connection (which may be why the schema sometimes fails
to replicate on load).

The setup is 2 mirrormode servers (slapd 2.4.17). Server 1 has the
database and is trying to replicate it to Server 2, which was empty from
the start.

I have syncrepl for both cn=config and the actual database, which means
that I should see 4 connections (2 each way) between servers 1 and 2. But
the last connection (server2->server1) seems to open and close
constantly.

On server 2 I repeatedly see:

Oct  7 09:47:14 s02 slapd[26723]: do_syncrepl: rid=001 rc -1 retrying
Oct  7 09:47:28 s02 slapd[26723]: do_syncrep2: rid=003 (-1) Can't contact LDAP server
Oct  7 09:47:28 s02 slapd[26723]: do_syncrepl: rid=003 rc -1 retrying
Oct  7 09:48:49 s02 slapd[26723]: do_syncrep2: rid=003 (-1) Can't contact LDAP server
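
To see which consumer (rid) is actually flapping, messages like the ones
above can be tallied per rid. What follows is only a rough sketch in Python,
not anything shipped with OpenLDAP: the function name summarize and the
SAMPLE list are made up for illustration, and the two regular expressions
assume the log lines have exactly the shape quoted above, each on a single
unwrapped line.

```python
import re
from collections import Counter

# These patterns cover only the two message shapes quoted in this thread;
# other slapd versions or log levels may format them differently.
RETRY_RE = re.compile(r"do_syncrepl: rid=(\d+) rc (-?\d+) retrying")
ERROR_RE = re.compile(r"do_syncrep2: rid=(\d+) \((-?\d+)\) (.+)$")


def summarize(lines):
    """Count retry messages per rid and error messages per (rid, text)."""
    retries = Counter()
    errors = Counter()
    for line in lines:
        line = line.rstrip("\n")
        match = RETRY_RE.search(line)
        if match:
            retries[match.group(1)] += 1
            continue
        match = ERROR_RE.search(line)
        if match:
            errors[(match.group(1), match.group(3))] += 1
    return retries, errors


# The four log lines quoted above, each on one line.
SAMPLE = [
    "Oct  7 09:47:14 s02 slapd[26723]: do_syncrepl: rid=001 rc -1 retrying",
    "Oct  7 09:47:28 s02 slapd[26723]: do_syncrep2: rid=003 (-1) Can't contact LDAP server",
    "Oct  7 09:47:28 s02 slapd[26723]: do_syncrepl: rid=003 rc -1 retrying",
    "Oct  7 09:48:49 s02 slapd[26723]: do_syncrep2: rid=003 (-1) Can't contact LDAP server",
]

if __name__ == "__main__":
    retries, errors = summarize(SAMPLE)
    for rid, count in sorted(retries.items()):
        print(f"rid={rid}: {count} retry message(s)")
    for (rid, text), count in sorted(errors.items()):
        print(f"rid={rid}: {count} x {text}")
```

Feeding it the real syslog file instead of SAMPLE (for example
summarize(open(path)) with whatever path your syslog uses) would show
whether the failures are spread across all rids or concentrated on one.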

When adding olcLogLevel: conns sync trace none, I see the log messages I
would expect, mixed with a lot of these:

Oct  7 10:41:52 s02 slapd[26723]: slap_sl_malloc of 48 bytes failed, using ch_malloc
Oct  7 10:41:52 s02 slapd[26723]: slap_sl_malloc of 40 bytes failed, using ch_malloc


... coming in bursts with a varying number of bytes. However, the machine
doesn't look like it's running out of memory.

Unable to malloc means your system is running out of memory.  That's bad.

--Quanah


--

Quanah Gibson-Mount
Principal Software Engineer
Zimbra, Inc
--------------------
Zimbra ::  the leader in open source messaging and collaboration