Re: OpenLDAP syncrepl woes



Jeffrey Crawford wrote:
On Wed, Nov 16, 2011 at 1:27 PM, Howard Chu <hyc@symas.com
<mailto:hyc@symas.com>> wrote:
    Jeffrey Crawford wrote:
        On Wed, Nov 16, 2011 at 7:40 AM, Jeffrey Crawford<jeffreyc@ucsc.edu
        <mailto:jeffreyc@ucsc.edu>>  wrote:
            On Wed, Nov 16, 2011 at 12:09 AM, Howard Chu<hyc@symas.com
            <mailto:hyc@symas.com>>  wrote:
                Jeffrey Crawford wrote:
                    I'm trying to stabilize our openldap server farm before
                    going live and
                    am finding that despite the contextCSN matching between
                    providers and
                    replicas, the actual content of the server is getting out
                    of sync.
                    This is most prominent when we are testing our population
                    routine and
                    we need to remove all accounts before starting. Right now
                    it's only
                    about 22000 entries (it will get much larger).

                    During the mass delete we got the following sprinkled
                    throughout the
                    logs on all machines:
                    ====
                    Nov 15 15:47:16 idm-prod-ldap-2 slapd[33070]:
                    bdb(dc=domain,dc=name):
                    previous transaction deadlock return not resolved


                Wow. I've never seen this error message before. What version
                of OpenLDAP and
                BerkeleyDB are you using?


            FreeBSD 8.2 with OpenLDAP 2.4.26. However, like I mentioned before,
            I think we are squeezing RAM right now. Part of this deployment
            was to discover how much RAM we needed on the virtual machine,
            and it was started pretty low.


        Oh and we are using bdb 4.6 right now (forgot to answer that)


    Running out of memory would cause an obvious error message ("no memory")
    so that's not likely to be the problem here. Might be worth upgrading to
    at least BDB 4.8, but again, never having seen BDB spit out that error
    before, that's just a guess.

Not sure if this is significant, but I've been noticing that this error only
shows up on deletes. However, it also shows up on deletes on the machine I'm
running the ldapdelete against, so perhaps this is more of a software issue.
I'll go ahead and run this with more RAM, and I'll check with the sysadmin
whether they can compile it against BDB 4.8 to see if that changes anything.
But I don't think ITS#7052 applies here, because the machine I'm doing this
against does not use syncrepl; it's the provider to others.

This is a machine on a VM. Are there any known issues with that?

Way back in the dawn of time, there were some VMware implementations that didn't support mutexes correctly. I don't think that's been an issue for many years. There ought to be other error messages in your log, immediately preceding the one you quoted. Post those too.
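
For pulling the preceding lines out of a large syslog file, something along these lines might help. It is only a rough sketch, not part of slapd or BDB: the function names, the default of five context lines, and the log path in the usage note are all assumptions for illustration. It keeps a rolling window of recent lines and emits that window whenever the deadlock message turns up.

```python
from collections import deque

# Substring of the message quoted earlier in this thread.
NEEDLE = "previous transaction deadlock return not resolved"


def context_before(lines, needle=NEEDLE, n=5):
    """Yield (preceding_lines, matching_line) for each line containing needle.

    `n` is the number of preceding lines to keep (hypothetical default).
    """
    recent = deque(maxlen=n)  # rolling window of the last n lines seen
    for line in lines:
        line = line.rstrip("\n")
        if needle in line:
            yield list(recent), line
        recent.append(line)


def dump_context(path, n=5):
    """Print each match with its preceding lines; `path` is your log file."""
    with open(path, errors="replace") as fh:
        for before, hit in context_before(fh, n=n):
            print("\n".join(before + [hit]))
            print("----")
```

Calling `dump_context("/var/log/debug.log", n=20)` would print each occurrence with the 20 lines before it; the path is a placeholder, so substitute wherever your syslog sends slapd output.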

--
  -- Howard Chu
  CTO, Symas Corp.           http://www.symas.com
  Director, Highland Sun     http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/