Lock is no longer valid / deferring operation



Hi all,

For largely historical reasons we run slapd servers on most clients
(this will probably change in the future - I'm just giving this
information as background).  We're seeing problems when some of these
machines are busy, particularly, it seems, with memory-intensive
activity.  That's hard to substantiate, though, as I generally only see
the machines after they've broken, and annoyingly I can't reproduce
the problems.

We see quite a few problems with slapd getting into a state where it's
deferring operations.  I think I understand these: slapd is basically
saying "sorry, I'm too busy doing X, so I'll defer Y until I have
time".  Is this accurate?

The second problem I'm seeing is bdb complaining about locks no
longer being valid, e.g.

slapd[3780]: bdb(dc=inf,dc=ed,dc=ac,dc=uk): DB_LOCK->lock_put: Lock is no longer valid

slapd seems to keep going for a while, until it gets into a state
where it defers all bind operations and goes into some kind of spin,
sitting at 99% CPU until it is killed with kill -9.

I have a couple of questions about the "Lock is no longer valid"
error:

- What causes it?
- Is it something I can prevent by configuration changes (for
  instance, would increasing the numbers of locks, lockers and objects
  help?)
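
To be concrete, the sort of change I have in mind would be along these
lines.  The three directives are the standard Berkeley DB lock-region
settings; the values are arbitrary placeholders for illustration, not
something we have tested and not a recommendation:

# placeholder values only - untested
dbconfig      set_lk_max_locks 3000
dbconfig      set_lk_max_lockers 1500
dbconfig      set_lk_max_objects 1500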

We're running OpenLDAP 2.3.35 with the ITS#4924 and ITS#4925 patches,
using the bdb backend on Berkeley DB 4.2.52 with all 6 recommended
patches.

The only DB_CONFIG settings we currently have are:

dbconfig      set_cachesize 0 67108864 1
dbconfig      set_lg_regionmax 262144
dbconfig      set_lg_bsize 2097152

Thanks in advance
Toby Blake
School of Informatics
University of Edinburgh