
Re: index corruption problem (ITS#1349)

Markus.Storm@mediaWays.net wrote:
> masarati@aero.polimi.it wrote:
> >
> > Markus.Storm@mediaWays.net wrote:
> > >
> > > Some more info:
> > >
> > > I think we tracked it down to a script of ours that does a
> > > modrdn on "uid=testuser". At times there are multiple instances
> > > of this script running.
> > >
> > > Is modrdn thread-safe?
> >
> > It should be. One problem with that function is that it is not
> > transactional: a failure partway through its sequence of operations
> > will leave the dn and the index spaces in a dangling state. I've been
> > working on that for a while, but as the problem will be outdated
> > by back-bdb and modrdn should be considered an administrative
> > operation, to be performed with extra care, I didn't go too deep
> > into it. Can you detail to what extent the multiple instances of
> > the entry coexist? I mean, do they show up when looking for
> > a(n indexed) value they share or when dumping (ldapsearch or slapcat)
> > the db?
> >
> All existing entries always showed up when querying for the value they
> share. None of them was ever missing.
> But this query was slow. It was about the same speed as if querying for
> a non-indexed attribute (sorry I didn't trace slapd at the time so I
> cannot say for sure whether it was searching id2entry).
> Another strange occurrence was the following:
> All the entries that share this value live in different subtrees, i.e.
> there's just one entry with this value per subtree.
> We were querying for the shared value, but using the subtree root as a
> search base, so there should have only been one result.
> Instead, slapd returned this entry (but none of those in the other subtrees)
> multiple times.

I noticed this happen once; the only explanation I found,
which was then fixed in 2.0.12 (I think), was a malformed check
on the return status of some low-level ldbm routines. This could
have led to multiple creations of the same entry, but now, apart
from threading problems I'm unaware of, it should be fixed.

> Sorry I cannot provide you with any traces. We didn't have the time to
> analyze things when they happened.
> Maybe writing a script that does concurrent MODRDNs can reproduce
> our situation?

As I wrote to you, there is a chance the indices get polluted,
including dn2id and id2entry, I guess, if the modify operation
following the rdn modification fails. (When the rdn is modified,
the values used in the new rdn are also added to the related
attributes in the entry, and a special flag, deleteoldrdn, may also
require the deletion of the old values.)
This operation may fail for a number of reasons, including acl
checking, schema checking and so on. This is what I mean when I say
the operation should be considered unsafe: there's no
going back, and the entry (and the index files) are left in a
dangling state.
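
To make the failure mode concrete, here is a minimal toy sketch in
Python. It is not the back-ldbm code: the ToyDirectory class, its
dict-based dn2id/id2entry/uid index, and the fail switch are all
assumptions made up for illustration. It only shows how a rename
followed by a failing modify leaves the stores disagreeing, and how
undoing the first step on failure avoids that.

```python
class SchemaError(Exception):
    """Stands in for any failure of the follow-up modify (acl, schema, ...)."""


class ToyDirectory:
    """Hypothetical in-memory model: dn2id, id2entry and one index on 'uid'."""

    def __init__(self):
        self.dn2id = {}
        self.id2entry = {}
        self.uid_index = {}  # uid value -> set of entry ids
        self._next_id = 1

    def add(self, dn, uid):
        eid = self._next_id
        self._next_id += 1
        self.dn2id[dn] = eid
        self.id2entry[eid] = {"dn": dn, "uid": uid}
        self.uid_index.setdefault(uid, set()).add(eid)
        return eid

    def _modify_uid(self, eid, new_uid, fail):
        # The modify step that follows the rename; 'fail' simulates a rejection.
        if fail:
            raise SchemaError("modify rejected")
        entry = self.id2entry[eid]
        self.uid_index[entry["uid"]].discard(eid)
        entry["uid"] = new_uid
        self.uid_index.setdefault(new_uid, set()).add(eid)

    def modrdn_unsafe(self, old_dn, new_dn, new_uid, fail=False):
        # Step 1: rename. Step 2: modify. Nothing is undone if step 2 raises.
        eid = self.dn2id.pop(old_dn)
        self.dn2id[new_dn] = eid
        self.id2entry[eid]["dn"] = new_dn
        self._modify_uid(eid, new_uid, fail)

    def modrdn_with_rollback(self, old_dn, new_dn, new_uid, fail=False):
        # Same two steps, but the rename is reverted when the modify fails.
        eid = self.dn2id.pop(old_dn)
        self.dn2id[new_dn] = eid
        self.id2entry[eid]["dn"] = new_dn
        try:
            self._modify_uid(eid, new_uid, fail)
        except SchemaError:
            del self.dn2id[new_dn]
            self.dn2id[old_dn] = eid
            self.id2entry[eid]["dn"] = old_dn
            raise

    def consistent(self):
        # The rdn value, the entry's uid and the uid index must all agree.
        for dn, eid in self.dn2id.items():
            entry = self.id2entry[eid]
            rdn_value = dn.split(",")[0].split("=", 1)[1]
            if entry["dn"] != dn or entry["uid"] != rdn_value:
                return False
            if eid not in self.uid_index.get(entry["uid"], set()):
                return False
        return True
```

Calling modrdn_unsafe(..., fail=True) on a fresh directory leaves
consistent() returning False (the dn says one thing, the entry and
the index another), while modrdn_with_rollback(..., fail=True)
leaves it True.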

Note that concurrent modrdn calls are "nonsense", because
as soon as one succeeds, the following ones will not find
the entry they try to modify.
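
A small Python simulation of that point, under the assumption that
the server serializes renames of a given entry (modelled here by a
single lock). The dn2id dict, the rename helper and the DNs are
hypothetical stand-ins, not slapd internals: of several concurrent
renames of the same dn, one succeeds and the rest find no entry.

```python
import threading


class NoSuchObject(Exception):
    """Raised when the dn to rename is no longer present."""


OLD_DN = "uid=testuser,ou=a,dc=example"
dn2id = {OLD_DN: 1}
lock = threading.Lock()   # assumed serialization of operations on the entry
results = []


def rename(old_dn, new_dn):
    # Look up and move the entry as one serialized step.
    with lock:
        if old_dn not in dn2id:
            raise NoSuchObject(old_dn)
        dn2id[new_dn] = dn2id.pop(old_dn)


def worker(n):
    try:
        rename(OLD_DN, f"uid=renamed{n},ou=a,dc=example")
        results.append("success")
    except NoSuchObject:
        results.append("noSuchObject")


threads = [threading.Thread(target=worker, args=(i,)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Which worker wins depends on scheduling, but only one of them can
find the old dn; the others see it as already gone.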


Dr. Pierangelo Masarati		mailto:pierangelo.masarati@sys-net.it
LDAP Architect, SysNet s.n.c.	http://www.sys-net.it