
Re: index corruption problem (ITS#1349)



Markus.Storm@mediaWays.net wrote:
> 
> masarati@aero.polimi.it wrote:
> >
> > Markus.Storm@mediaWays.net wrote:
> > >
> > > Some more info:
> > >
> > > I think we tracked it down to a script of ours that does a
> > > modrdn on "uid=testuser". At times there are multiple instances
> > > of this script running.
> > >
> > > Is modrdn thread-safe?
> >
> > It should be. One problem with that function is that it is not transactional.
> > This means that a failure between a certain set of operations will
> > leave the dn and the index spaces in a dangling state. I've been
> > working on that for a while, but as the problem will be outdated
> > by back-bdb and modrdn should be considered an administrative
> > operation, to be performed with extra care, I didn't go too deep
> > into it. Can you detail to what extent the multiple instances of
> > the entry coexist? I mean, do they show up when looking for
> > a(n indexed) value they share or when dumping (ldapsearch or slapcat)
> > the db?
> >
> 
> All existing entries always showed up when querying for the value they
> share. None of them was ever missing.
> But this query was slow. It was about the same speed as if querying for
> a non-indexed attribute (sorry I didn't trace slapd at the time so I
> cannot say for sure whether it was searching id2entry).
> 
> Another strange occurrence was the following:
> All the entries that share this value live in different subtrees, i.e.
> there's just one entry with this value per subtree.
> We were querying for the shared value, but using the subtree root as a
> search base, so there should have only been one result.
> Instead, slapd returned this entry (but none of those in the other subtrees)
> multiple times.

I saw this happen once. The only explanation I found, which was
fixed in 2.0.12 (I think), was a malformed check on the return
status of some low-level ldbm routines. This could have led to
the same entry being created multiple times; apart from threading
problems I'm unaware of, it should now be fixed.

> Sorry I cannot provide you with any traces. We didn't have the time to
> analyze things when they happened.
> Maybe writing a script that does concurrent MODRDNs can reproduce
> our situation ?

As I wrote to you, there is a chance that the indices, including
dn2id and id2entry, I guess, get polluted if the modify operation
that follows the rdn modification fails. (When the rdn is modified,
the values used in the new rdn are also added to the corresponding
attributes in the entry, and a special flag may additionally
require the deletion of the old values.)
This modify may fail for a number of reasons, including acl
checking, schema checking and so on. This is what I mean when I say
the operation should be considered unsafe: there is no going back,
and the entry (and the index files) are left in a dangling state.
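
To make the failure mode above concrete, here is a toy sketch in
Python. It is not slapd code: plain dicts stand in for dn2id and
id2entry, and the names toy_modrdn, ModifyFailed and fail_modify are
made up for illustration. It only shows how a rename done in two
steps with no rollback can leave the two tables disagreeing.

```python
# Toy model only: dicts stand in for the dn2id and id2entry databases.
class ModifyFailed(Exception):
    """Hypothetical error for a rejected modify (ACL, schema, ...)."""


def toy_modrdn(dn2id, id2entry, old_dn, new_rdn, fail_modify=False):
    """Rename old_dn in two separate steps, with no rollback between them."""
    # Step 1: move the DN in the dn2id table.
    entry_id = dn2id.pop(old_dn)
    parent = old_dn.split(",", 1)[1]
    new_dn = new_rdn + "," + parent
    dn2id[new_dn] = entry_id

    # Step 2: update the entry itself; this is the step that may fail.
    if fail_modify:
        raise ModifyFailed("modify after rename was rejected")
    attr, value = new_rdn.split("=", 1)
    entry = id2entry[entry_id]
    entry["dn"] = new_dn
    entry[attr] = [value]
    return new_dn


OLD_DN = "uid=testuser,ou=a,dc=example,dc=com"
dn2id = {OLD_DN: 1}
id2entry = {1: {"dn": OLD_DN, "uid": ["testuser"]}}

try:
    toy_modrdn(dn2id, id2entry, OLD_DN, "uid=renamed", fail_modify=True)
except ModifyFailed:
    pass  # nothing undoes step 1

# dn2id now knows only the new DN, while the stored entry still
# carries the old DN and the old uid value.
print(sorted(dn2id))
print(id2entry[1]["dn"], id2entry[1]["uid"])
```

A transactional backend would undo step 1 when step 2 fails; the
sketch deliberately does not.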

Note that concurrent modrdn calls are "nonsense", because
as soon as one succeeds, the following ones will not find
the entry they try to modify.
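
A small simulation of that point, again as a toy sketch and not a
real LDAP client: ToyDirectory and race are hypothetical names, and
the lock models an idealized backend that serializes each rename.
Under that assumption, exactly one of the competing calls succeeds
and the rest report noSuchObject.

```python
import threading


class ToyDirectory:
    """In-memory stand-in for a directory; not a real LDAP server."""

    def __init__(self, dns):
        self._dns = set(dns)
        self._lock = threading.Lock()  # assumption: renames are serialized

    def modrdn(self, old_dn, new_rdn):
        with self._lock:
            if old_dn not in self._dns:
                return "noSuchObject"
            parent = old_dn.split(",", 1)[1]
            self._dns.remove(old_dn)
            self._dns.add(new_rdn + "," + parent)
            return "success"


def race(directory, old_dn, workers=8):
    """Start several threads that all try to rename the same entry."""
    results = []
    results_lock = threading.Lock()

    def worker(n):
        outcome = directory.modrdn(old_dn, "uid=renamed%d" % n)
        with results_lock:
            results.append(outcome)

    threads = [threading.Thread(target=worker, args=(n,))
               for n in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


directory = ToyDirectory(["uid=testuser,ou=a,dc=example,dc=com"])
results = race(directory, "uid=testuser,ou=a,dc=example,dc=com")
print(results.count("success"), results.count("noSuchObject"))
```

Which worker wins is not deterministic; only the counts are.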

Pierangelo.


-- 
Dr. Pierangelo Masarati		mailto:pierangelo.masarati@sys-net.it
LDAP Architect, SysNet s.n.c.	http://www.sys-net.it