Re: So I finally upgraded from slurpd...

> On Tue, 20 Jul 2010, masarati@aero.polimi.it wrote:
>> > It turned out that the object cn=admin,dc=foo,dc=no had multiple
>> > occurrences of "objectClass: organizationalRole" (!), and this also
>> > prevented syncrepl from working. I suspect it was a result of "manual"
>> > editing of ldif files followed by an import using slapadd. I get no
>> > warnings from slapadd when I import objects with multiple
>> > occurrences of the same objectClass.
>> >
>> > Perhaps slapadd/slapd should be able to deal with such duplicate
>> > entries better, to make it more obvious what's wrong? I'm just saying
>> > :)
>> slapd(8) can handle those occurrences.
> But does it handle it well enough, when it prevents syncrepl from working?

This is a side effect: the replica receives bogus data via the protocol,
and rejects it.

>> slapadd(8) is intended to load LDIF files generated by slapcat(8), thus
>> presumably consistent.
> And the file was indeed LDIF file generated by slapcat.

I mean: an LDIF generated by slapcat from a sane database.

> Since slapd allows
> it, slapcat will also spit it out - when slapcat, slapadd and slapd all
> "handle it" without giving any warnings back to anyone, it's not so easy
> to detect errors.

No, you are missing one link: slapd never handled it, in the sense of
accepting it through the protocol.  When slapd starts up and opens a
database, it does not validate the contents, of course.  Nor does it
validate an entry's contents when it returns that entry.  Contents are
validated only when a write is performed (and, for a modify, usually only
the part that's being written).

>> In general, it deals with the most obvious errors.  I don't think asking
>> slapadd to perform these checks is a good idea, as it would slow it down
>> without real benefit: if an error is caught, you would need to restart,
>> wasting all the actual write effort.
> I don't quite agree - as I understand it slapadd already does some sanity
> checking, how much overhead would a check for objectClass doublets imply?

Why don't you code and test it yourself?  Checking for duplicates requires
normalizing the data and comparing each value with every other one.  Even
a careful pairwise implementation has quadratic cost (n*(n-1)/2
comparisons).  You were bitten by a duplicate objectClass this time.  If
next time it happens to a group with 10,000 members, you'll be whining
that your groups are perfectly sane, so why does it take so long to load
your LDIF?
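
To put a number on that cost, here is a minimal Python sketch of the
pairwise check.  It is not slapadd code: the function names are made up for
illustration, and normalize() is only a stand-in for the schema-aware,
per-attribute normalization a real server would apply.

```python
from itertools import combinations


def normalize(value):
    # Hypothetical stand-in: collapse whitespace and fold case.
    # Real normalization depends on each attribute's matching rule.
    return " ".join(value.split()).lower()


def find_duplicates_pairwise(values):
    """Compare every pair once; return (duplicate index pairs, comparisons)."""
    normalized = [normalize(v) for v in values]
    comparisons = 0
    duplicates = []
    for (i, a), (j, b) in combinations(enumerate(normalized), 2):
        comparisons += 1
        if a == b:
            duplicates.append((i, j))
    return duplicates, comparisons


dups, count = find_duplicates_pairwise(
    ["top", "organizationalRole", "OrganizationalRole"]
)
print(dups, count)  # [(1, 2)] 3
```

With three values that is 3 comparisons; for a group with 10,000 members
the same formula gives 10,000 * 9,999 / 2 = 49,995,000.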

> And I don't see why you would need to restart, on a doublet either spit out
> a warning, or even better - spit out a warning and discard the doublet.

Those are implementation details.  In many cases the database needs to be
complete, with no holes, so if slapadd discards an entry it may not be
able to add that entry's children.
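
A toy Python model of the "no holes" problem, assuming a loader that
refuses any entry whose parent is missing.  The load() and parent_dn()
helpers are hypothetical and say nothing about how slapadd or any
particular backend is actually implemented.

```python
def parent_dn(dn):
    # Naive split on the first comma; real DNs may contain escaped commas.
    head, sep, tail = dn.partition(",")
    return tail if sep else None


def load(entries, suffix, is_valid):
    """Simulate a load that discards invalid entries.

    Returns (loaded DNs, DNs that could not be added for lack of a parent).
    """
    loaded = set()
    orphaned = []
    for dn in entries:
        if not is_valid(dn):
            continue  # "warn and discard", as proposed
        if dn != suffix and parent_dn(dn) not in loaded:
            orphaned.append(dn)  # parent was discarded: hole in the tree
            continue
        loaded.add(dn)
    return loaded, orphaned


entries = [
    "dc=foo,dc=no",
    "ou=people,dc=foo,dc=no",
    "uid=a,ou=people,dc=foo,dc=no",
]
loaded, orphaned = load(
    entries,
    suffix="dc=foo,dc=no",
    is_valid=lambda dn: dn != "ou=people,dc=foo,dc=no",
)
print(orphaned)  # ['uid=a,ou=people,dc=foo,dc=no']
```

Discarding one bad entry here silently costs you every entry below it,
which is why "just skip it" is not as harmless as it sounds.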

>> A sanity check tool for unreliable LDIF would probably be more
>> appropriate.  I guess at this point most users would pretend their LDIF
>> is always reliable, and avoid running the sanity checker...
> Really? Yes, I would love a sanity checker, and I would most likely
> _always_ run LDIF through a sanity checker before using slapadd to write
> to back-end.
> But again - slapadd already does some sanity checking,

Usually, only as much as is strictly required to perform its own task
properly: regenerating a presumably sane database.

> and there's even a
> flag for "dry-run" mode (-u) which IMO says that it is supposed to be used
> as a sanity checking tool. I'm perfectly OK to let _all_ sanity checks
> only occur when using -u.

Embedding the sanity checker in slapadd is indeed an option, though not as
the default, IMHO.
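
For what it's worth, a standalone checker for this particular problem does
not need to be large.  Below is a rough Python sketch, not an existing
OpenLDAP tool: find_duplicate_values() is a made-up name, the parser only
handles plain and base64 ("::") values plus folded lines, and it assumes
case-insensitive matching for every attribute, which is right for
objectClass but wrong for case-exact syntaxes.

```python
import base64
from collections import defaultdict


def unfold(text):
    """Join LDIF continuation lines (those starting with a single space)."""
    lines = []
    for raw in text.splitlines():
        if raw.startswith(" ") and lines:
            lines[-1] += raw[1:]
        else:
            lines.append(raw)
    return lines


def find_duplicate_values(text):
    """Return (dn, attribute, value) for each repeated value in an entry."""
    report = []
    dn, seen = None, defaultdict(set)
    for line in unfold(text):
        if not line.strip():
            dn, seen = None, defaultdict(set)  # blank line ends the entry
            continue
        if line.startswith("#"):
            continue  # comment
        attr, sep, rest = line.partition(":")
        if not sep:
            continue  # not an "attr: value" line; ignored in this sketch
        if rest.startswith(":"):
            value = base64.b64decode(rest[1:].strip()).decode("utf-8", "replace")
        else:
            value = rest.strip()
        key = attr.strip().lower()
        if key == "dn":
            dn = value
            continue
        folded = value.lower()  # assumption: case-insensitive comparison
        if folded in seen[key]:
            report.append((dn, attr.strip(), value))
        seen[key].add(folded)
    return report


sample = """dn: cn=admin,dc=foo,dc=no
objectClass: organizationalRole
objectClass: organizationalRole
cn: admin

dn: cn=other,dc=foo,dc=no
objectClass: organizationalRole
cn: other
"""
for dn, attr, value in find_duplicate_values(sample):
    print(f"{dn}: duplicate {attr} value {value!r}")
```

Because it remembers the values already seen in each entry instead of
comparing pairs, this stays cheap even on large groups, but only under the
simplified matching assumption stated above.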

> I would love to dump all my ldap data to an LDIF and run it through a
> sanity checker, I suspect there's more "old noise" stuck in there.

Task separation is at the roots of clean programming - and system