
RE: Slapindex corrupts BDB (ITS#2499)



> -----Original Message-----
> From: owner-openldap-bugs@OpenLDAP.org
> [mailto:owner-openldap-bugs@OpenLDAP.org] On Behalf Of quanah@stanford.edu

> Full_Name: Quanah Gibson-Mount
> Version: 2.1.18+indexing patch
> OS: Solaris 8
> URL: ftp://ftp.openldap.org/incoming/
> Submission from: (NULL) (171.64.19.82)

> Hello,
>
> Yesterday, I modified my slapd.conf to index one new attribute, and to also
> add a substr index to another attribute.  I then stopped slapd, and ran
> 'time slapindex'.  I left it running for 6.5 hours (note that dumping my DB
> and then loading it takes less than 4 hours), at some point during which it
> appeared to go into some sort of spin, as it hadn't touched any of the *.bdb
> files in a long time, and had added over 500MB in size to my __db.* files
> (as compared to a DB in which I did slapcat/slapadd with the same indices).
> I finally killed it, as it was apparent it was getting nowhere.  After it
> was killed, the DB was left in a state where db_recover could not recover
> it, and a slapcat dumped a small portion of the DB and then hung.

It is normal for the *.bdb files' timestamps to remain unchanged, since
changes are first applied to the BDB cache. (This is file __db.002 in the BDB
environment.) It is also normal for the cache file to grow, since it is
created as a sparse file and the updates fill it up over time.

The extremely slow runtime is due to thrashing disk activity when the BDB
environment resides on the same disk as the database files. BDB always reads
pages from the database files into the cache before working on them. Since
the cache is on disk, this equates to a read and a write to the same disk for
each data page.

Normally, if the cache is already fully populated, or new data is being
created, this excess I/O does not occur. So when you're running slapadd,
which creates the data, the new data is written to the cache, and nothing
else happens to the actual index database files until a checkpoint is
performed. When the cache has been primed in this manner, slapindex shouldn't
cause any excess I/O either: most of the entry data it needs will already be
in the BDB cache, as will any existing index data. But if you've recently
done a db_recover, the cache is reset, so every page that is accessed must
first be copied from the underlying database files into the cache before the
operation can complete.
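
To make the difference concrete, here is a rough toy model in Python. It is
only a sketch and not BDB code: the function name count_disk_ops and the page
numbers are made up for illustration, and it assumes that each cache miss
costs exactly one read from a database file plus one write into the on-disk
cache region, ignoring eviction and checkpoints.

```python
def count_disk_ops(page_accesses, cached_pages=()):
    """Toy model: count disk operations for a sequence of page accesses.

    Assumption: the cache region is a file on the same disk as the
    database, so a miss costs one read (from the database file) plus one
    write (into the cache file). Pages already cached cost nothing here.
    """
    cache = set(cached_pages)
    reads = writes = 0
    for page in page_accesses:
        if page not in cache:
            reads += 1    # fetch the page from the database file
            writes += 1   # store it in the on-disk cache region
            cache.add(page)
    return reads, writes


# Hypothetical access pattern: pages 1-3 are touched twice, page 4 once.
accesses = [1, 2, 3, 1, 2, 3, 4]

cold = count_disk_ops(accesses)                             # cache just reset
primed = count_disk_ops(accesses, cached_pages={1, 2, 3, 4})  # cache primed

print("cold cache (reads, writes):", cold)
print("primed cache (reads, writes):", primed)
```

In this model the cold run pays a read and a write for every distinct page,
while the primed run pays nothing, which is the contrast between running
slapindex right after a db_recover and running it right after a slapadd.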

Since I found no evidence that slapindex itself was causing any data
corruption, this ITS will be closed.

  -- Howard Chu
  Chief Architect, Symas Corp.       Director, Highland Sun
  http://www.symas.com               http://highlandsun.com/hyc
  Symas: Premier OpenSource Development and Support