Re: bdb 4.6 issue with 2.4.12



On Friday 17 October 2008 21:30:04 Quanah Gibson-Mount wrote:
> --On Friday, October 17, 2008 8:28 PM +0200 Guillaume Rousse <Guillaume.Rousse@inria.fr> wrote:
> > Quanah Gibson-Mount wrote:
> >> --On Friday, October 17, 2008 4:22 PM +0200 Guillaume Rousse <Guillaume.Rousse@inria.fr> wrote:
> >>> Since I upgraded one of my servers from 2.4.11 to 2.4.12, I'm facing
> >>> heavy database issues:
> >>> [root@etoile ~]# slapcat -b dc=msr-inria,dc=inria,dc=fr
> >>> ...
> >>> bdb(dc=msr-inria,dc=inria,dc=fr): pthread lock failed: Invalid argument
> >>> bdb(dc=msr-inria,dc=inria,dc=fr): PANIC: Invalid argument
> >>> bdb(dc=msr-inria,dc=inria,dc=fr): PANIC: DB_RUNRECOVERY: Fatal error,
> >>> run database recovery
> >>> bdb(dc=msr-inria,dc=inria,dc=fr): PANIC: fatal region error detected;
> >>> run recovery
> >>> bdb_db_close: database "dc=msr-inria,dc=inria,dc=fr": close failed:
> >>> DB_RUNRECOVERY: Fatal error, run database recovery (-30975)
> >>>
> >>> Even importing a backup LDIF file on a fresh installation triggers the
> >>> same problems.

I did some basic testing (details are on the Mandriva bug report): the tools
leave the database in a bad state, but slapd itself does not. Running recovery
(e.g. between slapcat runs) works, and slapd startup will also recover the db.
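
A minimal sketch of the "recovery between slapcat runs" workaround follows. The
database directory and the binary names are assumptions and not taken from this
thread: distributions may install the Berkeley DB recovery utility under a
versioned name, and the environment directory depends on the slapd
configuration. Recovery should only be run while slapd is stopped.

```shell
# Assumed defaults (hypothetical, adjust for the local installation).
DB_DIR="${DB_DIR:-/var/lib/ldap}"        # BDB environment directory
DB_RECOVER="${DB_RECOVER:-db_recover}"   # Berkeley DB recovery utility
SLAPCAT="${SLAPCAT:-slapcat}"            # OpenLDAP dump tool

# Dump one suffix to an LDIF file, then run recovery on the environment so
# that the next tool invocation starts from a clean state.
# Returns slapcat's exit status, or 2 if recovery itself failed.
dump_then_recover() {
    suffix="$1"
    outfile="$2"
    dump_status=0
    "$SLAPCAT" -b "$suffix" -l "$outfile" || dump_status=$?
    # Recover regardless of how the dump exited.
    "$DB_RECOVER" -h "$DB_DIR" || return 2
    return "$dump_status"
}
```

Usage would look like `dump_then_recover dc=msr-inria,dc=inria,dc=fr backup.ldif`.
This only works around the symptom; it does not explain why the tools leave the
environment in a bad state.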

> >>>
> >>> I tested this problem on two different environments (Mandriva 2008.1,
> >>> Mandriva cooker), and one user reported it against Mandriva 2009.0
> >>> (https://qa.mandriva.com/show_bug.cgi?id=45034). This seems to imply
> >>> either an OpenLDAP or a packaging issue.

IMHO this is an OpenLDAP issue. Note that all tests pass with these binaries
(they must have passed during the build anyway for the packages to have made it
through the build system), but none of the tests check whether the tools leave
a db in a bad state ...
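
A check of the kind that is missing could look roughly like the sketch below.
It is not part of the OpenLDAP test suite; the function name is hypothetical,
and whether slapcat exits non-zero in this failure mode is not established
here, which is why the sketch also looks for the panic messages quoted above.

```shell
SLAPCAT="${SLAPCAT:-slapcat}"   # assumed to be on PATH

# Run slapcat twice against the same suffix. Returns 0 only if both runs
# exit cleanly and neither prints a BDB panic / run-recovery message.
tool_leaves_db_clean() {
    suffix="$1"
    for attempt in 1 2; do
        # Capture stderr only; the LDIF on stdout is discarded.
        log=$("$SLAPCAT" -b "$suffix" 2>&1 >/dev/null) || return 1
        case "$log" in
            *DB_RUNRECOVERY*|*PANIC*) return 1 ;;
        esac
    done
    return 0
}
```

Running the tool a second time matters because a bad close in the first run
may only become visible when the environment is opened again.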

> >>> Should I report an ITS for
> >>> this, or rather provide more information?
> >>
> >> What options was BDB 4.6 compiled with?

The same package that was used for 2.4.11, which did not have *any* problems.

> >> Does it have all the patches
> >> from Oracle?

It has 4.6.21.1, but not the 4.6.21.2 patch (4.6.21.3 is irrelevant of 
course).

> >
> > According to the spec file, there is one oracle and two fedora patches
> > applied:
> >
> > http://svn.mandriva.com/cgi-bin/viewvc.cgi/packages/cooker/db46/current/SPECS/db46.spec?revision=293611&view=markup
> >
> > The exact option list used is a bit more difficult to tell, given the
> > usage of conditional build options, but it seems to be:
> > --enable-shared --enable-static --enable-rpc
> > --enable-cxx
> > --disable-posixmutexes --with-mutex=x86/gcc-assembly (or
> > --with-mutex=x86_64/gcc-assembly for x86_64).

build_asmmutex defaults to 0, so it's actually:

	--enable-shared --enable-static --enable-rpc --enable-cxx
	--with-mutex=POSIX/pthreads/library

> I'd suggest rebuilding BDB with:
>
> --enable-posixmutexes --with-mutex=POSIX/pthreads

That would improve performance, but it really shouldn't be what stops the slap
tools from closing the database uncleanly.

> and then rebuilding OpenLDAP against the new BDB build, and see if the
> problem persists.

I will enable the internal library copy option on a build of my OpenLDAP 
package, and build against both 4.6 and 4.7 ... but I suspect it will persist.

Regards,
Buchan