(ITS#8387) online olcDbConfig change fails with syncprov
Full_Name: Ryan Tandy
Version: 2.4, master
OS: Debian
URL:
Submission from: (NULL) (24.68.37.4)
Submitted by: ryan
Forwarding from a Debian bug report:
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=816294
Configure a BDB or HDB database with syncprov:
dn: olcDatabase={1}hdb,cn=config
objectClass: olcHdbConfig
olcDbDirectory: data
olcSuffix: dc=example,dc=com
dn: olcOverlay={0}syncprov,olcDatabase={1}hdb,cn=config
objectClass: olcSyncProvConfig
Then perform any modification to the database (so that a syncprov checkpoint
is pending), followed by an online olcDbConfig change that reopens the
database:
dn: olcDatabase={1}hdb,cn=config
changetype: modify
replace: olcDbConfig
olcDbConfig: set_cachesize 1 0 1
Reopening the database fails:
56e76e17 bdb(dc=example,dc=com): BDB4511 Error: closing the transaction region
with active transactions
56e76e17 bdb_db_close: database "dc=example,dc=com": close failed: Invalid
argument (22)
and slapd crashes shortly after, when it tries to syncprov_checkpoint while the
database is already gone.
What appears to be happening is that "ctx" is different between bdb_reader_get
and bdb_reader_flush in this case.
During a normal slapd startup and shutdown:
(gdb) thread apply all frame
Thread 1 (Thread 0x7ffff7fed700 (LWP 21570)):
#0 hdb_reader_get (op=0x7fffffffd8d0, env=0xa93fa0, txn=0x7fffffffd610) at
cache.c:1666
1666 if ( !ctx ) {
(gdb) p ctx
$1 = (void *) 0x8b34a0 <ldap_int_main_thrctx>
[ ... killall slapd ... ]
(gdb) thread apply all frame
Thread 1 (Thread 0x7ffff7fed700 (LWP 21570)):
#0 hdb_reader_flush (env=0xa93fa0) at cache.c:1643
1643 if ( !ldap_pvt_thread_pool_getkey( ctx, env, &data, NULL ) ) {
(gdb) p ctx
$2 = (void *) 0x8b34a0 <ldap_int_main_thrctx>
In this case, the readers are cleared correctly.
Another startup, this time the hdb_db_close is triggered by performing an
olcDbConfig change:
(gdb) thread apply all frame
Thread 1 (Thread 0x7ffff7fed700 (LWP 21624)):
#0 hdb_reader_get (op=0x7fffffffd8d0, env=0xa93fa0, txn=0x7fffffffd610) at
cache.c:1666
1666 if ( !ctx ) {
(gdb) p ctx
$1 = (void *) 0x8b34a0 <ldap_int_main_thrctx>
[ ... ldapmodify ... ]
(gdb) thread apply all frame
Thread 3 (Thread 0x7ffff362e700 (LWP 21633)):
#0 hdb_reader_flush (env=0xa93fa0) at cache.c:1643
1643 if ( !ldap_pvt_thread_pool_getkey( ctx, env, &data, NULL ) ) {
Thread 2 (Thread 0x7ffff3e2f700 (LWP 21631)):
#0 0x00007ffff732d4d3 in epoll_wait () at
../sysdeps/unix/syscall-template.S:84
84 ../sysdeps/unix/syscall-template.S: No such file or directory.
Thread 1 (Thread 0x7ffff7fed700 (LWP 21624)):
#0 0x00007ffff75f06dd in pthread_join (threadid=140737285125888,
thread_return=0x0) at pthread_join.c:90
90 pthread_join.c: No such file or directory.
(gdb) p ctx
$2 = (void *) 0x7ffff362dbf0
This time we have a different ctx, so the readers are not cleared. Thus, when
we get to db->close, there is still an active txn.
The comment "free up any keys used by the main thread" seems to assume
bdb_reader_flush will be called on the main thread only.
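
To illustrate the suspected mismatch, here is a minimal, self-contained model
of a per-thread key store. It is only a sketch and not the slapd code: thr_ctx,
reader_get, reader_flush and active_readers are hypothetical stand-ins, and the
fixed-size slot array is an assumption made for brevity. It shows that a reader
cached under one context cannot be found, and so is not released, when the
flush looks in another context.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_KEYS 4

/* Hypothetical per-thread context: a tiny key/value store. */
typedef struct thr_ctx {
    void *key[MAX_KEYS];
    void *val[MAX_KEYS];
} thr_ctx;

/* Find the slot holding `key` in this context, or -1 if absent. */
static int ctx_find(const thr_ctx *ctx, const void *key)
{
    assert(ctx != NULL);
    for (int i = 0; i < MAX_KEYS; i++) {
        if (key != NULL && ctx->key[i] == key)
            return i;
    }
    return -1;
}

/* Model of reader_get: cache a reader txn under `env` in the CALLER's
 * context. Returns 0 on success, -1 if the context has no free slot. */
static int reader_get(thr_ctx *ctx, void *env, void *txn)
{
    if (ctx_find(ctx, env) >= 0)
        return 0;                      /* already cached for this context */
    for (int i = 0; i < MAX_KEYS; i++) {
        if (ctx->key[i] == NULL) {
            ctx->key[i] = env;
            ctx->val[i] = txn;
            return 0;
        }
    }
    return -1;
}

/* Model of reader_flush: release the reader only if it is found in THIS
 * context. Returns 1 if a reader was released, 0 if none was found. */
static int reader_flush(thr_ctx *ctx, void *env)
{
    int i = ctx_find(ctx, env);
    if (i < 0)
        return 0;                      /* wrong context: nothing to release */
    ctx->key[i] = NULL;
    ctx->val[i] = NULL;
    return 1;
}

/* Count readers still cached in a context. */
static int active_readers(const thr_ctx *ctx)
{
    int n = 0;
    assert(ctx != NULL);
    for (int i = 0; i < MAX_KEYS; i++) {
        if (ctx->key[i] != NULL)
            n++;
    }
    return n;
}
```

In this model, calling reader_get with the main context and then reader_flush
with a different (worker) context leaves the reader in place, which corresponds
to the active txn seen at db->close. Flushing with the same context that
performed the get releases it.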