
Re: (ITS#4361) bdb autorecovery didn't



Apparently opening an environment with DB_JOINENV isn't safe if the 
environment has stale locks. This ties in with ITS#4362; we may just 
need to stop using the DB_JOINENV flag.

We use this flag to retrieve the flags that the existing environment 
was opened with. The main reason this matters is that the environment 
left by the slaptools' "quick" mode isn't recoverable, since none of the 
transaction logging options are set there. But BDB doesn't return an 
error if you try to recover in this situation; it just wipes out the old 
environment, and any data that was in the cache but not flushed out is 
discarded. We try to prevent slapd from running in this case, since the 
database is basically useless then, but the only way to detect it is 
with the DB_JOINENV flag.
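
To make the check described above concrete, here is a minimal sketch of 
the decision. It is not the actual back-bdb code. The EX_* constants, 
env_is_recoverable() and choose_startup() are hypothetical names made up 
for illustration, and the numeric values are not BDB's. Real code would 
include <db.h>, join the environment, read its flags with 
DB_ENV->get_open_flags(), and test DB_INIT_LOG and DB_INIT_TXN.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the Berkeley DB subsystem flags. Real code
 * would use DB_INIT_LOG, DB_INIT_TXN and DB_INIT_MPOOL from <db.h>; the
 * values below are illustrative only and are NOT the library's values. */
#define EX_INIT_LOG   0x1u
#define EX_INIT_TXN   0x2u
#define EX_INIT_MPOOL 0x4u

enum startup_action { START_NORMALLY, RUN_RECOVERY, REFUSE_TO_START };

/* Return 1 if the environment was opened with both the logging and the
 * transaction subsystems, which recovery depends on; 0 otherwise. */
int env_is_recoverable(uint32_t env_flags)
{
    const uint32_t needed = EX_INIT_LOG | EX_INIT_TXN;
    return (env_flags & needed) == needed;
}

/* Assumed policy: an environment without transaction logging (such as
 * one left by "quick" mode) must never be handed to recovery, because
 * recovery would silently discard it. Refuse to start instead. */
enum startup_action choose_startup(uint32_t env_flags, int unclean_shutdown)
{
    if (!env_is_recoverable(env_flags))
        return REFUSE_TO_START;
    if (unclean_shutdown)
        return RUN_RECOVERY;
    return START_NORMALLY;
}
```

The point of separating the two functions is that the flag test is the 
only part that needs the joined environment; the rest is policy.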

richton@nbcs.rutgers.edu wrote:
> Full_Name: Aaron Richton
> Version: 2.3.18
> OS: Solaris 9
> URL: 
> Submission from: (NULL) (68.197.28.208)
>
>
> Following ITS#4360's kill -9, autorecovery didn't.
>
> current (and only) thread: t@1
>   [1] ___lwp_mutex_lock(0xff050000, 0x0, 0x0, 0x0, 0x0, 0x0), at 0xfed1f950
> =>[2] __db_pthread_mutex_lock(dbenv = 0x38d620, mutexp = 0xff050000), line 220
> in "mut_pthread.c"
>   [3] __db_e_attach(dbenv = 0x38d620, init_flagsp = 0xffbff294), line 290 in
> "env_region.c"
>   [4] __dbenv_open(dbenv = 0x38d620, db_home = 0x2ff260
> "/var/openldap-data/rci", flags = 262208U, mode = 384), line 207 in
> "env_open.c"
>   [5] bdb_db_open(be = 0x2f5e40), line 227 in "init.c"
>   [6] backend_startup_one(be = 0x2f5e40), line 212 in "backend.c"
>   [7] backend_startup(be = 0x2f5e40), line 301 in "backend.c"
>   [8] slap_startup(be = (nil)), line 250 in "init.c"
>   [9] main(argc = 6, argv = 0xffbffbbc), line 791 in "main.c"
>
> truss shows:
> lwp_mutex_lock(0xFF050000)      (sleeping...)
>
> this has been at least 10 minutes, I can't imagine it's going anywhere.
>
> -d -1, unsurprisingly:
> bdb_db_open: dc=rci,dc=rutgers,dc=edu
> bdb_db_open: unclean shutdown detected; attempting recovery.
> bdb_db_open: dbenv_open(/var/openldap-data/rci)
>
> killed off slapd again, db_recover DID work:
> # /usr/local/db4/bin/db_recover  -v
> db_recover: Finding last valid log LSN: file: 2 offset 68073756
> db_recover: Recovery starting from [2][68072423]
> db_recover: Recovery complete at Sat Jan 21 19:00:40 2006
> db_recover: Maximum transaction ID 800075f1 Recovery checkpoint [2][68074997]
>
> [...and then...]
> bdb_db_open: Recovery needed but environment is missing - assuming recovery was
> done manually...
>
> Well, at least that part works...
>


-- 
  -- Howard Chu
  Chief Architect, Symas Corp.  http://www.symas.com
  Director, Highland Sun        http://highlandsun.com/hyc
  OpenLDAP Core Team            http://www.openldap.org/project/