Full_Name: Hallvard B Furuseth Version: 2.4.32 OS: Linux x86_64 URL: Submission from: (NULL) (193.69.163.163) Submitted by: hallvard libmdb omits necessary checks for system call errors, and does not always clean up after the errors it does detect. Also mdb could catch more user errors - including "cannot happen" mutex errors from mutex lock/unlock, since some threading user errors are be non-obvious from the doc. At least sem_wait() needs an EINTR loop. POSIX says "The sem_wait() function is interruptible by the delivery of a signal." The Linux manpage adds that a signal handler always interrupts it regardless of sigaction SA_RESTART. I don't know about BSD. Some error checks first need updates to the Windows emulation, or will look cleaner that way. E.g. pthread_key_create(). I'll fix mdb_env_open()'s error cleanup, at least. (Hyc, that moves cleanup out of mdb_env_setup_locks(). Any preference for using ~25 exit points 'return <ErrCode() or rc>' vs. ~25 'goto <failure label>' which does the same?) Maybe add a flag to request error paranoia, like setting PTHREAD_MUTEX_ERRORCHECK. Trying to catch if a current mdb process has lost is locks. (The fcntl F_SETLK is lost if the process closes another file descriptor to the same file, e.g. after some other module in the process opened and closed the mdb lock file.) Catching a change to the pid from getpid(). Regarding the pid: Repeating getpid() - in mdb_txn_renew0() and mdb_env_close() - is useless since mdb cannot survive fork() anyway. fork drops fcntl locks. Might as well store the pid (or just some unique "environment id") in MDB_env and use that. OTOH some strategically placed getpid()s can then catch the user error of having forked, and set e.g. MDB_FATAL_ERROR. The fork() restriction needs to be documented, as does "don't open or close the same database twice in the same process". mdb_env_open() should catch being called twice. Except, should it try to allow a 2nd call if 1st mdb_env_open() failed? It can easily do that if 1st attempt failed before reading params from the DB files, but fail if the 1st open got that far.
I wrote: > Also mdb could catch more user errors - including "cannot happen" > mutex errors from mutex lock/unlock, since some threading user > errors are be non-obvious from the doc. That is, "cannot happen" when there is no user error. > The fork() restriction needs to be documented, as does "don't > open or close the same database twice in the same process". Correction, "don't have the same database open twice at the same time in the same process".
I messed up last reply headers somehow. Trying again. I wrote: > Also mdb could catch more user errors - including "cannot happen" > mutex errors from mutex lock/unlock, since some threading user > errors are be non-obvious from the doc. That is, "cannot happen" when there is no user error. > The fork() restriction needs to be documented, as does "don't > open or close the same database twice in the same process". Correction, "don't have the same database open twice at the same time in the same process".
h.b.furuseth@usit.uio.no wrote: > Full_Name: Hallvard B Furuseth > Version: 2.4.32 > OS: Linux x86_64 > URL: > Submission from: (NULL) (193.69.163.163) > Submitted by: hallvard > > > libmdb omits necessary checks for system call errors, and does > not always clean up after the errors it does detect. > > Also mdb could catch more user errors - including "cannot happen" > mutex errors from mutex lock/unlock, since some threading user > errors are be non-obvious from the doc. > > At least sem_wait() needs an EINTR loop. POSIX says "The > sem_wait() function is interruptible by the delivery of a signal." > The Linux manpage adds that a signal handler always interrupts it > regardless of sigaction SA_RESTART. I don't know about BSD. > > Some error checks first need updates to the Windows emulation, or > will look cleaner that way. E.g. pthread_key_create(). > > I'll fix mdb_env_open()'s error cleanup, at least. > (Hyc, that moves cleanup out of mdb_env_setup_locks(). Any > preference for using ~25 exit points 'return <ErrCode() or rc>' > vs. ~25 'goto <failure label>' which does the same?) In general I prefer that functions have only one exit point, so multiple gotos is preferable. > Maybe add a flag to request error paranoia, like setting > PTHREAD_MUTEX_ERRORCHECK. Trying to catch if a current mdb > process has lost is locks. (The fcntl F_SETLK is lost if the > process closes another file descriptor to the same file, > e.g. after some other module in the process opened and closed the > mdb lock file.) Catching a change to the pid from getpid(). That's really not interesting to me. No other module has any business opening and reading our lock file. We haven't bothered with this for alock either, and it hasn't been a problem. Change of pid... not worth bothering. Just add a note to the docs. > Regarding the pid: Repeating getpid() - in mdb_txn_renew0() and > mdb_env_close() - is useless since mdb cannot survive fork() > anyway. fork drops fcntl locks. Might as well store the pid (or > just some unique "environment id") in MDB_env and use that. Agreed. I had a patch here that does just that, seems it got lost in some other edits. > OTOH > some strategically placed getpid()s can then catch the user error > of having forked, and set e.g. MDB_FATAL_ERROR. Not really interesting. > The fork() restriction needs to be documented, as does "don't open > or close the same database twice in the same process". Go ahead. > mdb_env_open() should catch being called twice. Except, should > it try to allow a 2nd call if 1st mdb_env_open() failed? It can > easily do that if 1st attempt failed before reading params from > the DB files, but fail if the 1st open got that far. Called twice on the same env handle? Again, not interesting. As a rule, we should of course check for failure return codes from normal operation. We should not add extensive checks for programmer error/misuse of API. -- -- Howard Chu CTO, Symas Corp. http://www.symas.com Director, Highland Sun http://highlandsun.com/hyc/ Chief Architect, OpenLDAP http://www.openldap.org/project/
Howard Chu wrote: >h.b.furuseth@usit.uio.no wrote: >> Maybe add a flag to request error paranoia, like setting >> PTHREAD_MUTEX_ERRORCHECK. Trying to catch if a current mdb >> process has lost is locks. (The fcntl F_SETLK is lost if the >> process closes another file descriptor to the same file, >> e.g. after some other module in the process opened and closed the >> mdb lock file.) Catching a change to the pid from getpid(). > > That's really not interesting to me. No other module has any business > opening and reading our lock file. I skipped a step there. The other module could also be using mdb. "other module" = "a program component which the module using this MDB_env does not know about". They may even be linking different libmdb.so's, so the 'libmdb's can't notify each other via static vars. There are a few other use cases like a monolith program which also can backup a directory - which the user could point at the mdb dir. Anyway, a CAVEATS section in the doc is the obvious fix. > We haven't bothered with this for alock either, and > it hasn't been a problem. Note that this ITS is about libmdb, not back-mdb. >> The fork() restriction needs to be documented, as does "don't open >> or close the same database twice in the same process". > > Go ahead. Hmm. I suggested a too pessimistic caveats section once, I can dig that up and try again.
changed notes changed state Open to Test moved from Incoming to Software Bugs
changed notes changed state Test to Partial
Changes from a failed liblmdb operation can be visible. Sometimes the resulting DB will also be inconsistent. The operation should invalidate the affected transaction/cursors or revert the change. At least I hope it's feasible to invalidate so the user cannot see surprising results, instead of just documenting "don't do that". Related issues: put(MDB_MULTIPLE) can write some of the provided items and then fail, but be able to keep the txn usable. It should invalidate the txn anyway, or document that the caller must check for e.g. return value MDB_PARTIAL_SUCCESS. Maybe invalidate unless the the caller requests support for partial success. put() would update the input params to indicate which items remain unwritten. A failed txn could still grow the txn size, unless MDB "un-touches" pages which memcmp says have not changed, rewinds me_pglast and me_pghead, etc. Which seems a lot of work, likely not worth the trouble. I think cursors need a C_INVALID flag, so future ops cannot give a surprising success result instead of failing: get(MDB_NEXT/MDB_PREV) take ~C_INITIALIZED to mean (re)start from the beginning/end. Don't know how often reverting is a relevant option. Rearranging code can help, but maybe not much. E.g. the mdb_del() in mdb_drop() can do some harmless touches and then fail cleanly - but mdb_drop() must invalidate the txn anyway since mdb_drop0() has deleted pages. Unless it does extra work - remember mt_free_pgs[0] before and after mdb_drop0(), then delete from mt_free_pgs[] the pages added by drop0. Or drop0 could be done last since it can fail cleanly just by resetting mt_free_pgs[0]. But I don't know if del(MDB_MULTIPLE) can also do that if the DB used a subpage. drop0 can't examine the subpage after it got deleted. -- Hallvard
h.b.furuseth@usit.uio.no wrote: > Changes from a failed liblmdb operation can be visible. Sometimes > the resulting DB will also be inconsistent. The operation should > invalidate the affected transaction/cursors or revert the change. The transaction should be invalidated. Nothing more can be done once any error occurs in a transaction. > At least I hope it's feasible to invalidate so the user cannot see > surprising results, instead of just documenting "don't do that". > > Related issues: > > put(MDB_MULTIPLE) can write some of the provided items and then > fail, but be able to keep the txn usable. It should invalidate the > txn anyway, or document that the caller must check for e.g. return > value MDB_PARTIAL_SUCCESS. Maybe invalidate unless the the caller > requests support for partial success. put() would update the input > params to indicate which items remain unwritten. No, "partial success" is not allowed. A transaction is atomic, all or nothing. > A failed txn could still grow the txn size, unless MDB "un-touches" > pages which memcmp says have not changed, rewinds me_pglast and > me_pghead, etc. Which seems a lot of work, likely not worth the > trouble. pghead/pglast should not persist into the environment until a successful commit. Until then any error should just discard them. > > I think cursors need a C_INVALID flag, so future ops cannot give a > surprising success result instead of failing: get(MDB_NEXT/MDB_PREV) > take ~C_INITIALIZED to mean (re)start from the beginning/end. Good idea. > Don't know how often reverting is a relevant option. Rearranging > code can help, but maybe not much. E.g. the mdb_del() in mdb_drop() > can do some harmless touches and then fail cleanly - but mdb_drop() > must invalidate the txn anyway since mdb_drop0() has deleted pages. > Unless it does extra work - remember mt_free_pgs[0] before and after > mdb_drop0(), then delete from mt_free_pgs[] the pages added by > drop0. Or drop0 could be done last since it can fail cleanly just > by resetting mt_free_pgs[0]. But I don't know if del(MDB_MULTIPLE) > can also do that if the DB used a subpage. drop0 can't examine the > subpage after it got deleted. > Reverting is irrelevant. Once a txn is aborted no changes are visible. -- -- Howard Chu CTO, Symas Corp. http://www.symas.com Director, Highland Sun http://highlandsun.com/hyc/ Chief Architect, OpenLDAP http://www.openldap.org/project/
Howard Chu writes: > h.b.furuseth@usit.uio.no wrote: >> Changes from a failed liblmdb operation can be visible. Sometimes >> the resulting DB will also be inconsistent. The operation should >> invalidate the affected transaction/cursors or revert the change. > > The transaction should be invalidated. Nothing more can be done > once any error occurs in a transaction. That would break back-mdb for MDB_NOTFOUND/MDB_KEYEXIST. And at least MDB_KEYEXIST can happen after touching pages, which affects the txn size even if the change "does nothing". I expected to generally invalidate except in necessary cases, and from then on make more cases "gentle" at our convenience without spending much time on it. >> put(MDB_MULTIPLE) can write some of the provided items and then >> fail, but be able to keep the txn usable. (...) > > No, "partial success" is not allowed. A transaction is atomic, all > or nothing. It would be atomic like a partial OS write(). It'd be up to the user what to do next. Not that I care myself if it is supported. >> A failed txn could still grow the txn size, unless MDB "un-touches" >> pages which memcmp says have not changed, rewinds me_pglast and >> me_pghead, etc. Which seems a lot of work, likely not worth the >> trouble. > > pghead/pglast should not persist into the environment until a successful > commit. Until then any error should just discard them. Sorry, I meant a failed _operation_ could grow them and thus affect the committed freelist size. >> I think cursors need a C_INVALID flag, so future ops cannot give a >> surprising success result instead of failing: get(MDB_NEXT/MDB_PREV) >> take ~C_INITIALIZED to mean (re)start from the beginning/end. > > Good idea. ...or C_VALID so a single OR/AND can set/clear them both. >> Don't know how often reverting is a relevant option. (...) > > Reverting is irrelevant. Once a txn is aborted no changes are visible. True, but I meant while the txn is still live and usable. Hopefully it's only relevant for "nice to have" cases. -- Hallvard
I wrote: > at > least MDB_KEYEXIST can happen after touching pages, which affects > the txn size even if the change "does nothing". Only when updating subDBs, I think. -- Hallvard
Some fixes in mdb.master. Some fixes in master Some fixes in RE24