[Date Prev][Date Next] [Chronological] [Thread] [Top]

Re: (ITS#7377) Poor libmdb error handling

h.b.furuseth@usit.uio.no wrote:
> Full_Name: Hallvard B Furuseth
> Version: 2.4.32
> OS: Linux x86_64
> URL: 
> Submission from: (NULL) (
> Submitted by: hallvard
> libmdb omits necessary checks for system call errors, and does
> not always clean up after the errors it does detect.
> Also mdb could catch more user errors - including "cannot happen"
> mutex errors from mutex lock/unlock, since some threading user
> errors are be non-obvious from the doc.
> At least sem_wait() needs an EINTR loop.  POSIX says "The
> sem_wait() function is interruptible by the delivery of a signal."
> The Linux manpage adds that a signal handler always interrupts it
> regardless of sigaction SA_RESTART.  I don't know about BSD.
> Some error checks first need updates to the Windows emulation, or
> will look cleaner that way.  E.g. pthread_key_create().
> I'll fix mdb_env_open()'s error cleanup, at least.
> (Hyc, that moves cleanup out of mdb_env_setup_locks().  Any
> preference for using ~25 exit points 'return <ErrCode() or rc>'
> vs. ~25 'goto <failure label>' which does the same?)

In general I prefer that functions have only one exit point, so multiple gotos
is preferable.

> Maybe add a flag to request error paranoia, like setting
> PTHREAD_MUTEX_ERRORCHECK.  Trying to catch if a current mdb
> process has lost is locks. (The fcntl F_SETLK is lost if the
> process closes another file descriptor to the same file,
> e.g. after some other module in the process opened and closed the
> mdb lock file.)  Catching a change to the pid from getpid().

That's really not interesting to me. No other module has any business opening
and reading our lock file. We haven't bothered with this for alock either, and
it hasn't been a problem.

Change of pid... not worth bothering. Just add a note to the docs.

> Regarding the pid: Repeating getpid() - in mdb_txn_renew0() and
> mdb_env_close() - is useless since mdb cannot survive fork()
> anyway.  fork drops fcntl locks.  Might as well store the pid (or
> just some unique "environment id") in MDB_env and use that.

Agreed. I had a patch here that does just that, seems it got lost in some
other edits.

> some strategically placed getpid()s can then catch the user error
> of having forked, and set e.g. MDB_FATAL_ERROR.

Not really interesting.

> The fork() restriction needs to be documented, as does "don't open
> or close the same database twice in the same process".

Go ahead.

> mdb_env_open() should catch being called twice.  Except, should
> it try to allow a 2nd call if 1st mdb_env_open() failed?  It can
> easily do that if 1st attempt failed before reading params from
> the DB files, but fail if the 1st open got that far.

Called twice on the same env handle? Again, not interesting.

As a rule, we should of course check for failure return codes from normal
operation. We should not add extensive checks for programmer error/misuse of API.

  -- Howard Chu
  CTO, Symas Corp.           http://www.symas.com
  Director, Highland Sun     http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/