[Date Prev][Date Next] [Chronological] [Thread] [Top]

Re: (ITS#7825) LMDB: misleading error message



--001a11c13346e990f604f57e2991
Content-Type: text/plain; charset=UTF-8

Hi,

For further information, see the discussion here:
https://github.com/Venemo/node-lmdb/pull/20
We genuinely thought that there had been something wrong with our value
sizes somewhere, so I wasted several hours debugging in the wrong place.
Then, by accident I discovered that it's in fact an issue with V8's GC.

I definitely suggest to choose one of the solutions that don't cost any
execution time when there is no error.

Perhaps mdb_dbi_close could return an error (or crash) when the dbi is
still used by a cursor. That'd eliminate the issue but it'd also mean an
API/ABI break.

Best regards,
Timur




On Tue, Mar 25, 2014 at 10:37 AM, Hallvard Breien Furuseth <
h.b.furuseth@usit.uio.no> wrote:

> On Sat, 2014-03-22 at 22:49 +0000, timur.kristof@gmail.com wrote:
> > When you operate on a dbi using a cursor and accidentally close the dbi
> before
> > committing the transaction of the cursor, mdb_txn_commit will return
> > MDB_BAD_VALSIZE. This is quite misleading because the problem doesn't
> actually
> > have anything to do with what the error message suggests. ("Too big
> key/data,
> > key is empty, or wrong DUPFIXED size")
> >
> > I suggest to either add a new kind of error code for this situation or
> to extend
> > the description of MDB_BAD_VALSIZE.
>
> Let's see -- here are the options I can see (after an IRC chat):
>
> 1) Extending the MDB_BAD_VALSIZE description. Simple. However:
>
> This is documented as a user error, and it's one which lmdb
> won't always catch.  If you also called mdb_dbi_open() and the
> DBI got reused, lmdb silently updates the wrong named database.
>
> So I'm a bit wary about both of these fixes: Documenting this
> case can be misleading if it makes users expect it to be caught.
>
> Unless people know better - it corresponds to EBADF for using a
> closed file descriptor, not caught if open() reused it.
>
>
> 2) Catch this case and instead return a generic "something is
> wrong" - MDB_BAD_TXN or MDB_INCOMPATIBLE.
> And maybe assert(did not happen) to teach users to avoid this.
>
> Needs to be done where md_name is used: Commit/drop(named DB),
> page_search(stale named DB), cursor_touch(clean named DB).  Or
> pass F_SUBDATA inwards and tweak the "return MDB_BAD_VALSIZE"
> statements to detect this.
>
> After playing around a bit, I guess this is my favorite.
>
>
> 3) Leave the misleading message alone, which can send the user
> off to a wild goose chase (as described on IRC).
>
>
> Or we can add code to also catch close followed by open:
>
> 4) Put hash(DB name) in MDB_db.md_pad and verify it before
> overwriting a named DB.  Costs execution time even when there's
> no error.  Must be done in commit, drop and maybe cursor_touch.
> (I've pushed a demo patch for just commit.)
>
> 5) Add field MDB_dbx.md_dbiseq = DBI usage sequence number,
> incremented when dbi_open creates (not reuses) a DBI and in
> dbi_close.  Copy it to a new malloced array txn->mt_dbiseqs[]
> in write txns, verify it in at least commit and drop.
> This means close()-open(same DB) will also be caught, which
> seems a good thing since with (4) it will work unreliably:
> Only if open() reuses the same DBI for the same DB.
>

--001a11c13346e990f604f57e2991
Content-Type: text/html; charset=UTF-8
Content-Transfer-Encoding: quoted-printable

<div dir=3D"ltr"><div><div>Hi,<br><br>For further information, see the disc=
ussion here: <a href=3D"https://github.com/Venemo/node-lmdb/pull/20";>https:=
//github.com/Venemo/node-lmdb/pull/20</a><br></div><div>We genuinely though=
t that there had been something wrong with our value sizes somewhere, so I =
wasted several hours debugging in the wrong place. Then, by accident I disc=
overed that it&#39;s in fact an issue with V8&#39;s GC.<br>
</div><div><br>I definitely suggest to choose one of the solutions that don=
&#39;t cost any execution time when there is no error. <br><br></div>Perhap=
s mdb_dbi_close could return an error (or crash) when the dbi is still used=
 by a cursor. That&#39;d eliminate the issue but it&#39;d also mean an API/=
ABI break.<br>
<br></div>Best regards,<br>Timur<br><div><div><br><br></div></div></div><di=
v class=3D"gmail_extra"><br><br><div class=3D"gmail_quote">On Tue, Mar 25, =
2014 at 10:37 AM, Hallvard Breien Furuseth <span dir=3D"ltr">&lt;<a href=3D=
"mailto:h.b.furuseth@usit.uio.no"; target=3D"_blank">h.b.furuseth@usit.uio.n=
o</a>&gt;</span> wrote:<br>
<blockquote class=3D"gmail_quote" style=3D"margin:0 0 0 .8ex;border-left:1p=
x #ccc solid;padding-left:1ex">On Sat, 2014-03-22 at 22:49 +0000, <a href=
=3D"mailto:timur.kristof@gmail.com";>timur.kristof@gmail.com</a> wrote:<br>
&gt; When you operate on a dbi using a cursor and accidentally close the db=
i before<br>
&gt; committing the transaction of the cursor, mdb_txn_commit will return<b=
r>
&gt; MDB_BAD_VALSIZE. This is quite misleading because the problem doesn&#3=
9;t actually<br>
&gt; have anything to do with what the error message suggests. (&quot;Too b=
ig key/data,<br>
&gt; key is empty, or wrong DUPFIXED size&quot;)<br>
&gt;<br>
&gt; I suggest to either add a new kind of error code for this situation or=
 to extend<br>
&gt; the description of MDB_BAD_VALSIZE.<br>
<br>
Let&#39;s see -- here are the options I can see (after an IRC chat):<br>
<br>
1) Extending the MDB_BAD_VALSIZE description. Simple. However:<br>
<br>
This is documented as a user error, and it&#39;s one which lmdb<br>
won&#39;t always catch. =C2=A0If you also called mdb_dbi_open() and the<br>
DBI got reused, lmdb silently updates the wrong named database.<br>
<br>
So I&#39;m a bit wary about both of these fixes: Documenting this<br>
case can be misleading if it makes users expect it to be caught.<br>
<br>
Unless people know better - it corresponds to EBADF for using a<br>
closed file descriptor, not caught if open() reused it.<br>
<br>
<br>
2) Catch this case and instead return a generic &quot;something is<br>
wrong&quot; - MDB_BAD_TXN or MDB_INCOMPATIBLE.<br>
And maybe assert(did not happen) to teach users to avoid this.<br>
<br>
Needs to be done where md_name is used: Commit/drop(named DB),<br>
page_search(stale named DB), cursor_touch(clean named DB). =C2=A0Or<br>
pass F_SUBDATA inwards and tweak the &quot;return MDB_BAD_VALSIZE&quot;<br>
statements to detect this.<br>
<br>
After playing around a bit, I guess this is my favorite.<br>
<br>
<br>
3) Leave the misleading message alone, which can send the user<br>
off to a wild goose chase (as described on IRC).<br>
<br>
<br>
Or we can add code to also catch close followed by open:<br>
<br>
4) Put hash(DB name) in MDB_db.md_pad and verify it before<br>
overwriting a named DB. =C2=A0Costs execution time even when there&#39;s<br=
>
no error. =C2=A0Must be done in commit, drop and maybe cursor_touch.<br>
(I&#39;ve pushed a demo patch for just commit.)<br>
<br>
5) Add field MDB_dbx.md_dbiseq =3D DBI usage sequence number,<br>
incremented when dbi_open creates (not reuses) a DBI and in<br>
dbi_close. =C2=A0Copy it to a new malloced array txn-&gt;mt_dbiseqs[]<br>
in write txns, verify it in at least commit and drop.<br>
This means close()-open(same DB) will also be caught, which<br>
seems a good thing since with (4) it will work unreliably:<br>
Only if open() reuses the same DBI for the same DB.<br>
</blockquote></div><br></div>

--001a11c13346e990f604f57e2991--