Issue 9037 - observing crash in mdb_cursor_put()
Summary: observing crash in mdb_cursor_put()
Status: RESOLVED FIXED
Alias: None
Product: LMDB
Classification: Unclassified
Component: liblmdb (show other issues)
Version: 0.9.32
Hardware: All All
: --- normal
Target Milestone: 0.9.33
Assignee: Howard Chu
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2019-06-17 06:11 UTC by grobins@pulsesecure.net
Modified: 2024-03-26 16:28 UTC (History)
2 users (show)

See Also:


Attachments
Attempted reproduction (2.76 KB, text/x-csrc)
2024-03-22 16:13 UTC, Howard Chu
Details

Note You need to log in before you can comment on or make changes to this issue.
Description grobins@pulsesecure.net 2019-06-17 06:11:29 UTC
Full_Name: Robins George
Version: 0.9.18
OS: Centos
URL: ftp://ftp.openldap.org/incoming/
Submission from: (NULL) (121.244.154.134)


I am seeing LMDB crash, please find the stack trace.

#0  0x009e362e in mdb_cursor_put (mc=0xffd59bb8, key=0xffd59d14,
data=0xffd59d0c, flags=0) at mdb.c:6688
#1  0x009e48ec in mdb_put (txn=0xd74d3008, dbi=2, key=0xffd59d14,
data=0xffd59d0c, flags=0) at mdb.c:8771
#2  0x00f5cef8 in LMDB::LMDBContext::createSession (this=0x80cc610,
sid=0x80cd0fc "sid1422419e5fd6cd224f278d74293c50f5ef96593700000000+") at
lmdbint.cc:460
#3  0x0805398e in updateLMDB (request=..., forward=@0xffd59fff) at
sessionserver.cc:1154
#4  processRequestWithoutResponse (request=..., forward=@0xffd59fff) at
sessionserver.cc:1240
#5  0x08056936 in ZSubHandler::ioReady (this=0xffd5aa1c, fd=20) at
sessionserver.cc:1501
#6  0x00f870d2 in runCoreDispatcher (default_t=<value optimized out>, flags=-1)
at fds.cc:889
#7  0x00f87b37 in DSEvntFds::runDispatcher () at fds.cc:945
#8  0x08056248 in main () at sessionserver.cc:1820

Anybody has seen similar issue before?
Comment 1 Howard Chu 2019-06-17 12:25:09 UTC
grobins@pulsesecure.net wrote:
> Full_Name: Robins George
> Version: 0.9.18
> OS: Centos
> URL: ftp://ftp.openldap.org/incoming/
> Submission from: (NULL) (121.244.154.134)
> 
> 
> I am seeing LMDB crash, please find the stack trace.
> 
> #0  0x009e362e in mdb_cursor_put (mc=0xffd59bb8, key=0xffd59d14,
> data=0xffd59d0c, flags=0) at mdb.c:6688
> #1  0x009e48ec in mdb_put (txn=0xd74d3008, dbi=2, key=0xffd59d14,
> data=0xffd59d0c, flags=0) at mdb.c:8771
> #2  0x00f5cef8 in LMDB::LMDBContext::createSession (this=0x80cc610,
> sid=0x80cd0fc "sid1422419e5fd6cd224f278d74293c50f5ef96593700000000+") at
> lmdbint.cc:460
> #3  0x0805398e in updateLMDB (request=..., forward=@0xffd59fff) at
> sessionserver.cc:1154
> #4  processRequestWithoutResponse (request=..., forward=@0xffd59fff) at
> sessionserver.cc:1240
> #5  0x08056936 in ZSubHandler::ioReady (this=0xffd5aa1c, fd=20) at
> sessionserver.cc:1501
> #6  0x00f870d2 in runCoreDispatcher (default_t=<value optimized out>, flags=-1)
> at fds.cc:889
> #7  0x00f87b37 in DSEvntFds::runDispatcher () at fds.cc:945
> #8  0x08056248 in main () at sessionserver.cc:1820
> 
> Anybody has seen similar issue before?

Doesn't sound familiar. But 0.9.18 is quite old. If you can reproduce this
issue in 0.9.23 then we'll take a look. Test code to reproduce the problem
would also be needed.

-- 
  -- Howard Chu
  CTO, Symas Corp.           http://www.symas.com
  Director, Highland Sun     http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/

Comment 2 Myk Melez 2019-06-18 00:27:05 UTC
hyc@symas.com wrote on 2019-06-17 05:25:
> grobins@pulsesecure.net  wrote:
>> I am seeing LMDB crash, please find the stack trace.
>>
>> #0  0x009e362e in mdb_cursor_put (mc=0xffd59bb8, key=0xffd59d14,
>> data=0xffd59d0c, flags=0) at mdb.c:6688
>> #1  0x009e48ec in mdb_put (txn=0xd74d3008, dbi=2, key=0xffd59d14,
>> data=0xffd59d0c, flags=0) at mdb.c:8771
>> …
>>
>> Anybody has seen similar issue before?
> Doesn't sound familiar. But 0.9.18 is quite old. If you can reproduce this
> issue in 0.9.23 then we'll take a look. Test code to reproduce the problem
> would also be needed.
I'm seeing this in Firefox, which uses LMDB 0.9.23 (with minor changes). 
Line 6688 in 0.9.18 occurs at line 6938 in 0.9.23 (line 6937 in Firefox 
since we landed ITS#9030), and that's where we see crashes.

I haven't reported it here yet because I haven't been able to confirm 
that it's a bug in LMDB as opposed to my own code. In fact I haven't 
been able to reproduce it at all, I've only seen it in crash reports 
submitted by Firefox installations (almost exclusively on Windows). So I 
don't have test code to reproduce the problem.

Nevertheless, FWIW, here's the Firefox bug that tracks the issue: 
https://bugzilla.mozilla.org/show_bug.cgi?id=1538541. And here are its 
crash reports: 
https://crash-stats.mozilla.org/signature/?signature=mdb_cursor_put 
(only the last seven days of reports shown by default, but this has been 
happening since we started using LMDB in Firefox nightly builds a couple 
of months ago).

I've examined some of the dumps, and mc->mc_top is 0 when the crash 
occurs, while mc->mc_pg[0] is a NULL pointer. So presumably the crash 
occurs because IS_LEAF2 tries to dereference mc->mc_pg[mc->mc_top].

Further investigation shows that insert_data and insert_key are both 
MDB_NOTFOUND, and flags is 0, so it isn't MDB_CURRENT, nor does it 
contain MDB_APPEND. If I understand the code in mdb_cursor_put 
correctly, this means that mdb_cursor_set was called on line 6614.

And mdb_cursor_set is in the stack of another crash I've been 
investigating in mdb_page_search_root 
(https://bugzilla.mozilla.org/show_bug.cgi?id=1550174, 
https://crash-stats.mozilla.org/signature/?signature=mdb_page_search_root), 
which happens on all of Firefox's primary platforms (Windows, macOS, Linux).

But I haven't been able to reproduce that one either, on any of those 
platforms, so I have no idea if they're related (nor if mdb_cursor_set 
is even implicated in this crash). And I still can't say that either is 
an LMDB bug.

-myk


Comment 3 grobins@pulsesecure.net 2019-06-18 06:22:55 UTC
Hello Howard,
we are not able to reproduce the issue consistently so test code is not available at this time, and I do believe this issue will be present in the latest code as well since Myk also reported the same..  Attaching analysis from our side, hope it helps.

The call stack what we have seen is,

    #0 0x0011b62e in mdb_cursor_put (mc=0xffe52d18, key=0xffe52e74, data=0xffe52e6c, flags=0) at mdb.c:6688
    #1 0x0011c8ec in mdb_put (txn=0xdc13f008, dbi=2, key=0xffe52e74, data=0xffe52e6c, flags=0) at mdb.c:8771
    #2 0x00b792b8 in LMDB::LMDBContext::createSession (this=0x80ca418, sid=0x8297be4 "sidebf3708f0fd9fcb8e062529da47adf51402dfc4600000000+") at lmdbint.cc:460
    #3 0x08053564 in updateLMDB (request=..., forward=@0xffe5315f) at sessionserver.cc:1075
    #4 processRequestWithoutResponse (request=..., forward=@0xffe5315f) at sessionserver.cc:1160
    #5 0x08056a80 in processRequest (this=0x80b2ac0, fd=33) at sessionserver.cc:1189
    #6 readFromSock (this=0x80b2ac0, fd=33) at sessionserver.cc:1342
    #7 SockHandler::ioReady (this=0x80b2ac0, fd=33) at sessionserver.cc:1455
    #8 0x00ba2592 in runCoreDispatcher (default_t=<value optimized out>, flags=-1) at fds.cc:889
    #9 0x00ba2ff7 in DSEvntFds::runDispatcher () at fds.cc:945
    #10 0x080559c6 in main () at sessionserver.cc:1739

Here it is helpful to examine frames #0 and #1 in detail in relation to the relevant source code /libraries/liblmdb/mdb.c.

Frame #1

#1  0x0011c8ec in mdb_put (txn=0xdc13f008, dbi=2, key=0xffe52e74, data=0xffe52e6c, flags=0) at mdb.c:8771
        mc = {mc_next = 0x0, mc_backup = 0x0, mc_xcursor = 0x0, mc_txn = 0xdc13f008, mc_dbi = 2, mc_db = 0xdc13f088, mc_dbx = 0xe1215038, mc_dbflag = 0xdc1cf09a "\033", '\032' <repeats 51 times>, mc_snum = 0, mc_top = 0, mc_flags = 0, mc_pg = {0x0, 0x80b2ac0, 0xffe52d68, 0x28afce, 0xdc13f008, 0x80ce9b0, 0xffe52dc8, 0x4c5ff4, 0x2b, 0x80b2ac0, 0xffe52d98, 0x49209a, 0x2b, 0x4c5ff4, 0x2121cb, 0x21576c, 0x8297d34, 0x3497af, 0x21576c, 0x21316a, 0x8297d34, 0x80ac7b0, 0x1e, 0x46eed6, 0x2b, 0x0, 0x80ce9b0, 0x4c5ff4, 0x80ac7b0, 0x8297d34, 0xffe52df8, 0x46fb96}, mc_ki = {0, 2089, 51120, 2058, 30, 0, 16796, 17, 11760, 65509, 3, 0, 32040, 2089, 30, 0, 0, 0, 0, 0, 3, 0, 24564, 76, 51120, 2058, 10944, 2059, 11800, 65509, 64758, 70}} <--------
	
        mx = {mx_cursor = {mc_next = 0x38, mc_backup = 0x3, mc_xcursor = 0x0, mc_txn = 0x0, mc_dbi = 0, mc_db = 0x28, mc_dbx = 0x0, mc_dbflag = 0x0, mc_snum = 3, mc_top = 0, mc_flags = 17, mc_pg = {0x3a93d8, 0x10, 0x0, 0x18, 0x3a93a0, 0x0, 0x3a93d0, 0x289abe, 0x3a93a0, 0xffe52d3f, 0xffe52c68, 0x28afce, 0xffe52dbc, 0xffe52db4, 0x3, 0x4c5ff4, 0x11, 0x36c99b, 0x6e, 0x77, 0x7c, 0x5b, 0x38, 0x289abe, 0x0, 0x0, 0x0, 0x28, 0x0, 0x0, 0x3, 0x10}, mc_ki = {37848, 58, 51611, 54, 110, 0, 119, 0, 124, 0, 91, 0, 58, 0, 39614, 40, 0, 0, 0, 0, 0, 0, 160, 0, 0, 0, 2, 0, 18, 0, 136, 0}}, mx_db = {md_pad = 3838936, md_flags = 51611, md_depth = 54, md_branch_pages = 110, md_leaf_pages = 119, md_overflow_pages = 124, md_entries = 91, md_root = 56}, mx_dbx = {md_name = {mv_size = 6, mv_data = 0x0}, md_cmp = 0, md_dcmp = 0, md_rel = 0x40, md_relctx = 0x0}, mx_dbflag = 0 '\000'}
        rc = 0 

And source code for mdb_put() is from mdb.c

int
mdb_put(MDB_txn *txn, MDB_dbi dbi,
    MDB_val *key, MDB_val *data, unsigned int flags)
{
	MDB_cursor mc;
	MDB_xcursor mx;
	int rc;

	if (!key || !data || !TXN_DBI_EXIST(txn, dbi, DB_USRVALID))
		return EINVAL;

	if (flags & ~(MDB_NOOVERWRITE|MDB_NODUPDATA|MDB_RESERVE|MDB_APPEND|MDB_APPENDDUP))
		return EINVAL;

	if (txn->mt_flags & (MDB_TXN_RDONLY|MDB_TXN_BLOCKED))
		return (txn->mt_flags & MDB_TXN_RDONLY) ? EACCES : MDB_BAD_TXN;

	mdb_cursor_init(&mc, txn, dbi, &mx); <-------- see source code below
	mc.mc_next = txn->mt_cursors[dbi];
	txn->mt_cursors[dbi] = &mc;
	rc = mdb_cursor_put(&mc, key, data, flags); <-------- see Frame #0
	txn->mt_cursors[dbi] = mc.mc_next;
	return rc;
}


Source Code For mdb_cursor_init() From mdb.c
/** Initialize a cursor for a given transaction and database. */
static void
mdb_cursor_init(MDB_cursor *mc, MDB_txn *txn, MDB_dbi dbi, MDB_xcursor *mx)
{
	mc->mc_next = NULL;
	mc->mc_backup = NULL;
	mc->mc_dbi = dbi;
	mc->mc_txn = txn;
	mc->mc_db = &txn->mt_dbs[dbi];
	mc->mc_dbx = &txn->mt_dbxs[dbi];
	mc->mc_dbflag = &txn->mt_dbflags[dbi];
	mc->mc_snum = 0; <-------- 
	mc->mc_top = 0; <--------  
	mc->mc_pg[0] = 0; <-------- 
	mc->mc_ki[0] = 0;
	mc->mc_flags = 0;
	if (txn->mt_dbs[dbi].md_flags & MDB_DUPSORT) {
		mdb_tassert(txn, mx != NULL);
		mc->mc_xcursor = mx;
		mdb_xcursor_init0(mc);
	} else {
		mc->mc_xcursor = NULL;
	}
	if (*mc->mc_dbflag & DB_STALE) { <-------- false *mc->mc_dbflag = 27, DB_STALE = 0x02		/**< Named-DB record is older than txnID */
		mdb_page_search(mc, NULL, MDB_PS_ROOTONLY); 
	}
}

From this we know that when mdb_cursor_put() is called, the following values remain assigned:

              mc->mc_snum = 0; <--------
	mc->mc_top = 0; <--------  
	mc->mc_pg[0] = 0; <-------- 

and its noted these two values still have their initial values at the time of the segmentation fault.

(gdb) fr 0
#0 0x0011b62e in mdb_cursor_put (mc=0xffe52d18, key=0xffe52e74, data=0xffe52e6c, flags=0) at mdb.c:6688
6688 in mdb.c
(gdb) p mc->mc_top
$1 = 0
(gdb) p mc->mc_pg[mc->mc_top]
$19 = (MDB_page *) 0x0

Let us consider how the path taken through mdb_cursor_put() could account for these null variable values which result in a segmentation fault.

Frame #0

#0  0x0011b62e in mdb_cursor_put (mc=0xffe52d18, key=0xffe52e74, data=0xffe52e6c, flags=0) at mdb.c:6688 <-------- flags = 0
        env = 0x80ce9b0
        leaf = <value optimized out>
        fp = <value optimized out>
        mp = 0x38
        sub_root = 0x0 
        fp_flags = <value optimized out>
        xdata = {mv_size = 3838880, mv_data = 0xdc1cf09a}
        rdata = 0xffe52e6c
        dkey = {mv_size = 0, mv_data = 0x3a93cc}
        olddata = {mv_size = 22, mv_data = 0x36c820}
        dummy = {md_pad = 22, md_flags = 11128, md_depth = 65509, md_branch_pages = 16, md_leaf_pages = 1507329, md_overflow_pages = 0, md_entries = 3838928, md_root = 3833844}
        do_sub = 0
        insert_key = -30798 <-------- #define MDB_NOTFOUND	(-30798)
        insert_data = -30798 <-------- #define MDB_NOTFOUND	(-30798)
        mcount = 0
        dcount = 0
        nospill = 0 <--------
        nsize = <value optimized out>
        rc = 0 
        rc2 = <value optimized out>
        nflags = 0 <--------
        __func__ = "mdb_cursor_put"	

We know that mc->mc_top and mc->mc_pg[0] were initialized to 0 in mdb_cursor_init(). As my annotations of the source code below suggest, careful analysis reveals that the paths actually taken through mdb_cursor_put() do not modify these assignments.

Excerpted Source Code For mdb_cursor_put() From mdb.c

int
mdb_cursor_put(MDB_cursor *mc, MDB_val *key, MDB_val *data,
    unsigned int flags)
{
	MDB_env		*env;
	MDB_node	*leaf = NULL;
	MDB_page	*fp, *mp, *sub_root = NULL;
	uint16_t	fp_flags;
	MDB_val		xdata, *rdata, dkey, olddata;
	MDB_db dummy;
	int do_sub = 0, insert_key, insert_data;
	unsigned int mcount = 0, dcount = 0, nospill;
	size_t nsize;
	int rc, rc2;
	unsigned int nflags;
	DKBUF;

	if (mc == NULL || key == NULL)
		return EINVAL;

	env = mc->mc_txn->mt_env;

	/* Check this first so counter will always be zero on any
	 * early failures.
	 */
	if (flags & MDB_MULTIPLE) {
		dcount = data[1].mv_size;
		data[1].mv_size = 0;
		if (!F_ISSET(mc->mc_db->md_flags, MDB_DUPFIXED))
			return MDB_INCOMPATIBLE;
	}

	nospill = flags & MDB_NOSPILL;
	flags &= ~MDB_NOSPILL;

	if (mc->mc_txn->mt_flags & (MDB_TXN_RDONLY|MDB_TXN_BLOCKED))
		return (mc->mc_txn->mt_flags & MDB_TXN_RDONLY) ? EACCES : MDB_BAD_TXN;

	if (key->mv_size-1 >= ENV_MAXKEY(env))
		return MDB_BAD_VALSIZE;

#if SIZE_MAX > MAXDATASIZE
	if (data->mv_size > ((mc->mc_db->md_flags & MDB_DUPSORT) ? ENV_MAXKEY(env) : MAXDATASIZE))
		return MDB_BAD_VALSIZE;
#else
	if ((mc->mc_db->md_flags & MDB_DUPSORT) && data->mv_size > ENV_MAXKEY(env))
		return MDB_BAD_VALSIZE;
#endif

	DPRINTF(("==> put db %d key [%s], size %"Z"u, data size %"Z"u",
		DDBI(mc), DKEY(key), key ? key->mv_size : 0, data->mv_size));

	dkey.mv_size = 0;

	if (flags == MDB_CURRENT) { <-------- false flags = 0
		if (!(mc->mc_flags & C_INITIALIZED))
			return EINVAL;
		rc = MDB_SUCCESS;
	} else if (mc->mc_db->md_root == P_INVALID) { <-------- false, mc->mc_db->md_root = 68
		/* new database, cursor has nothing to point to */
		mc->mc_snum = 0;
		mc->mc_top = 0;
		mc->mc_flags &= ~C_INITIALIZED;
		rc = MDB_NO_ROOT;  
	} else { <-------- executed
		int exact = 0;
		MDB_val d2;
		if (flags & MDB_APPEND) { <-------- false flags = 0
			MDB_val k2;
			rc = mdb_cursor_last(mc, &k2, &d2);
			if (rc == 0) {
				rc = mc->mc_dbx->md_cmp(key, &k2);
				if (rc > 0) {
					rc = MDB_NOTFOUND;
					mc->mc_ki[mc->mc_top]++;
				} else {
					/* new key is <= last key */
					rc = MDB_KEYEXIST;
				}
			}
		} else {<-------- executed
			rc = mdb_cursor_set(mc, key, &d2, MDB_SET, &exact); <-------- returns -30798 which is MDB_NOTFOUND
		}<-------- executed
		if ((flags & MDB_NOOVERWRITE) && rc == 0) { <-------- false flags = 0
			DPRINTF(("duplicate key [%s]", DKEY(key)));
			*data = d2;
			return MDB_KEYEXIST;
		}
		if (rc && rc != MDB_NOTFOUND) <-------- false rc = MDB_NOTFOUND
			return rc;
	}<-------- executed

	if (mc->mc_flags & C_DEL)
		mc->mc_flags ^= C_DEL;

	/* Cursor is positioned, check for room in the dirty list */
	if (!nospill) { <-------- true
		if (flags & MDB_MULTIPLE) {
			rdata = &xdata;
			xdata.mv_size = data->mv_size * dcount;
		} else {
			rdata = data; <-------- executed
		}
		if ((rc2 = mdb_page_spill(mc, key, rdata))) <-------- returned 0 or would have returned
			return rc2;
	}<-------- executed

	if (rc == MDB_NO_ROOT) { <-------- false 
		MDB_page *np;
		/* new database, write a root leaf page */
		DPUTS("allocating new root leaf page");
		if ((rc2 = mdb_page_new(mc, P_LEAF, 1, &np))) {
			return rc2;
		}
		mdb_cursor_push(mc, np);
		mc->mc_db->md_root = np->mp_pgno;
		mc->mc_db->md_depth++;
		*mc->mc_dbflag |= DB_DIRTY;
		if ((mc->mc_db->md_flags & (MDB_DUPSORT|MDB_DUPFIXED))
			== MDB_DUPFIXED)
			np->mp_flags |= P_LEAF2;
		mc->mc_flags |= C_INITIALIZED;
	} else {
		/* make sure all cursor pages are writable */
		rc2 = mdb_cursor_touch(mc); <-------- is this called and returns 0 because it does not return
		if (rc2)
			return rc2;
	} <-------- executed

	insert_key = insert_data = rc; <-------- rc = -30798 or MDB_NOTFOUND!
	if (insert_key) { <-------- true
		/* The key does not exist */
		DPRINTF(("inserting key at index %i", mc->mc_ki[mc->mc_top]));
		if ((mc->mc_db->md_flags & MDB_DUPSORT) &&
			LEAFSIZE(key, data) > env->me_nodemax) <-------- false since mc->mc_db->md_flags = 0
		{
			/* Too big for a node, insert in sub-DB.  Set up an empty
			 * "old sub-page" for prep_subDB to expand to a full page.
			 */
			fp_flags = P_LEAF|P_DIRTY;
			fp = env->me_pbuf;
			fp->mp_pad = data->mv_size; /* used if MDB_DUPFIXED */
			fp->mp_lower = fp->mp_upper = (PAGEHDRSZ-PAGEBASE);
			olddata.mv_size = PAGEHDRSZ;
			goto prep_subDB;
		} <-------- this was not executed, and the very long external else contingency below should be skipped!
	} else {
		/* there's only a key anyway, so this is a no-op */
		if (IS_LEAF2(mc->mc_pg[mc->mc_top])) {
			char *ptr;
			unsigned int ksize = mc->mc_db->md_pad;
			if (key->mv_size != ksize)
				return MDB_BAD_VALSIZE;
			ptr = LEAF2KEY(mc->mc_pg[mc->mc_top], mc->mc_ki[mc->mc_top], ksize);
			memcpy(ptr, key->mv_data, ksize);
fix_parent:
			/* if overwriting slot 0 of leaf, need to
			 * update branch key if there is a parent page
			 */
			if (mc->mc_top && !mc->mc_ki[mc->mc_top]) {
				unsigned short dtop = 1;
				mc->mc_top--;
				/* slot 0 is always an empty key, find real slot */
				while (mc->mc_top && !mc->mc_ki[mc->mc_top]) {
					mc->mc_top--;
					dtop++;
				}
				if (mc->mc_ki[mc->mc_top])
					rc2 = mdb_update_key(mc, key);
				else
					rc2 = MDB_SUCCESS;
				mc->mc_top += dtop;
				if (rc2)
					return rc2;
			}
			return MDB_SUCCESS;
		}

more:
		leaf = NODEPTR(mc->mc_pg[mc->mc_top], mc->mc_ki[mc->mc_top]);
		olddata.mv_size = NODEDSZ(leaf);
		olddata.mv_data = NODEDATA(leaf);

		/* DB has dups? */
		if (F_ISSET(mc->mc_db->md_flags, MDB_DUPSORT)) { 
			/* Prepare (sub-)page/sub-DB to accept the new item,
			 * if needed.  fp: old sub-page or a header faking
			 * it.  mp: new (sub-)page.  offset: growth in page
			 * size.  xdata: node data with new page or DB.
			 */
			unsigned	i, offset = 0;
			mp = fp = xdata.mv_data = env->me_pbuf;
			mp->mp_pgno = mc->mc_pg[mc->mc_top]->mp_pgno;

			/* Was a single item before, must convert now */
			if (!F_ISSET(leaf->mn_flags, F_DUPDATA)) {
				MDB_cmp_func *dcmp;
				/* Just overwrite the current item */
				if (flags == MDB_CURRENT)
					goto current;
				dcmp = mc->mc_dbx->md_dcmp;
#if UINT_MAX < SIZE_MAX
				if (dcmp == mdb_cmp_int && olddata.mv_size == sizeof(size_t))
					dcmp = mdb_cmp_clong;
#endif
				/* does data match? */
				if (!dcmp(data, &olddata)) {
					if (flags & (MDB_NODUPDATA|MDB_APPENDDUP))
						return MDB_KEYEXIST;
					/* overwrite it */
					goto current;
				}

				/* Back up original data item */
				dkey.mv_size = olddata.mv_size;
				dkey.mv_data = memcpy(fp+1, olddata.mv_data, olddata.mv_size);

				/* Make sub-page header for the dup items, with dummy body */
				fp->mp_flags = P_LEAF|P_DIRTY|P_SUBP;
				fp->mp_lower = (PAGEHDRSZ-PAGEBASE);
				xdata.mv_size = PAGEHDRSZ + dkey.mv_size + data->mv_size;
				if (mc->mc_db->md_flags & MDB_DUPFIXED) {
					fp->mp_flags |= P_LEAF2;
					fp->mp_pad = data->mv_size;
					xdata.mv_size += 2 * data->mv_size;	/* leave space for 2 more */
				} else {
					xdata.mv_size += 2 * (sizeof(indx_t) + NODESIZE) +
						(dkey.mv_size & 1) + (data->mv_size & 1);
				}
				fp->mp_upper = xdata.mv_size - PAGEBASE;
				olddata.mv_size = xdata.mv_size; /* pretend olddata is fp */
			} else if (leaf->mn_flags & F_SUBDATA) {
				/* Data is on sub-DB, just store it */
				flags |= F_DUPDATA|F_SUBDATA;
				goto put_sub;
			} else {
				/* Data is on sub-page */
				fp = olddata.mv_data;
				switch (flags) {
				default:
					if (!(mc->mc_db->md_flags & MDB_DUPFIXED)) {
						offset = EVEN(NODESIZE + sizeof(indx_t) +
							data->mv_size);
						break;
					}
					offset = fp->mp_pad;
					if (SIZELEFT(fp) < offset) {
						offset *= 4; /* space for 4 more */
						break;
					}
					/* FALLTHRU: Big enough MDB_DUPFIXED sub-page */
				case MDB_CURRENT:
					fp->mp_flags |= P_DIRTY;
					COPY_PGNO(fp->mp_pgno, mp->mp_pgno);
					mc->mc_xcursor->mx_cursor.mc_pg[0] = fp;
					flags |= F_DUPDATA;
					goto put_sub;
				}
				xdata.mv_size = olddata.mv_size + offset;
			}

			fp_flags = fp->mp_flags;
			if (NODESIZE + NODEKSZ(leaf) + xdata.mv_size > env->me_nodemax) {
					/* Too big for a sub-page, convert to sub-DB */
					fp_flags &= ~P_SUBP;
prep_subDB:
					if (mc->mc_db->md_flags & MDB_DUPFIXED) {
						fp_flags |= P_LEAF2;
						dummy.md_pad = fp->mp_pad;
						dummy.md_flags = MDB_DUPFIXED;
						if (mc->mc_db->md_flags & MDB_INTEGERDUP)
							dummy.md_flags |= MDB_INTEGERKEY;
					} else {
						dummy.md_pad = 0;
						dummy.md_flags = 0;
					}
					dummy.md_depth = 1;
					dummy.md_branch_pages = 0;
					dummy.md_leaf_pages = 1;
					dummy.md_overflow_pages = 0;
					dummy.md_entries = NUMKEYS(fp);
					xdata.mv_size = sizeof(MDB_db);
					xdata.mv_data = &dummy;
					if ((rc = mdb_page_alloc(mc, 1, &mp)))
						return rc;
					offset = env->me_psize - olddata.mv_size;
					flags |= F_DUPDATA|F_SUBDATA;
					dummy.md_root = mp->mp_pgno;
					sub_root = mp;
			}
			if (mp != fp) {
				mp->mp_flags = fp_flags | P_DIRTY;
				mp->mp_pad   = fp->mp_pad;
				mp->mp_lower = fp->mp_lower;
				mp->mp_upper = fp->mp_upper + offset;
				if (fp_flags & P_LEAF2) {
					memcpy(METADATA(mp), METADATA(fp), NUMKEYS(fp) * fp->mp_pad);
				} else {
					memcpy((char *)mp + mp->mp_upper + PAGEBASE, (char *)fp + fp->mp_upper + PAGEBASE,
						olddata.mv_size - fp->mp_upper - PAGEBASE);
					for (i=0; i<NUMKEYS(fp); i++)
						mp->mp_ptrs[i] = fp->mp_ptrs[i] + offset;
				}
			}

			rdata = &xdata;
			flags |= F_DUPDATA;
			do_sub = 1;
			if (!insert_key)
				mdb_node_del(mc, 0);
			goto new_sub;
		}
current:
		/* LMDB passes F_SUBDATA in 'flags' to write a DB record */
		if ((leaf->mn_flags ^ flags) & F_SUBDATA)
			return MDB_INCOMPATIBLE;
		/* overflow page overwrites need special handling */
		if (F_ISSET(leaf->mn_flags, F_BIGDATA)) { <-------- is this taken?
			MDB_page *omp;
			pgno_t pg;
			int level, ovpages, dpages = OVPAGES(data->mv_size, env->me_psize);

			memcpy(&pg, olddata.mv_data, sizeof(pg));
			if ((rc2 = mdb_page_get(mc->mc_txn, pg, &omp, &level)) != 0)
				return rc2;
			ovpages = omp->mp_pages;

			/* Is the ov page large enough? */
			if (ovpages >= dpages) {
			  if (!(omp->mp_flags & P_DIRTY) &&
				  (level || (env->me_flags & MDB_WRITEMAP)))
			  {
				rc = mdb_page_unspill(mc->mc_txn, omp, &omp); <-------- is this taken?
				if (rc)
					return rc;
				level = 0;		/* dirty in this txn or clean */
			  }
			  /* Is it dirty? */
			  if (omp->mp_flags & P_DIRTY) {
				/* yes, overwrite it. Note in this case we don't
				 * bother to try shrinking the page if the new data
				 * is smaller than the overflow threshold.
				 */
				if (level > 1) {
					/* It is writable only in a parent txn */
					size_t sz = (size_t) env->me_psize * ovpages, off;
					MDB_page *np = mdb_page_malloc(mc->mc_txn, ovpages);
					MDB_ID2 id2;
					if (!np)
						return ENOMEM;
					id2.mid = pg;
					id2.mptr = np;
					/* Note - this page is already counted in parent's dirty_room */
					rc2 = mdb_mid2l_insert(mc->mc_txn->mt_u.dirty_list, &id2);
					mdb_cassert(mc, rc2 == 0);
					if (!(flags & MDB_RESERVE)) {
						/* Copy end of page, adjusting alignment so
						 * compiler may copy words instead of bytes.
						 */
						off = (PAGEHDRSZ + data->mv_size) & -sizeof(size_t);
						memcpy((size_t *)((char *)np + off),
							(size_t *)((char *)omp + off), sz - off);
						sz = PAGEHDRSZ;
					}
					memcpy(np, omp, sz); /* Copy beginning of page */
					omp = np;
				}
				SETDSZ(leaf, data->mv_size);
				if (F_ISSET(flags, MDB_RESERVE))
					data->mv_data = METADATA(omp);
				else
					memcpy(METADATA(omp), data->mv_data, data->mv_size);
				return MDB_SUCCESS;
			  }
			}
			if ((rc2 = mdb_ovpage_free(mc, omp)) != MDB_SUCCESS)
				return rc2;
		} else if (data->mv_size == olddata.mv_size) {
			/* same size, just replace it. Note that we could
			 * also reuse this node if the new data is smaller,
			 * but instead we opt to shrink the node in that case.
			 */
			if (F_ISSET(flags, MDB_RESERVE))
				data->mv_data = olddata.mv_data;
			else if (!(mc->mc_flags & C_SUB))
				memcpy(olddata.mv_data, data->mv_data, data->mv_size);
			else {
				memcpy(NODEKEY(leaf), key->mv_data, key->mv_size);
				goto fix_parent;
			}
			return MDB_SUCCESS;
		}
		mdb_node_del(mc, 0);
	} <-------- This entire else contingency was likely skipped!!

	rdata = data;

new_sub:
	nflags = flags & NODE_ADD_FLAGS;
	nsize = IS_LEAF2(mc->mc_pg[mc->mc_top]) ? key->mv_size : mdb_leaf_size(env, key, rdata); <-------- SEGV
...	

This may be an identifiable bug in mdb_cursor_put() as It does not safely handle the case when the noted call to mdb_cursor_set() returns MDB_NOTFOUND (-30798)!

It explicitly fails to preemptively exit by returning an error in this case.

The value of -30798 is in turn assigned to insert_key, which is therefore non-null in the condition of the subsequent if statement.

	insert_key = insert_data = rc; <-------- insert_key = -30798 or MDB_NOTFOUND!
	if (insert_key) { <-------- true
		/* The key does not exist */
		DPRINTF(("inserting key at index %i", mc->mc_ki[mc->mc_top]));
		if ((mc->mc_db->md_flags & MDB_DUPSORT) &&
			LEAFSIZE(key, data) > env->me_nodemax) <-------- false since mc->mc_db->md_flags = 0
		{
			/* Too big for a node, insert in sub-DB.  Set up an empty
			 * "old sub-page" for prep_subDB to expand to a full page.
			 */
			fp_flags = P_LEAF|P_DIRTY;
			fp = env->me_pbuf;
			fp->mp_pad = data->mv_size; /* used if MDB_DUPFIXED */
			fp->mp_lower = fp->mp_upper = (PAGEHDRSZ-PAGEBASE);
			olddata.mv_size = PAGEHDRSZ;
			goto prep_subDB;
		} <-------- this was not executed, and the very long else contingency below is skipped!
	} else {
Since mc->mc_db->md_flags = 0, the contingency in the embedded if statement fails.

The segmentation fault occurs due to the call of IS_LEAF2 on the null pointer mc->mc_pg[0].

	rdata = data;

new_sub:
	nflags = flags & NODE_ADD_FLAGS;
	nsize = IS_LEAF2(mc->mc_pg[mc->mc_top]) ? key->mv_size : mdb_leaf_size(env, key, rdata); <-------- SEGV


Regards,
Robins.

-----Original Message-----
From: Howard Chu <hyc@symas.com> 
Sent: Monday, June 17, 2019 5:55 PM
To: Robins George <grobins@pulsesecure.net>; openldap-its@OpenLDAP.org
Subject: Re: (ITS#9037) observing crash in mdb_cursor_put()

CAUTION: This email originated from outside of PulseSecure. Do not click links or open attachments unless you recognize the sender and know the content is safe.


grobins@pulsesecure.net wrote:
> Full_Name: Robins George
> Version: 0.9.18
> OS: Centos
> URL: ftp://ftp.openldap.org/incoming/
> Submission from: (NULL) (121.244.154.134)
>
>
> I am seeing LMDB crash, please find the stack trace.
>
> #0  0x009e362e in mdb_cursor_put (mc=0xffd59bb8, key=0xffd59d14, 
> data=0xffd59d0c, flags=0) at mdb.c:6688
> #1  0x009e48ec in mdb_put (txn=0xd74d3008, dbi=2, key=0xffd59d14, 
> data=0xffd59d0c, flags=0) at mdb.c:8771
> #2  0x00f5cef8 in LMDB::LMDBContext::createSession (this=0x80cc610, 
> sid=0x80cd0fc "sid1422419e5fd6cd224f278d74293c50f5ef96593700000000+") 
> at
> lmdbint.cc:460
> #3  0x0805398e in updateLMDB (request=..., forward=@0xffd59fff) at
> sessionserver.cc:1154
> #4  processRequestWithoutResponse (request=..., forward=@0xffd59fff) 
> at
> sessionserver.cc:1240
> #5  0x08056936 in ZSubHandler::ioReady (this=0xffd5aa1c, fd=20) at
> sessionserver.cc:1501
> #6  0x00f870d2 in runCoreDispatcher (default_t=<value optimized out>, 
> flags=-1) at fds.cc:889
> #7  0x00f87b37 in DSEvntFds::runDispatcher () at fds.cc:945
> #8  0x08056248 in main () at sessionserver.cc:1820
>
> Anybody has seen similar issue before?


Doesn't sound familiar. But 0.9.18 is quite old. If you can reproduce this issue in 0.9.23 then we'll take a look. Test code to reproduce the problem would also be needed.

--
  -- Howard Chu
  CTO, Symas Corp.           https://nam04.safelinks.protection.outlook.com/?url=http%3A%2F%2Fwww.symas.com&amp;data=02%7C01%7C%7C003b907d2dde44028adc08d6f31ed9c4%7C3290a9179dd643db843ba3e376f9f96c%7C0%7C0%7C636963711169251949&amp;sdata=2uGRoMGVdhCniUDhXfzPgAmiV1kt66dW7u8rDmGt6aY%3D&amp;reserved=0
  Director, Highland Sun     https://nam04.safelinks.protection.outlook.com/?url=http%3A%2F%2Fhighlandsun.com%2Fhyc%2F&amp;data=02%7C01%7C%7C003b907d2dde44028adc08d6f31ed9c4%7C3290a9179dd643db843ba3e376f9f96c%7C0%7C0%7C636963711169251949&amp;sdata=3%2FLRPM%2BHtIaRoKaMz%2F27%2Fo4ZT8XhTcFJTvypBssx0r8%3D&amp;reserved=0
  Chief Architect, OpenLDAP  https://nam04.safelinks.protection.outlook.com/?url=http%3A%2F%2Fwww.openldap.org%2Fproject%2F&amp;data=02%7C01%7C%7C003b907d2dde44028adc08d6f31ed9c4%7C3290a9179dd643db843ba3e376f9f96c%7C0%7C0%7C636963711169251949&amp;sdata=7iTXMg2NSeXG%2FKp9qaHQPOC%2FKbjCV%2BK3Js%2FenhFpLGY%3D&amp;reserved=0
Comment 4 Howard Chu 2019-06-18 08:30:15 UTC
Robins George wrote:
> Hello Howard,
> we are not able to reproduce the issue consistently so test code is not available at this time, and I do believe this issue will be present in the latest code as well since Myk also reported the same..  Attaching analysis from our side, hope it helps.

Thanks. Notes below...
> 
> The call stack what we have seen is,
> 
>     #0 0x0011b62e in mdb_cursor_put (mc=0xffe52d18, key=0xffe52e74, data=0xffe52e6c, flags=0) at mdb.c:6688
>     #1 0x0011c8ec in mdb_put (txn=0xdc13f008, dbi=2, key=0xffe52e74, data=0xffe52e6c, flags=0) at mdb.c:8771
>     #2 0x00b792b8 in LMDB::LMDBContext::createSession (this=0x80ca418, sid=0x8297be4 "sidebf3708f0fd9fcb8e062529da47adf51402dfc4600000000+") at lmdbint.cc:460
>     #3 0x08053564 in updateLMDB (request=..., forward=@0xffe5315f) at sessionserver.cc:1075
>     #4 processRequestWithoutResponse (request=..., forward=@0xffe5315f) at sessionserver.cc:1160
>     #5 0x08056a80 in processRequest (this=0x80b2ac0, fd=33) at sessionserver.cc:1189
>     #6 readFromSock (this=0x80b2ac0, fd=33) at sessionserver.cc:1342
>     #7 SockHandler::ioReady (this=0x80b2ac0, fd=33) at sessionserver.cc:1455
>     #8 0x00ba2592 in runCoreDispatcher (default_t=<value optimized out>, flags=-1) at fds.cc:889
>     #9 0x00ba2ff7 in DSEvntFds::runDispatcher () at fds.cc:945
>     #10 0x080559c6 in main () at sessionserver.cc:1739
> 
> Here it is helpful to examine frames #0 and #1 in detail in relation to the relevant source code /libraries/liblmdb/mdb.c.
> 
> Frame #1
> 
> #1  0x0011c8ec in mdb_put (txn=0xdc13f008, dbi=2, key=0xffe52e74, data=0xffe52e6c, flags=0) at mdb.c:8771
>         mc = {mc_next = 0x0, mc_backup = 0x0, mc_xcursor = 0x0, mc_txn = 0xdc13f008, mc_dbi = 2, mc_db = 0xdc13f088, mc_dbx = 0xe1215038, mc_dbflag = 0xdc1cf09a "\033", '\032' <repeats 51 times>, mc_snum = 0, mc_top = 0, mc_flags = 0, mc_pg = {0x0, 0x80b2ac0, 0xffe52d68, 0x28afce, 0xdc13f008, 0x80ce9b0, 0xffe52dc8, 0x4c5ff4, 0x2b, 0x80b2ac0, 0xffe52d98, 0x49209a, 0x2b, 0x4c5ff4, 0x2121cb, 0x21576c, 0x8297d34, 0x3497af, 0x21576c, 0x21316a, 0x8297d34, 0x80ac7b0, 0x1e, 0x46eed6, 0x2b, 0x0, 0x80ce9b0, 0x4c5ff4, 0x80ac7b0, 0x8297d34, 0xffe52df8, 0x46fb96}, mc_ki = {0, 2089, 51120, 2058, 30, 0, 16796, 17, 11760, 65509, 3, 0, 32040, 2089, 30, 0, 0, 0, 0, 0, 3, 0, 24564, 76, 51120, 2058, 10944, 2059, 11800, 65509, 64758, 70}} <--------
> 	
>         mx = {mx_cursor = {mc_next = 0x38, mc_backup = 0x3, mc_xcursor = 0x0, mc_txn = 0x0, mc_dbi = 0, mc_db = 0x28, mc_dbx = 0x0, mc_dbflag = 0x0, mc_snum = 3, mc_top = 0, mc_flags = 17, mc_pg = {0x3a93d8, 0x10, 0x0, 0x18, 0x3a93a0, 0x0, 0x3a93d0, 0x289abe, 0x3a93a0, 0xffe52d3f, 0xffe52c68, 0x28afce, 0xffe52dbc, 0xffe52db4, 0x3, 0x4c5ff4, 0x11, 0x36c99b, 0x6e, 0x77, 0x7c, 0x5b, 0x38, 0x289abe, 0x0, 0x0, 0x0, 0x28, 0x0, 0x0, 0x3, 0x10}, mc_ki = {37848, 58, 51611, 54, 110, 0, 119, 0, 124, 0, 91, 0, 58, 0, 39614, 40, 0, 0, 0, 0, 0, 0, 160, 0, 0, 0, 2, 0, 18, 0, 136, 0}}, mx_db = {md_pad = 3838936, md_flags = 51611, md_depth = 54, md_branch_pages = 110, md_leaf_pages = 119, md_overflow_pages = 124, md_entries = 91, md_root = 56}, mx_dbx = {md_name = {mv_size = 6, mv_data = 0x0}, md_cmp = 0, md_dcmp = 0, md_rel = 0x40, md_relctx = 0x0}, mx_dbflag = 0 '\000'}
>         rc = 0 
> 
> And source code for mdb_put() is from mdb.c
> 
> int
> mdb_put(MDB_txn *txn, MDB_dbi dbi,
>     MDB_val *key, MDB_val *data, unsigned int flags)
> {
> 	MDB_cursor mc;
> 	MDB_xcursor mx;
> 	int rc;
> 
> 	if (!key || !data || !TXN_DBI_EXIST(txn, dbi, DB_USRVALID))
> 		return EINVAL;
> 
> 	if (flags & ~(MDB_NOOVERWRITE|MDB_NODUPDATA|MDB_RESERVE|MDB_APPEND|MDB_APPENDDUP))
> 		return EINVAL;
> 
> 	if (txn->mt_flags & (MDB_TXN_RDONLY|MDB_TXN_BLOCKED))
> 		return (txn->mt_flags & MDB_TXN_RDONLY) ? EACCES : MDB_BAD_TXN;
> 
> 	mdb_cursor_init(&mc, txn, dbi, &mx); <-------- see source code below
> 	mc.mc_next = txn->mt_cursors[dbi];
> 	txn->mt_cursors[dbi] = &mc;
> 	rc = mdb_cursor_put(&mc, key, data, flags); <-------- see Frame #0
> 	txn->mt_cursors[dbi] = mc.mc_next;
> 	return rc;
> }
> 
> 
> Source Code For mdb_cursor_init() From mdb.c
> /** Initialize a cursor for a given transaction and database. */
> static void
> mdb_cursor_init(MDB_cursor *mc, MDB_txn *txn, MDB_dbi dbi, MDB_xcursor *mx)
> {
> 	mc->mc_next = NULL;
> 	mc->mc_backup = NULL;
> 	mc->mc_dbi = dbi;
> 	mc->mc_txn = txn;
> 	mc->mc_db = &txn->mt_dbs[dbi];
> 	mc->mc_dbx = &txn->mt_dbxs[dbi];
> 	mc->mc_dbflag = &txn->mt_dbflags[dbi];
> 	mc->mc_snum = 0; <-------- 
> 	mc->mc_top = 0; <--------  
> 	mc->mc_pg[0] = 0; <-------- 
> 	mc->mc_ki[0] = 0;
> 	mc->mc_flags = 0;
> 	if (txn->mt_dbs[dbi].md_flags & MDB_DUPSORT) {
> 		mdb_tassert(txn, mx != NULL);
> 		mc->mc_xcursor = mx;
> 		mdb_xcursor_init0(mc);
> 	} else {
> 		mc->mc_xcursor = NULL;
> 	}
> 	if (*mc->mc_dbflag & DB_STALE) { <-------- false *mc->mc_dbflag = 27, DB_STALE = 0x02		/**< Named-DB record is older than txnID */

27 = 0x1b, so DB_STALE is true. mdb_page_search ought to have been invoked to set the root of the cursor.
> 		mdb_page_search(mc, NULL, MDB_PS_ROOTONLY); 
> 	}
> }
> 
> From this we know that when mdb_cursor_put() is called, the following values remain assigned:
> 
>               mc->mc_snum = 0; <--------
> 	mc->mc_top = 0; <--------  
> 	mc->mc_pg[0] = 0; <-------- 
> 
> and its noted these two values still have their initial values at the time of the segmentation fault.

There is no issue with mdb_cursor_put then. The question is why didn't mdb_page_search find the DBI's root node?

-- 
  -- Howard Chu
  CTO, Symas Corp.           http://www.symas.com
  Director, Highland Sun     http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/

Comment 5 grobins@pulsesecure.net 2019-06-18 08:59:06 UTC
<repost since the previous message came as 64 encoded, apologies>



Hello Howard,

we are not able to reproduce the issue consistently so test code is not available at this time, and I do believe this issue will be present in the latest code as well since Myk also reported the same..  Attaching analysis from our side, hope it helps.



The call stack what we have seen is,



    #0 0x0011b62e in mdb_cursor_put (mc=0xffe52d18, key=0xffe52e74, data=0xffe52e6c, flags=0) at mdb.c:6688

    #1 0x0011c8ec in mdb_put (txn=0xdc13f008, dbi=2, key=0xffe52e74, data=0xffe52e6c, flags=0) at mdb.c:8771

    #2 0x00b792b8 in LMDB::LMDBContext::createSession (this=0x80ca418, sid=0x8297be4 "sidebf3708f0fd9fcb8e062529da47adf51402dfc4600000000+") at lmdbint.cc:460

    #3 0x08053564 in updateLMDB (request=..., forward=@0xffe5315f) at sessionserver.cc:1075

    #4 processRequestWithoutResponse (request=..., forward=@0xffe5315f) at sessionserver.cc:1160

    #5 0x08056a80 in processRequest (this=0x80b2ac0, fd=33) at sessionserver.cc:1189

    #6 readFromSock (this=0x80b2ac0, fd=33) at sessionserver.cc:1342

    #7 SockHandler::ioReady (this=0x80b2ac0, fd=33) at sessionserver.cc:1455

    #8 0x00ba2592 in runCoreDispatcher (default_t=<value optimized out>, flags=-1) at fds.cc:889

    #9 0x00ba2ff7 in DSEvntFds::runDispatcher () at fds.cc:945

    #10 0x080559c6 in main () at sessionserver.cc:1739



Here it is helpful to examine frames #0 and #1 in detail in relation to the relevant source code /libraries/liblmdb/mdb.c.



Frame #1



#1  0x0011c8ec in mdb_put (txn=0xdc13f008, dbi=2, key=0xffe52e74, data=0xffe52e6c, flags=0) at mdb.c:8771

        mc = {mc_next = 0x0, mc_backup = 0x0, mc_xcursor = 0x0, mc_txn = 0xdc13f008, mc_dbi = 2, mc_db = 0xdc13f088, mc_dbx = 0xe1215038, mc_dbflag = 0xdc1cf09a "\033", '\032' <repeats 51 times>, mc_snum = 0, mc_top = 0, mc_flags = 0, mc_pg = {0x0, 0x80b2ac0, 0xffe52d68, 0x28afce, 0xdc13f008, 0x80ce9b0, 0xffe52dc8, 0x4c5ff4, 0x2b, 0x80b2ac0, 0xffe52d98, 0x49209a, 0x2b, 0x4c5ff4, 0x2121cb, 0x21576c, 0x8297d34, 0x3497af, 0x21576c, 0x21316a, 0x8297d34, 0x80ac7b0, 0x1e, 0x46eed6, 0x2b, 0x0, 0x80ce9b0, 0x4c5ff4, 0x80ac7b0, 0x8297d34, 0xffe52df8, 0x46fb96}, mc_ki = {0, 2089, 51120, 2058, 30, 0, 16796, 17, 11760, 65509, 3, 0, 32040, 2089, 30, 0, 0, 0, 0, 0, 3, 0, 24564, 76, 51120, 2058, 10944, 2059, 11800, 65509, 64758, 70}} <--------



        mx = {mx_cursor = {mc_next = 0x38, mc_backup = 0x3, mc_xcursor = 0x0, mc_txn = 0x0, mc_dbi = 0, mc_db = 0x28, mc_dbx = 0x0, mc_dbflag = 0x0, mc_snum = 3, mc_top = 0, mc_flags = 17, mc_pg = {0x3a93d8, 0x10, 0x0, 0x18, 0x3a93a0, 0x0, 0x3a93d0, 0x289abe, 0x3a93a0, 0xffe52d3f, 0xffe52c68, 0x28afce, 0xffe52dbc, 0xffe52db4, 0x3, 0x4c5ff4, 0x11, 0x36c99b, 0x6e, 0x77, 0x7c, 0x5b, 0x38, 0x289abe, 0x0, 0x0, 0x0, 0x28, 0x0, 0x0, 0x3, 0x10}, mc_ki = {37848, 58, 51611, 54, 110, 0, 119, 0, 124, 0, 91, 0, 58, 0, 39614, 40, 0, 0, 0, 0, 0, 0, 160, 0, 0, 0, 2, 0, 18, 0, 136, 0}}, mx_db = {md_pad = 3838936, md_flags = 51611, md_depth = 54, md_branch_pages = 110, md_leaf_pages = 119, md_overflow_pages = 124, md_entries = 91, md_root = 56}, mx_dbx = {md_name = {mv_size = 6, mv_data = 0x0}, md_cmp = 0, md_dcmp = 0, md_rel = 0x40, md_relctx = 0x0}, mx_dbflag = 0 '\000'}

        rc = 0



And source code for mdb_put() is from mdb.c



int

mdb_put(MDB_txn *txn, MDB_dbi dbi,

    MDB_val *key, MDB_val *data, unsigned int flags) {

                MDB_cursor mc;

                MDB_xcursor mx;

                int rc;



                if (!key || !data || !TXN_DBI_EXIST(txn, dbi, DB_USRVALID))

                                return EINVAL;



                if (flags & ~(MDB_NOOVERWRITE|MDB_NODUPDATA|MDB_RESERVE|MDB_APPEND|MDB_APPENDDUP))

                                return EINVAL;



                if (txn->mt_flags & (MDB_TXN_RDONLY|MDB_TXN_BLOCKED))

                                return (txn->mt_flags & MDB_TXN_RDONLY) ? EACCES : MDB_BAD_TXN;



                mdb_cursor_init(&mc, txn, dbi, &mx); <-------- see source code below

                mc.mc_next = txn->mt_cursors[dbi];

                txn->mt_cursors[dbi] = &mc;

                rc = mdb_cursor_put(&mc, key, data, flags); <-------- see Frame #0

                txn->mt_cursors[dbi] = mc.mc_next;

                return rc;

}





Source Code For mdb_cursor_init() From mdb.c

/** Initialize a cursor for a given transaction and database. */ static void mdb_cursor_init(MDB_cursor *mc, MDB_txn *txn, MDB_dbi dbi, MDB_xcursor *mx) {

                mc->mc_next = NULL;

                mc->mc_backup = NULL;

                mc->mc_dbi = dbi;

                mc->mc_txn = txn;

                mc->mc_db = &txn->mt_dbs[dbi];

                mc->mc_dbx = &txn->mt_dbxs[dbi];

                mc->mc_dbflag = &txn->mt_dbflags[dbi];

                mc->mc_snum = 0; <--------

                mc->mc_top = 0; <--------

                mc->mc_pg[0] = 0; <--------

                mc->mc_ki[0] = 0;

                mc->mc_flags = 0;

                if (txn->mt_dbs[dbi].md_flags & MDB_DUPSORT) {

                                mdb_tassert(txn, mx != NULL);

                                mc->mc_xcursor = mx;

                                mdb_xcursor_init0(mc);

                } else {

                                mc->mc_xcursor = NULL;

                }

                if (*mc->mc_dbflag & DB_STALE) { <-------- false *mc->mc_dbflag = 27, DB_STALE = 0x02                  /**< Named-DB record is older than txnID */

                                mdb_page_search(mc, NULL, MDB_PS_ROOTONLY);

                }

}



From this we know that when mdb_cursor_put() is called, the following values remain assigned:



              mc->mc_snum = 0; <--------

                mc->mc_top = 0; <--------

                mc->mc_pg[0] = 0; <--------



and its noted these two values still have their initial values at the time of the segmentation fault.



(gdb) fr 0

#0 0x0011b62e in mdb_cursor_put (mc=0xffe52d18, key=0xffe52e74, data=0xffe52e6c, flags=0) at mdb.c:6688

6688 in mdb.c

(gdb) p mc->mc_top

$1 = 0

(gdb) p mc->mc_pg[mc->mc_top]

$19 = (MDB_page *) 0x0



Let us consider how the path taken through mdb_cursor_put() could account for these null variable values which result in a segmentation fault.



Frame #0



#0  0x0011b62e in mdb_cursor_put (mc=0xffe52d18, key=0xffe52e74, data=0xffe52e6c, flags=0) at mdb.c:6688 <-------- flags = 0

        env = 0x80ce9b0

        leaf = <value optimized out>

        fp = <value optimized out>

        mp = 0x38

        sub_root = 0x0

        fp_flags = <value optimized out>

        xdata = {mv_size = 3838880, mv_data = 0xdc1cf09a}

        rdata = 0xffe52e6c

        dkey = {mv_size = 0, mv_data = 0x3a93cc}

        olddata = {mv_size = 22, mv_data = 0x36c820}

        dummy = {md_pad = 22, md_flags = 11128, md_depth = 65509, md_branch_pages = 16, md_leaf_pages = 1507329, md_overflow_pages = 0, md_entries = 3838928, md_root = 3833844}

        do_sub = 0

        insert_key = -30798 <-------- #define MDB_NOTFOUND          (-30798)

        insert_data = -30798 <-------- #define MDB_NOTFOUND        (-30798)

        mcount = 0

        dcount = 0

        nospill = 0 <--------

        nsize = <value optimized out>

        rc = 0

        rc2 = <value optimized out>

        nflags = 0 <--------

        __func__ = "mdb_cursor_put"



We know that mc->mc_top and mc->mc_pg[0] were initialized to 0 in mdb_cursor_init(). As my annotations of the source code below suggest, careful analysis reveals that the paths actually taken through mdb_cursor_put() do not modify these assignments.



Excerpted Source Code For mdb_cursor_put() From mdb.c



int

mdb_cursor_put(MDB_cursor *mc, MDB_val *key, MDB_val *data,

    unsigned int flags)

{

                MDB_env                            *env;

                MDB_node         *leaf = NULL;

                MDB_page         *fp, *mp, *sub_root = NULL;

                uint16_t               fp_flags;

                MDB_val                              xdata, *rdata, dkey, olddata;

                MDB_db dummy;

                int do_sub = 0, insert_key, insert_data;

                unsigned int mcount = 0, dcount = 0, nospill;

                size_t nsize;

                int rc, rc2;

                unsigned int nflags;

                DKBUF;



                if (mc == NULL || key == NULL)

                                return EINVAL;



                env = mc->mc_txn->mt_env;



                /* Check this first so counter will always be zero on any

                * early failures.

                */

                if (flags & MDB_MULTIPLE) {

                                dcount = data[1].mv_size;

                                data[1].mv_size = 0;

                                if (!F_ISSET(mc->mc_db->md_flags, MDB_DUPFIXED))

                                                return MDB_INCOMPATIBLE;

                }



                nospill = flags & MDB_NOSPILL;

                flags &= ~MDB_NOSPILL;



                if (mc->mc_txn->mt_flags & (MDB_TXN_RDONLY|MDB_TXN_BLOCKED))

                                return (mc->mc_txn->mt_flags & MDB_TXN_RDONLY) ? EACCES : MDB_BAD_TXN;



                if (key->mv_size-1 >= ENV_MAXKEY(env))

                                return MDB_BAD_VALSIZE;



#if SIZE_MAX > MAXDATASIZE

                if (data->mv_size > ((mc->mc_db->md_flags & MDB_DUPSORT) ? ENV_MAXKEY(env) : MAXDATASIZE))

                                return MDB_BAD_VALSIZE;

#else

                if ((mc->mc_db->md_flags & MDB_DUPSORT) && data->mv_size > ENV_MAXKEY(env))

                                return MDB_BAD_VALSIZE;

#endif



                DPRINTF(("==> put db %d key [%s], size %"Z"u, data size %"Z"u",

                                DDBI(mc), DKEY(key), key ? key->mv_size : 0, data->mv_size));



                dkey.mv_size = 0;



                if (flags == MDB_CURRENT) { <-------- false flags = 0

                                if (!(mc->mc_flags & C_INITIALIZED))

                                                return EINVAL;

                                rc = MDB_SUCCESS;

                } else if (mc->mc_db->md_root == P_INVALID) { <-------- false, mc->mc_db->md_root = 68

                                /* new database, cursor has nothing to point to */

                                mc->mc_snum = 0;

                                mc->mc_top = 0;

                                mc->mc_flags &= ~C_INITIALIZED;

                                rc = MDB_NO_ROOT;

                } else { <-------- executed

                                int exact = 0;

                                MDB_val d2;

                                if (flags & MDB_APPEND) { <-------- false flags = 0

                                                MDB_val k2;

                                                rc = mdb_cursor_last(mc, &k2, &d2);

                                                if (rc == 0) {

                                                                rc = mc->mc_dbx->md_cmp(key, &k2);

                                                                if (rc > 0) {

                                                                                rc = MDB_NOTFOUND;

                                                                                mc->mc_ki[mc->mc_top]++;

                                                                } else {

                                                                                /* new key is <= last key */

                                                                                rc = MDB_KEYEXIST;

                                                                }

                                                }

                                } else {<-------- executed

                                                rc = mdb_cursor_set(mc, key, &d2, MDB_SET, &exact); <-------- returns -30798 which is MDB_NOTFOUND

                                }<-------- executed

                                if ((flags & MDB_NOOVERWRITE) && rc == 0) { <-------- false flags = 0

                                                DPRINTF(("duplicate key [%s]", DKEY(key)));

                                                *data = d2;

                                                return MDB_KEYEXIST;

                                }

                                if (rc && rc != MDB_NOTFOUND) <-------- false rc = MDB_NOTFOUND

                                                return rc;

                }<-------- executed



                if (mc->mc_flags & C_DEL)

                                mc->mc_flags ^= C_DEL;



                /* Cursor is positioned, check for room in the dirty list */

                if (!nospill) { <-------- true

                                if (flags & MDB_MULTIPLE) {

                                                rdata = &xdata;

                                                xdata.mv_size = data->mv_size * dcount;

                                } else {

                                                rdata = data; <-------- executed

                                }

                                if ((rc2 = mdb_page_spill(mc, key, rdata))) <-------- returned 0 or would have returned

                                                return rc2;

                }<-------- executed



                if (rc == MDB_NO_ROOT) { <-------- false

                                MDB_page *np;

                                /* new database, write a root leaf page */

                                DPUTS("allocating new root leaf page");

                                if ((rc2 = mdb_page_new(mc, P_LEAF, 1, &np))) {

                                                return rc2;

                                }

                                mdb_cursor_push(mc, np);

                                mc->mc_db->md_root = np->mp_pgno;

                                mc->mc_db->md_depth++;

                                *mc->mc_dbflag |= DB_DIRTY;

                                if ((mc->mc_db->md_flags & (MDB_DUPSORT|MDB_DUPFIXED))

                                                == MDB_DUPFIXED)

                                                np->mp_flags |= P_LEAF2;

                                mc->mc_flags |= C_INITIALIZED;

                } else {

                                /* make sure all cursor pages are writable */

                                rc2 = mdb_cursor_touch(mc); <-------- is this called and returns 0 because it does not return

                                if (rc2)

                                                return rc2;

                } <-------- executed



                insert_key = insert_data = rc; <-------- rc = -30798 or MDB_NOTFOUND!

                if (insert_key) { <-------- true

                                /* The key does not exist */

                                DPRINTF(("inserting key at index %i", mc->mc_ki[mc->mc_top]));

                                if ((mc->mc_db->md_flags & MDB_DUPSORT) &&

                                                LEAFSIZE(key, data) > env->me_nodemax) <-------- false since mc->mc_db->md_flags = 0

                                {

                                                /* Too big for a node, insert in sub-DB.  Set up an empty

                                                * "old sub-page" for prep_subDB to expand to a full page.

                                                */

                                                fp_flags = P_LEAF|P_DIRTY;

                                                fp = env->me_pbuf;

                                                fp->mp_pad = data->mv_size; /* used if MDB_DUPFIXED */

                                                fp->mp_lower = fp->mp_upper = (PAGEHDRSZ-PAGEBASE);

                                                olddata.mv_size = PAGEHDRSZ;

                                                goto prep_subDB;

                                } <-------- this was not executed, and the very long external else contingency below should be skipped!

                } else {

                                /* there's only a key anyway, so this is a no-op */

                                if (IS_LEAF2(mc->mc_pg[mc->mc_top])) {

                                                char *ptr;

                                                unsigned int ksize = mc->mc_db->md_pad;

                                                if (key->mv_size != ksize)

                                                                return MDB_BAD_VALSIZE;

                                                ptr = LEAF2KEY(mc->mc_pg[mc->mc_top], mc->mc_ki[mc->mc_top], ksize);

                                                memcpy(ptr, key->mv_data, ksize);

fix_parent:

                                                /* if overwriting slot 0 of leaf, need to

                                                * update branch key if there is a parent page

                                                */

                                                if (mc->mc_top && !mc->mc_ki[mc->mc_top]) {

                                                                unsigned short dtop = 1;

                                                                mc->mc_top--;

                                                                /* slot 0 is always an empty key, find real slot */

                                                                while (mc->mc_top && !mc->mc_ki[mc->mc_top]) {

                                                                                mc->mc_top--;

                                                                                dtop++;

                                                                }

                                                                if (mc->mc_ki[mc->mc_top])

                                                                                rc2 = mdb_update_key(mc, key);

                                                                else

                                                                                rc2 = MDB_SUCCESS;

                                                                mc->mc_top += dtop;

                                                                if (rc2)

                                                                                return rc2;

                                                }

                                                return MDB_SUCCESS;

                                }



more:

                                leaf = NODEPTR(mc->mc_pg[mc->mc_top], mc->mc_ki[mc->mc_top]);

                                olddata.mv_size = NODEDSZ(leaf);

                                olddata.mv_data = NODEDATA(leaf);



                                /* DB has dups? */

                                if (F_ISSET(mc->mc_db->md_flags, MDB_DUPSORT)) {

                                                /* Prepare (sub-)page/sub-DB to accept the new item,

                                                * if needed.  fp: old sub-page or a header faking

                                                * it.  mp: new (sub-)page.  offset: growth in page

                                                * size.  xdata: node data with new page or DB.

                                                */

                                                unsigned             i, offset = 0;

                                                mp = fp = xdata.mv_data = env->me_pbuf;

                                                mp->mp_pgno = mc->mc_pg[mc->mc_top]->mp_pgno;



                                                /* Was a single item before, must convert now */

                                                if (!F_ISSET(leaf->mn_flags, F_DUPDATA)) {

                                                                MDB_cmp_func *dcmp;

                                                                /* Just overwrite the current item */

                                                                if (flags == MDB_CURRENT)

                                                                                goto current;

                                                                dcmp = mc->mc_dbx->md_dcmp;

#if UINT_MAX < SIZE_MAX

                                                                if (dcmp == mdb_cmp_int && olddata.mv_size == sizeof(size_t))

                                                                                dcmp = mdb_cmp_clong;

#endif

                                                                /* does data match? */

                                                                if (!dcmp(data, &olddata)) {

                                                                                if (flags & (MDB_NODUPDATA|MDB_APPENDDUP))

                                                                                                return MDB_KEYEXIST;

                                                                                /* overwrite it */

                                                                                goto current;

                                                                }



                                                                /* Back up original data item */

                                                                dkey.mv_size = olddata.mv_size;

                                                                dkey.mv_data = memcpy(fp+1, olddata.mv_data, olddata.mv_size);



                                                                /* Make sub-page header for the dup items, with dummy body */

                                                                fp->mp_flags = P_LEAF|P_DIRTY|P_SUBP;

                                                                fp->mp_lower = (PAGEHDRSZ-PAGEBASE);

                                                                xdata.mv_size = PAGEHDRSZ + dkey.mv_size + data->mv_size;

                                                                if (mc->mc_db->md_flags & MDB_DUPFIXED) {

                                                                                fp->mp_flags |= P_LEAF2;

                                                                                fp->mp_pad = data->mv_size;

                                                                                xdata.mv_size += 2 * data->mv_size;      /* leave space for 2 more */

                                                                } else {

                                                                                xdata.mv_size += 2 * (sizeof(indx_t) + NODESIZE) +

                                                                                                (dkey.mv_size & 1) + (data->mv_size & 1);

                                                                }

                                                                fp->mp_upper = xdata.mv_size - PAGEBASE;

                                                                olddata.mv_size = xdata.mv_size; /* pretend olddata is fp */

                                                } else if (leaf->mn_flags & F_SUBDATA) {

                                                                /* Data is on sub-DB, just store it */

                                                                flags |= F_DUPDATA|F_SUBDATA;

                                                                goto put_sub;

                                                } else {

                                                                /* Data is on sub-page */

                                                                fp = olddata.mv_data;

                                                                switch (flags) {

                                                                default:

                                                                                if (!(mc->mc_db->md_flags & MDB_DUPFIXED)) {

                                                                                                offset = EVEN(NODESIZE + sizeof(indx_t) +

                                                                                                                data->mv_size);

                                                                                                break;

                                                                                }

                                                                                offset = fp->mp_pad;

                                                                                if (SIZELEFT(fp) < offset) {

                                                                                                offset *= 4; /* space for 4 more */

                                                                                                break;

                                                                                }

                                                                                /* FALLTHRU: Big enough MDB_DUPFIXED sub-page */

                                                                case MDB_CURRENT:

                                                                                fp->mp_flags |= P_DIRTY;

                                                                                COPY_PGNO(fp->mp_pgno, mp->mp_pgno);

                                                                                mc->mc_xcursor->mx_cursor.mc_pg[0] = fp;

                                                                                flags |= F_DUPDATA;

                                                                                goto put_sub;

                                                                }

                                                                xdata.mv_size = olddata.mv_size + offset;

                                                }



                                                fp_flags = fp->mp_flags;

                                                if (NODESIZE + NODEKSZ(leaf) + xdata.mv_size > env->me_nodemax) {

                                                                                /* Too big for a sub-page, convert to sub-DB */

                                                                                fp_flags &= ~P_SUBP;

prep_subDB:

                                                                                if (mc->mc_db->md_flags & MDB_DUPFIXED) {

                                                                                                fp_flags |= P_LEAF2;

                                                                                                dummy.md_pad = fp->mp_pad;

                                                                                                dummy.md_flags = MDB_DUPFIXED;

                                                                                                if (mc->mc_db->md_flags & MDB_INTEGERDUP)

                                                                                                                dummy.md_flags |= MDB_INTEGERKEY;

                                                                                } else {

                                                                                                dummy.md_pad = 0;

                                                                                                dummy.md_flags = 0;

                                                                                }

                                                                                dummy.md_depth = 1;

                                                                                dummy.md_branch_pages = 0;

                                                                                dummy.md_leaf_pages = 1;

                                                                                dummy.md_overflow_pages = 0;

                                                                                dummy.md_entries = NUMKEYS(fp);

                                                                                xdata.mv_size = sizeof(MDB_db);

                                                                                xdata.mv_data = &dummy;

                                                                                if ((rc = mdb_page_alloc(mc, 1, &mp)))

                                                                                                return rc;

                                                                                offset = env->me_psize - olddata.mv_size;

                                                                                flags |= F_DUPDATA|F_SUBDATA;

                                                                                dummy.md_root = mp->mp_pgno;

                                                                                sub_root = mp;

                                                }

                                                if (mp != fp) {

                                                                mp->mp_flags = fp_flags | P_DIRTY;

                                                                mp->mp_pad   = fp->mp_pad;

                                                                mp->mp_lower = fp->mp_lower;

                                                                mp->mp_upper = fp->mp_upper + offset;

                                                                if (fp_flags & P_LEAF2) {

                                                                                memcpy(METADATA(mp), METADATA(fp), NUMKEYS(fp) * fp->mp_pad);

                                                                } else {

                                                                                memcpy((char *)mp + mp->mp_upper + PAGEBASE, (char *)fp + fp->mp_upper + PAGEBASE,

                                                                                                olddata.mv_size - fp->mp_upper - PAGEBASE);

                                                                                for (i=0; i<NUMKEYS(fp); i++)

                                                                                                mp->mp_ptrs[i] = fp->mp_ptrs[i] + offset;

                                                                }

                                                }



                                                rdata = &xdata;

                                                flags |= F_DUPDATA;

                                                do_sub = 1;

                                                if (!insert_key)

                                                                mdb_node_del(mc, 0);

                                                goto new_sub;

                                }

current:

                                /* LMDB passes F_SUBDATA in 'flags' to write a DB record */

                                if ((leaf->mn_flags ^ flags) & F_SUBDATA)

                                                return MDB_INCOMPATIBLE;

                                /* overflow page overwrites need special handling */

                                if (F_ISSET(leaf->mn_flags, F_BIGDATA)) { <-------- is this taken?

                                                MDB_page *omp;

                                                pgno_t pg;

                                                int level, ovpages, dpages = OVPAGES(data->mv_size, env->me_psize);



                                                memcpy(&pg, olddata.mv_data, sizeof(pg));

                                                if ((rc2 = mdb_page_get(mc->mc_txn, pg, &omp, &level)) != 0)

                                                                return rc2;

                                                ovpages = omp->mp_pages;



                                                /* Is the ov page large enough? */

                                                if (ovpages >= dpages) {

                                                  if (!(omp->mp_flags & P_DIRTY) &&

                                                                  (level || (env->me_flags & MDB_WRITEMAP)))

                                                  {

                                                                rc = mdb_page_unspill(mc->mc_txn, omp, &omp); <-------- is this taken?

                                                                if (rc)

                                                                                return rc;

                                                                level = 0;                              /* dirty in this txn or clean */

                                                  }

                                                  /* Is it dirty? */

                                                  if (omp->mp_flags & P_DIRTY) {

                                                                /* yes, overwrite it. Note in this case we don't

                                                                * bother to try shrinking the page if the new data

                                                                * is smaller than the overflow threshold.

                                                                */

                                                                if (level > 1) {

                                                                                /* It is writable only in a parent txn */

                                                                                size_t sz = (size_t) env->me_psize * ovpages, off;

                                                                                MDB_page *np = mdb_page_malloc(mc->mc_txn, ovpages);

                                                                                MDB_ID2 id2;

                                                                                if (!np)

                                                                                                return ENOMEM;

                                                                                id2.mid = pg;

                                                                                id2.mptr = np;

                                                                                /* Note - this page is already counted in parent's dirty_room */

                                                                                rc2 = mdb_mid2l_insert(mc->mc_txn->mt_u.dirty_list, &id2);

                                                                                mdb_cassert(mc, rc2 == 0);

                                                                                if (!(flags & MDB_RESERVE)) {

                                                                                                /* Copy end of page, adjusting alignment so

                                                                                                * compiler may copy words instead of bytes.

                                                                                                */

                                                                                                off = (PAGEHDRSZ + data->mv_size) & -sizeof(size_t);

                                                                                                memcpy((size_t *)((char *)np + off),

                                                                                                                (size_t *)((char *)omp + off), sz - off);

                                                                                                sz = PAGEHDRSZ;

                                                                                }

                                                                                memcpy(np, omp, sz); /* Copy beginning of page */

                                                                                omp = np;

                                                                }

                                                                SETDSZ(leaf, data->mv_size);

                                                                if (F_ISSET(flags, MDB_RESERVE))

                                                                                data->mv_data = METADATA(omp);

                                                                else

                                                                                memcpy(METADATA(omp), data->mv_data, data->mv_size);

                                                                return MDB_SUCCESS;

                                                  }

                                                }

                                                if ((rc2 = mdb_ovpage_free(mc, omp)) != MDB_SUCCESS)

                                                                return rc2;

                                } else if (data->mv_size == olddata.mv_size) {

                                                /* same size, just replace it. Note that we could

                                                * also reuse this node if the new data is smaller,

                                                * but instead we opt to shrink the node in that case.

                                                */

                                                if (F_ISSET(flags, MDB_RESERVE))

                                                                data->mv_data = olddata.mv_data;

                                                else if (!(mc->mc_flags & C_SUB))

                                                                memcpy(olddata.mv_data, data->mv_data, data->mv_size);

                                                else {

                                                                memcpy(NODEKEY(leaf), key->mv_data, key->mv_size);

                                                                goto fix_parent;

                                                }

                                                return MDB_SUCCESS;

                                }

                                mdb_node_del(mc, 0);

                } <-------- This entire else contingency was likely skipped!!



                rdata = data;



new_sub:

                nflags = flags & NODE_ADD_FLAGS;

                nsize = IS_LEAF2(mc->mc_pg[mc->mc_top]) ? key->mv_size : mdb_leaf_size(env, key, rdata); <-------- SEGV

...



This may be an identifiable bug in mdb_cursor_put() as It does not safely handle the case when the noted call to mdb_cursor_set() returns MDB_NOTFOUND (-30798)!



It explicitly fails to preemptively exit by returning an error in this case.



The value of -30798 is in turn assigned to insert_key, which is therefore non-null in the condition of the subsequent if statement.



                insert_key = insert_data = rc; <-------- insert_key = -30798 or MDB_NOTFOUND!

                if (insert_key) { <-------- true

                                /* The key does not exist */

                                DPRINTF(("inserting key at index %i", mc->mc_ki[mc->mc_top]));

                                if ((mc->mc_db->md_flags & MDB_DUPSORT) &&

                                                LEAFSIZE(key, data) > env->me_nodemax) <-------- false since mc->mc_db->md_flags = 0

                                {

                                                /* Too big for a node, insert in sub-DB.  Set up an empty

                                                * "old sub-page" for prep_subDB to expand to a full page.

                                                */

                                                fp_flags = P_LEAF|P_DIRTY;

                                                fp = env->me_pbuf;

                                                fp->mp_pad = data->mv_size; /* used if MDB_DUPFIXED */

                                                fp->mp_lower = fp->mp_upper = (PAGEHDRSZ-PAGEBASE);

                                                olddata.mv_size = PAGEHDRSZ;

                                                goto prep_subDB;

                                } <-------- this was not executed, and the very long else contingency below is skipped!

                } else {

Since mc->mc_db->md_flags = 0, the contingency in the embedded if statement fails.



The segmentation fault occurs due to the call of IS_LEAF2 on the null pointer mc->mc_pg[0].



                rdata = data;



new_sub:

                nflags = flags & NODE_ADD_FLAGS;

                nsize = IS_LEAF2(mc->mc_pg[mc->mc_top]) ? key->mv_size : mdb_leaf_size(env, key, rdata); <-------- SEGV


Regards,
Robins.
Comment 6 Myk Melez 2019-06-19 16:39:01 UTC
hyc@symas.com wrote on 2019-06-18 01:30:
> There is no issue with mdb_cursor_put then. The question is why didn't mdb_page_search find the DBI's root node?
Would it be useful for mdb_cursor_init to return the result code of 
mdb_page_search, and for mdb_put to then return that code if it isn't 
MDB_SUCCESS?

That wouldn't resolve the problem, but it might avoid the crash and 
provide some insight into the failure (if mdb_page_search is actually 
returning a failure code when it doesn't find the DBI's root node).

-myk


Comment 7 Howard Chu 2019-06-19 16:55:02 UTC
myk@mykzilla.org wrote:
> hyc@symas.com wrote on 2019-06-18 01:30:
>> There is no issue with mdb_cursor_put then. The question is why didn't mdb_page_search find the DBI's root node?
> Would it be useful for mdb_cursor_init to return the result code of 
> mdb_page_search, and for mdb_put to then return that code if it isn't 
> MDB_SUCCESS?

Maybe. For testing purposes you could try something like this
diff --git a/libraries/liblmdb/mdb.c b/libraries/liblmdb/mdb.c
index 692feaa38b..e41f3bc36a 100644
--- a/libraries/liblmdb/mdb.c
+++ b/libraries/liblmdb/mdb.c
@@ -7620,7 +7620,8 @@ mdb_cursor_init(MDB_cursor *mc, MDB_txn *txn, MDB_dbi dbi, MDB_xcursor *mx)
                mc->mc_xcursor = NULL;
        }
        if (*mc->mc_dbflag & DB_STALE) {
-               mdb_page_search(mc, NULL, MDB_PS_ROOTONLY);
+               int rc = mdb_page_search(mc, NULL, MDB_PS_ROOTONLY);
+               mdb_cassert(mc, !rc);
        }
 }


I suspect the problem will simply be that you've used an invalid DBI.
> 
> That wouldn't resolve the problem, but it might avoid the crash and 
> provide some insight into the failure (if mdb_page_search is actually 
> returning a failure code when it doesn't find the DBI's root node).
> 
> -myk
> 
> 
> 
> 
> 


-- 
  -- Howard Chu
  CTO, Symas Corp.           http://www.symas.com
  Director, Highland Sun     http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/

Comment 8 Myk Melez 2019-06-21 06:35:08 UTC
hyc@symas.com wrote on 2019-06-19 09:55:
> Maybe. For testing purposes you could try something like this
> diff --git a/libraries/liblmdb/mdb.c b/libraries/liblmdb/mdb.c
> index 692feaa38b..e41f3bc36a 100644
> --- a/libraries/liblmdb/mdb.c
> +++ b/libraries/liblmdb/mdb.c
> @@ -7620,7 +7620,8 @@ mdb_cursor_init(MDB_cursor *mc, MDB_txn *txn, MDB_dbi dbi, MDB_xcursor *mx)
>                  mc->mc_xcursor = NULL;
>          }
>          if (*mc->mc_dbflag & DB_STALE) {
> -               mdb_page_search(mc, NULL, MDB_PS_ROOTONLY);
> +               int rc = mdb_page_search(mc, NULL, MDB_PS_ROOTONLY);
> +               mdb_cassert(mc, !rc);
>          }
>   }
I'm unsure if it's related, but with this change to mdb.c, the following 
test program (which resembles Firefox's usage) asserts in 
mdb_cursor_init because rc is MDB_NOTFOUND.

#include <stdio.h>
#include <stdlib.h>
#include "lmdb.h"

#define E(expr) CHECK((rc = (expr)) == MDB_SUCCESS, #expr)
#define CHECK(test, msg) ((test) ? (void)0 : ((void)fprintf(stderr, \
     "%s:%d: %s: %s\n", __FILE__, __LINE__, msg, mdb_strerror(rc)), 
abort()))

int main(int argc,char * argv[])
{
     int rc;
     MDB_env *env;
     MDB_dbi dbi;
     MDB_val key, data;
     MDB_txn *txn;
     char sval[] = "foo";
     char dval[] = "bar";

     E(mdb_env_create(&env));
     E(mdb_env_set_maxdbs(env, 2));
     E(mdb_env_open(env, "./mytestdb", 0, 0664));

     E(mdb_txn_begin(env, NULL, 0, &txn));
     E(mdb_dbi_open(txn, "subdb", MDB_CREATE, &dbi));
     E(mdb_txn_commit(txn));

     key.mv_size = 3;
     key.mv_data = sval;
     data.mv_size = 3;
     data.mv_data = dval;

     E(mdb_txn_begin(env, NULL, 0, &txn));
     E(mdb_put(txn, dbi, &key, &data, 0));
     E(mdb_txn_commit(txn));

     mdb_dbi_close(env, dbi);
     mdb_env_close(env);

     return 0;
}

Without the change to mdb.c, however, the program appears to work correctly.

-myk


Comment 9 Myk Melez 2019-06-21 06:38:56 UTC
myk@mykzilla.org wrote on 2019-06-20 23:35:
> I'm unsure if it's related, but with this change to mdb.c, the following
> test program (which resembles Firefox's usage) asserts in
> mdb_cursor_init because rc is MDB_NOTFOUND.
Erm, here's the program without the errant characters:

#include <stdio.h>
#include <stdlib.h>
#include "lmdb.h"

#define E(expr) CHECK((rc = (expr)) == MDB_SUCCESS, #expr)
#define CHECK(test, msg) ((test) ? (void)0 : ((void)fprintf(stderr, \
     "%s:%d: %s: %s\n", __FILE__, __LINE__, msg, mdb_strerror(rc)),
abort()))

int main(int argc,char * argv[])
{
     int rc;
     MDB_env *env;
     MDB_dbi dbi;
     MDB_val key, data;
     MDB_txn *txn;
     char sval[] = "foo";
     char dval[] = "bar";

     E(mdb_env_create(&env));
     E(mdb_env_set_maxdbs(env, 2));
     E(mdb_env_open(env, "./mytestdb", 0, 0664));

     E(mdb_txn_begin(env, NULL, 0, &txn));
     E(mdb_dbi_open(txn, "subdb", MDB_CREATE, &dbi));
     E(mdb_txn_commit(txn));

     key.mv_size = 3;
     key.mv_data = sval;
     data.mv_size = 3;
     data.mv_data = dval;

     E(mdb_txn_begin(env, NULL, 0, &txn));
     E(mdb_put(txn, dbi, &key, &data, 0));
     E(mdb_txn_commit(txn));

     mdb_dbi_close(env, dbi);
     mdb_env_close(env);

     return 0;
}


Comment 10 Myk Melez 2019-06-21 06:45:10 UTC
myk@mykzilla.org wrote on 2019-06-20 23:39:
> myk@mykzilla.org wrote on 2019-06-20 23:35:
>> I'm unsure if it's related, but with this change to mdb.c, the following
>> test program (which resembles Firefox's usage) asserts in
>> mdb_cursor_init because rc is MDB_NOTFOUND.
> Erm, here's the program without the errant characters:
My apologies, I'm unsure why the characters appeared the second time as 
well. (I tested first by manually removing them, sending the message to 
myself, and then confirming they didn't appear.) I've saved the program 
to this GitHub gist:

https://gist.github.com/mykmelez/ac75a7ab0a0d09a7d709be63806d01d7

-myk


Comment 11 Myk Melez 2019-06-21 21:31:37 UTC
After stepping through the test program in a debugger, I see that 
mdb_page_search returns MDB_NOTFOUND if mc->mc_db->md_root is P_INVALID 
because the tree is empty. And mdb_cursor_put sets rc to MDB_NO_ROOT if 
mc->mc_db->md_root is P_INVALID.

So that can't be related to the crash, at least not in the instances I'm 
investigating, because insert_key and insert_data (and hence rc) is 
always MDB_NOTFOUND in their minidumps when the crash occurs, which 
can't happen when rc is set to MDB_NO_ROOT.

-myk


Comment 12 Victor Porof 2019-07-31 10:30:57 UTC
Hey folks.

From Myk’s investigations in the previous followup, it seems that the suggested changes to `mdb_cursor_init` to avoid using an invalid DBI might not be solving the actual issue, given the behaviour of `mdb_page_search`.

It’s also causing the seemingly correct test program to assert when it wasn’t before. It’s unclear whether this should be the case or not, perhaps Howard can confirm the expected behaviour.

In any case, we’re wondering if there’s been any other progress, or if someone managed to reproduce this issue? Shipping new features built on top of LMDB in Firefox is currently blocked due to these failures, so any additional info would be helpful.

Victor


Comment 13 Howard Chu 2019-08-01 17:47:03 UTC
vporof@mozilla.com wrote:
> Hey folks.
> 
> =46rom Myk=E2=80=99s investigations in the previous followup, it seems =
> that the suggested changes to `mdb_cursor_init` to avoid using an =
> invalid DBI might not be solving the actual issue, given the behaviour =
> of `mdb_page_search`.

Agreed, that assert that I suggested isn't catching what we want.

> It=E2=80=99s also causing the seemingly correct test program to assert =
> when it wasn=E2=80=99t before. It=E2=80=99s unclear whether this should =
> be the case or not, perhaps Howard can confirm the expected behaviour.
> 
> In any case, we=E2=80=99re wondering if there=E2=80=99s been any other =
> progress, or if someone managed to reproduce this issue? Shipping new =
> features built on top of LMDB in Firefox is currently blocked due to =
> these failures, so any additional info would be helpful.

Sorry, still not seeing this over here. What else do you know about the
systems where this is occurring? RAM size, storage on HDD / SSD / USB ?

-- 
  -- Howard Chu
  CTO, Symas Corp.           http://www.symas.com
  Director, Highland Sun     http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/

Comment 14 Victor Porof 2019-08-06 11:43:44 UTC

> On 1. Aug 2019, at 19:47, Howard Chu <hyc@symas.com> wrote:
> 
> vporof@mozilla.com wrote:
>> Hey folks.
>> 
>> =46rom Myk=E2=80=99s investigations in the previous followup, it seems =
>> that the suggested changes to `mdb_cursor_init` to avoid using an =
>> invalid DBI might not be solving the actual issue, given the behaviour =
>> of `mdb_page_search`.
> 
> Agreed, that assert that I suggested isn't catching what we want.

FWIW, here's my findings when attempting to look into what was happening with regards to that test case: https://gist.github.com/victorporof/d1f98b8a52f79c7ff99f361d3a2adc3e

It's unclear how much overlap there is with the previous findings, and whether or not calling `mdb_put` should assert with the DBI previously opened via `mdb_dbi_open` with MDB_CREATE. Let me know if what I'm observing here is expected.

> 
>> It=E2=80=99s also causing the seemingly correct test program to assert =
>> when it wasn=E2=80=99t before. It=E2=80=99s unclear whether this should =
>> be the case or not, perhaps Howard can confirm the expected behaviour.
>> 
>> In any case, we=E2=80=99re wondering if there=E2=80=99s been any other =
>> progress, or if someone managed to reproduce this issue? Shipping new =
>> features built on top of LMDB in Firefox is currently blocked due to =
>> these failures, so any additional info would be helpful.
> 
> Sorry, still not seeing this over here. What else do you know about the
> systems where this is occurring? RAM size, storage on HDD / SSD / USB ?

Here's everything we know about: https://crash-stats.mozilla.org/report/index/5d77bd19-41ce-459f-9c1c-7f9fb0190324 See the "details" and "telemetry environment" sections for a breakdown.

Victor

> 
> -- 
>  -- Howard Chu
>  CTO, Symas Corp.           http://www.symas.com
>  Director, Highland Sun     http://highlandsun.com/hyc/
>  Chief Architect, OpenLDAP  http://www.openldap.org/project/


Comment 15 Victor Porof 2019-08-13 15:42:56 UTC
> On 6. Aug 2019, at 13:43, Victor Porof <vporof@mozilla.com> wrote:
> 
> Here's everything we know about: https://crash-stats.mozilla.org/report/index/5d77bd19-41ce-459f-9c1c-7f9fb0190324 See the "details" and "telemetry environment" sections for a breakdown.

Since that link is now expired, here’s an up to date place to see all recent crash reports involving `mdb_cursor_put`: https://crash-stats.mozilla.org/signature/?signature=mdb_cursor_put#reports <https://crash-stats.mozilla.org/signature/?signature=mdb_cursor_put#reports> 

Victor
Comment 16 Victor Porof 2019-08-13 15:53:27 UTC
Apologies for the previous botched message, I was unfamiliar with the 
formatting rules for this list.


To reiterate:

Since the link to the crash stats from my earlier reply now expired, 
here's an up to date place to see all recent crash reports involving 
`mdb_cursor_put`: 

https://crash-stats.mozilla.org/signature/?signature=mdb_cursor_put#reports 

Victor

Comment 17 Victor Porof 2019-08-14 13:59:31 UTC
Quick update on this.

Potentially useful to the efforts here, Dana Keeler has implemented a fuzzing harness for RKV (our typed Rust interface to LMDB): https://github.com/mozkeeler/rkv-fuzz

We weren't able to trigger the `mdb_cursor_put` crashes reported at https://crash-stats.mozilla.org/signature/?signature=mdb_cursor_put#reports, though we did end up finding a few others: https://github.com/mozkeeler/rkv-fuzz/tree/trunk/crashes

Howard, do any of those look familiar and potentially related?

Victor


Comment 18 Howard Chu 2019-08-14 14:20:27 UTC
vporof@mozilla.com wrote:
> Quick update on this.
> 
> Potentially useful to the efforts here, Dana Keeler has implemented a =
> fuzzing harness for RKV (our typed Rust interface to LMDB): =
> https://github.com/mozkeeler/rkv-fuzz
> 
> We weren't able to trigger the `mdb_cursor_put` crashes reported at =
> https://crash-stats.mozilla.org/signature/?signature=3Dmdb_cursor_put#repo=
> rts, though we did end up finding a few others: =
> https://github.com/mozkeeler/rkv-fuzz/tree/trunk/crashes

What kinds of files are these, in the crashes directory?
> 
> Howard, do any of those look familiar and potentially related?
> 
> Victor
> 
> 
> 
> 
> 


-- 
  -- Howard Chu
  CTO, Symas Corp.           http://www.symas.com
  Director, Highland Sun     http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/

Comment 19 Victor Porof 2019-08-14 14:47:42 UTC
Thanks for getting back!

The files at https://github.com/mozkeeler/rkv-fuzz/tree/trunk/crashes 
are fuzzed variants of the input data seeded from the mdb input here: 
They're generated when running the `american fuzzy lop` fuzzer:
http://lcamtuf.coredump.cx/afl/.

Victor

Comment 20 Howard Chu 2019-08-14 18:47:28 UTC
vporof@mozilla.com wrote:
> Thanks for getting back!
> 
> The files at https://github.com/mozkeeler/rkv-fuzz/tree/trunk/crashes 
> are fuzzed variants of the input data seeded from the mdb input here: 
> They're generated when running the `american fuzzy lop` fuzzer:
> http://lcamtuf.coredump.cx/afl/.

It appears that rust is doing link-time optimization and deleting LMDB debug
functions from the resulting binary. I'm not familiar with how the rust build
system works, can we get it to stop doing this?

-- 
  -- Howard Chu
  CTO, Symas Corp.           http://www.symas.com
  Director, Highland Sun     http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/

Comment 21 Victor Porof 2019-08-14 18:55:25 UTC
It might be possible that even though RKV was built in debug mode, the
backing sys crate isn't building LMDB itself this way. On a very first 
glance it looks like this is the case, always going for opt level 2:
https://github.com/danburkert/lmdb-rs/blob/master/lmdb-sys/build.rs#L23

I'll have a closer look at the build process and see what's going on.

Can you share your build process that resulted in LMDB debug functions
being optimized away? Did you build using the fuzzer via `cargo afl`?
Something else?

Comment 22 Howard Chu 2019-08-14 19:12:01 UTC
vporof@mozilla.com wrote:
> It might be possible that even though RKV was built in debug mode, the
> backing sys crate isn't building LMDB itself this way. On a very first 
> glance it looks like this is the case, always going for opt level 2:
> https://github.com/danburkert/lmdb-rs/blob/master/lmdb-sys/build.rs#L23
> 
> I'll have a closer look at the build process and see what's going on.
> 
> Can you share your build process that resulted in LMDB debug functions
> being optimized away? Did you build using the fuzzer via `cargo afl`?
> Something else?

I deleted the lmdb-rkv-sys-* directories from target/debug/build, edited
 .cargo/registry/src/github.com-1ecc6299db9ec823/lmdb-rkv-sys-0.8.6/lmdb/libraries/liblmdb/mdb.c
 and added #define MDB_DEBUG 3 at the top of the file, then reran the cargo build command.

RUSTFLAGS="-Clink-arg=-fuse-ld=gold" cargo afl build -v

-- 
  -- Howard Chu
  CTO, Symas Corp.           http://www.symas.com
  Director, Highland Sun     http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/

Comment 23 Howard Chu 2019-08-14 20:17:58 UTC
hyc@symas.com wrote:
> vporof@mozilla.com wrote:
>> It might be possible that even though RKV was built in debug mode, the
>> backing sys crate isn't building LMDB itself this way. On a very first 
>> glance it looks like this is the case, always going for opt level 2:
>> https://github.com/danburkert/lmdb-rs/blob/master/lmdb-sys/build.rs#L23
>>
>> I'll have a closer look at the build process and see what's going on.
>>
>> Can you share your build process that resulted in LMDB debug functions
>> being optimized away? Did you build using the fuzzer via `cargo afl`?
>> Something else?
> 
> I deleted the lmdb-rkv-sys-* directories from target/debug/build, edited
>  .cargo/registry/src/github.com-1ecc6299db9ec823/lmdb-rkv-sys-0.8.6/lmdb/libraries/liblmdb/mdb.c
>  and added #define MDB_DEBUG 3 at the top of the file, then reran the cargo build command.
> 
> RUSTFLAGS="-Clink-arg=-fuse-ld=gold" cargo afl build -v

Not really sure it's worth spending time on these fuzzer results. Basically
you're feeding corrupted database files into LMDB, and it is hitting an
assert because it sees that the structure is corrupted. Working as designed,
in other words.

-- 
  -- Howard Chu
  CTO, Symas Corp.           http://www.symas.com
  Director, Highland Sun     http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/

Comment 24 Victor Porof 2019-08-15 12:09:40 UTC
That's good to know. Since we're completely in the dark on how to
produce a reliable test case that exercises this crash, the hope was
that fuzzing could inch us towards a solution.

There's also the problem that we're observing these crashes on Windows 7
and x86-64 (at least for now), as evidenced by this report:
https://crash-stats.mozilla.org/signature/?signature=mdb_cursor_put&_columns=platform&_columns=cpu_arch&_sort=-date&page=1#reports

Fuzzing on that platform is quite a bit more difficult since tooling is 
lacklustre. Furthermore, we can't really try fuzzing under WSL and 
Windows 10, since LMDB also doesn't actually work well under WSL, due to
mmap on a 0-length file failing with ENOEXEC, also complicating things:
https://github.com/microsoft/WSL/issues/3451

I'm not clear if Robins George (OP) observed this crash on a different
platform, perhaps they can confirm?

If fuzzing isn't a worthwhile exploration avenue, any suggestions for 
what kind of tests might exercise a crash like this?

Comment 25 Howard Chu 2020-03-08 15:08:46 UTC
Robins George wrote:
> Hello Howard,

Hello again. Revisiting this trace and walkthrough:

> we are not able to reproduce the issue consistently so test code is not available at this time, and I do believe this issue will be present in the latest code as well since Myk also reported the same..  Attaching analysis from our side, hope it helps.
> 
> The call stack what we have seen is,
> 
>     #0 0x0011b62e in mdb_cursor_put (mc=0xffe52d18, key=0xffe52e74, data=0xffe52e6c, flags=0) at mdb.c:6688
>     #1 0x0011c8ec in mdb_put (txn=0xdc13f008, dbi=2, key=0xffe52e74, data=0xffe52e6c, flags=0) at mdb.c:8771
>     #2 0x00b792b8 in LMDB::LMDBContext::createSession (this=0x80ca418, sid=0x8297be4 "sidebf3708f0fd9fcb8e062529da47adf51402dfc4600000000+") at lmdbint.cc:460
>     #3 0x08053564 in updateLMDB (request=..., forward=@0xffe5315f) at sessionserver.cc:1075
>     #4 processRequestWithoutResponse (request=..., forward=@0xffe5315f) at sessionserver.cc:1160
>     #5 0x08056a80 in processRequest (this=0x80b2ac0, fd=33) at sessionserver.cc:1189
>     #6 readFromSock (this=0x80b2ac0, fd=33) at sessionserver.cc:1342
>     #7 SockHandler::ioReady (this=0x80b2ac0, fd=33) at sessionserver.cc:1455
>     #8 0x00ba2592 in runCoreDispatcher (default_t=<value optimized out>, flags=-1) at fds.cc:889
>     #9 0x00ba2ff7 in DSEvntFds::runDispatcher () at fds.cc:945
>     #10 0x080559c6 in main () at sessionserver.cc:1739
> 
> Here it is helpful to examine frames #0 and #1 in detail in relation to the relevant source code /libraries/liblmdb/mdb.c.
> 
> Frame #1
> 
> #1  0x0011c8ec in mdb_put (txn=0xdc13f008, dbi=2, key=0xffe52e74, data=0xffe52e6c, flags=0) at mdb.c:8771
>         mc = {mc_next = 0x0, mc_backup = 0x0, mc_xcursor = 0x0, mc_txn = 0xdc13f008, mc_dbi = 2, mc_db = 0xdc13f088, mc_dbx = 0xe1215038, mc_dbflag = 0xdc1cf09a "\033", '\032' <repeats 51 times>, mc_snum = 0, mc_top = 0, mc_flags = 0, mc_pg = {0x0, 0x80b2ac0, 0xffe52d68, 0x28afce, 0xdc13f008, 0x80ce9b0, 0xffe52dc8, 0x4c5ff4, 0x2b, 0x80b2ac0, 0xffe52d98, 0x49209a, 0x2b, 0x4c5ff4, 0x2121cb, 0x21576c, 0x8297d34, 0x3497af, 0x21576c, 0x21316a, 0x8297d34, 0x80ac7b0, 0x1e, 0x46eed6, 0x2b, 0x0, 0x80ce9b0, 0x4c5ff4, 0x80ac7b0, 0x8297d34, 0xffe52df8, 0x46fb96}, mc_ki = {0, 2089, 51120, 2058, 30, 0, 16796, 17, 11760, 65509, 3, 0, 32040, 2089, 30, 0, 0, 0, 0, 0, 3, 0, 24564, 76, 51120, 2058, 10944, 2059, 11800, 65509, 64758, 70}} <--------
> 	
>         mx = {mx_cursor = {mc_next = 0x38, mc_backup = 0x3, mc_xcursor = 0x0, mc_txn = 0x0, mc_dbi = 0, mc_db = 0x28, mc_dbx = 0x0, mc_dbflag = 0x0, mc_snum = 3, mc_top = 0, mc_flags = 17, mc_pg = {0x3a93d8, 0x10, 0x0, 0x18, 0x3a93a0, 0x0, 0x3a93d0, 0x289abe, 0x3a93a0, 0xffe52d3f, 0xffe52c68, 0x28afce, 0xffe52dbc, 0xffe52db4, 0x3, 0x4c5ff4, 0x11, 0x36c99b, 0x6e, 0x77, 0x7c, 0x5b, 0x38, 0x289abe, 0x0, 0x0, 0x0, 0x28, 0x0, 0x0, 0x3, 0x10}, mc_ki = {37848, 58, 51611, 54, 110, 0, 119, 0, 124, 0, 91, 0, 58, 0, 39614, 40, 0, 0, 0, 0, 0, 0, 160, 0, 0, 0, 2, 0, 18, 0, 136, 0}}, mx_db = {md_pad = 3838936, md_flags = 51611, md_depth = 54, md_branch_pages = 110, md_leaf_pages = 119, md_overflow_pages = 124, md_entries = 91, md_root = 56}, mx_dbx = {md_name = {mv_size = 6, mv_data = 0x0}, md_cmp = 0, md_dcmp = 0, md_rel = 0x40, md_relctx = 0x0}, mx_dbflag = 0 '\000'}
>         rc = 0 
> 
> And source code for mdb_put() is from mdb.c
> 
> int
> mdb_put(MDB_txn *txn, MDB_dbi dbi,
>     MDB_val *key, MDB_val *data, unsigned int flags)
> {
> 	MDB_cursor mc;
> 	MDB_xcursor mx;
> 	int rc;
> 
> 	if (!key || !data || !TXN_DBI_EXIST(txn, dbi, DB_USRVALID))
> 		return EINVAL;
> 
> 	if (flags & ~(MDB_NOOVERWRITE|MDB_NODUPDATA|MDB_RESERVE|MDB_APPEND|MDB_APPENDDUP))
> 		return EINVAL;
> 
> 	if (txn->mt_flags & (MDB_TXN_RDONLY|MDB_TXN_BLOCKED))
> 		return (txn->mt_flags & MDB_TXN_RDONLY) ? EACCES : MDB_BAD_TXN;
> 
> 	mdb_cursor_init(&mc, txn, dbi, &mx); <-------- see source code below
> 	mc.mc_next = txn->mt_cursors[dbi];
> 	txn->mt_cursors[dbi] = &mc;
> 	rc = mdb_cursor_put(&mc, key, data, flags); <-------- see Frame #0
> 	txn->mt_cursors[dbi] = mc.mc_next;
> 	return rc;
> }
> 
> 
> Source Code For mdb_cursor_init() From mdb.c
> /** Initialize a cursor for a given transaction and database. */
> static void
> mdb_cursor_init(MDB_cursor *mc, MDB_txn *txn, MDB_dbi dbi, MDB_xcursor *mx)
> {
> 	mc->mc_next = NULL;
> 	mc->mc_backup = NULL;
> 	mc->mc_dbi = dbi;
> 	mc->mc_txn = txn;
> 	mc->mc_db = &txn->mt_dbs[dbi];
> 	mc->mc_dbx = &txn->mt_dbxs[dbi];
> 	mc->mc_dbflag = &txn->mt_dbflags[dbi];
> 	mc->mc_snum = 0; <-------- 
> 	mc->mc_top = 0; <--------  
> 	mc->mc_pg[0] = 0; <-------- 
> 	mc->mc_ki[0] = 0;
> 	mc->mc_flags = 0;
> 	if (txn->mt_dbs[dbi].md_flags & MDB_DUPSORT) {
> 		mdb_tassert(txn, mx != NULL);
> 		mc->mc_xcursor = mx;
> 		mdb_xcursor_init0(mc);
> 	} else {
> 		mc->mc_xcursor = NULL;
> 	}
> 	if (*mc->mc_dbflag & DB_STALE) { <-------- false *mc->mc_dbflag = 27, DB_STALE = 0x02		/**< Named-DB record is older than txnID */

27 = 0x1B, this DB is stale and this test is true, not false.

> 		mdb_page_search(mc, NULL, MDB_PS_ROOTONLY); 
> 	}
> }
> 
> From this we know that when mdb_cursor_put() is called, the following values remain assigned:
> 
>               mc->mc_snum = 0; <--------
> 	mc->mc_top = 0; <--------  
> 	mc->mc_pg[0] = 0; <-------- 
> 
> and its noted these two values still have their initial values at the time of the segmentation fault.
> 
> (gdb) fr 0
> #0 0x0011b62e in mdb_cursor_put (mc=0xffe52d18, key=0xffe52e74, data=0xffe52e6c, flags=0) at mdb.c:6688
> 6688 in mdb.c
> (gdb) p mc->mc_top
> $1 = 0
> (gdb) p mc->mc_pg[mc->mc_top]
> $19 = (MDB_page *) 0x0
> 
> Let us consider how the path taken through mdb_cursor_put() could account for these null variable values which result in a segmentation fault.
> 
> Frame #0
> 
> #0  0x0011b62e in mdb_cursor_put (mc=0xffe52d18, key=0xffe52e74, data=0xffe52e6c, flags=0) at mdb.c:6688 <-------- flags = 0
>         env = 0x80ce9b0
>         leaf = <value optimized out>
>         fp = <value optimized out>
>         mp = 0x38
>         sub_root = 0x0 
>         fp_flags = <value optimized out>
>         xdata = {mv_size = 3838880, mv_data = 0xdc1cf09a}
>         rdata = 0xffe52e6c
>         dkey = {mv_size = 0, mv_data = 0x3a93cc}
>         olddata = {mv_size = 22, mv_data = 0x36c820}
>         dummy = {md_pad = 22, md_flags = 11128, md_depth = 65509, md_branch_pages = 16, md_leaf_pages = 1507329, md_overflow_pages = 0, md_entries = 3838928, md_root = 3833844}
>         do_sub = 0
>         insert_key = -30798 <-------- #define MDB_NOTFOUND	(-30798)
>         insert_data = -30798 <-------- #define MDB_NOTFOUND	(-30798)
>         mcount = 0
>         dcount = 0
>         nospill = 0 <--------
>         nsize = <value optimized out>
>         rc = 0 
>         rc2 = <value optimized out>
>         nflags = 0 <--------
>         __func__ = "mdb_cursor_put"	
> 
> We know that mc->mc_top and mc->mc_pg[0] were initialized to 0 in mdb_cursor_init(). As my annotations of the source code below suggest, careful analysis reveals that the paths actually taken through mdb_cursor_put() do not modify these assignments.
> 
> Excerpted Source Code For mdb_cursor_put() From mdb.c
> 
> int
> mdb_cursor_put(MDB_cursor *mc, MDB_val *key, MDB_val *data,
>     unsigned int flags)
> {
> 	MDB_env		*env;
> 	MDB_node	*leaf = NULL;
> 	MDB_page	*fp, *mp, *sub_root = NULL;
> 	uint16_t	fp_flags;
> 	MDB_val		xdata, *rdata, dkey, olddata;
> 	MDB_db dummy;
> 	int do_sub = 0, insert_key, insert_data;
> 	unsigned int mcount = 0, dcount = 0, nospill;
> 	size_t nsize;
> 	int rc, rc2;
> 	unsigned int nflags;
> 	DKBUF;
> 
> 	if (mc == NULL || key == NULL)
> 		return EINVAL;
> 
> 	env = mc->mc_txn->mt_env;
> 
> 	/* Check this first so counter will always be zero on any
> 	 * early failures.
> 	 */
> 	if (flags & MDB_MULTIPLE) {
> 		dcount = data[1].mv_size;
> 		data[1].mv_size = 0;
> 		if (!F_ISSET(mc->mc_db->md_flags, MDB_DUPFIXED))
> 			return MDB_INCOMPATIBLE;
> 	}
> 
> 	nospill = flags & MDB_NOSPILL;
> 	flags &= ~MDB_NOSPILL;
> 
> 	if (mc->mc_txn->mt_flags & (MDB_TXN_RDONLY|MDB_TXN_BLOCKED))
> 		return (mc->mc_txn->mt_flags & MDB_TXN_RDONLY) ? EACCES : MDB_BAD_TXN;
> 
> 	if (key->mv_size-1 >= ENV_MAXKEY(env))
> 		return MDB_BAD_VALSIZE;
> 
> #if SIZE_MAX > MAXDATASIZE
> 	if (data->mv_size > ((mc->mc_db->md_flags & MDB_DUPSORT) ? ENV_MAXKEY(env) : MAXDATASIZE))
> 		return MDB_BAD_VALSIZE;
> #else
> 	if ((mc->mc_db->md_flags & MDB_DUPSORT) && data->mv_size > ENV_MAXKEY(env))
> 		return MDB_BAD_VALSIZE;
> #endif
> 
> 	DPRINTF(("==> put db %d key [%s], size %"Z"u, data size %"Z"u",
> 		DDBI(mc), DKEY(key), key ? key->mv_size : 0, data->mv_size));
> 
> 	dkey.mv_size = 0;
> 
> 	if (flags == MDB_CURRENT) { <-------- false flags = 0
> 		if (!(mc->mc_flags & C_INITIALIZED))
> 			return EINVAL;
> 		rc = MDB_SUCCESS;
> 	} else if (mc->mc_db->md_root == P_INVALID) { <-------- false, mc->mc_db->md_root = 68

I might have missed it, did you show the entire contents of *mc->mc_db somewhere?

> 		/* new database, cursor has nothing to point to */
> 		mc->mc_snum = 0;
> 		mc->mc_top = 0;
> 		mc->mc_flags &= ~C_INITIALIZED;
> 		rc = MDB_NO_ROOT;  
> 	} else { <-------- executed
> 		int exact = 0;
> 		MDB_val d2;
> 		if (flags & MDB_APPEND) { <-------- false flags = 0
> 			MDB_val k2;
> 			rc = mdb_cursor_last(mc, &k2, &d2);
> 			if (rc == 0) {
> 				rc = mc->mc_dbx->md_cmp(key, &k2);
> 				if (rc > 0) {
> 					rc = MDB_NOTFOUND;
> 					mc->mc_ki[mc->mc_top]++;
> 				} else {
> 					/* new key is <= last key */
> 					rc = MDB_KEYEXIST;
> 				}
> 			}
> 		} else {<-------- executed
> 			rc = mdb_cursor_set(mc, key, &d2, MDB_SET, &exact); <-------- returns -30798 which is MDB_NOTFOUND

This is an important step. mdb_cursor_set only returns MDB_NOTFOUND when mc->mc_pg[mc->mc_top] != NULL.
There are no code paths that can return this error code without setting the rootpage pointer mc->mc_pg[0] non-NULL.
The only path that can allow this is if mc->mc_db->md_root == P_INVALID (the tree is empty) and we already know
that's not true in this case. Nothing in LMDB will change these values between the time mdb_cursor_set
returned and the time that mdb_cursor_put tries to reference mc->mc_pg[]. As such this sounds like a
memory overwrite corruption from some other thread, unrelated to LMDB.

-- 
  -- Howard Chu
  CTO, Symas Corp.           http://www.symas.com
  Director, Highland Sun     http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/

Comment 26 Howard Chu 2020-03-08 15:33:55 UTC
hyc@symas.com wrote:
> Robins George wrote:
>> Hello Howard,
> 
> Hello again. Revisiting this trace and walkthrough:
> 
>> we are not able to reproduce the issue consistently so test code is not available at this time, and I do believe this issue will be present in the latest code as well since Myk also reported the same..  Attaching analysis from our side, hope it helps.
>>
>> The call stack what we have seen is,
>>
>>     #0 0x0011b62e in mdb_cursor_put (mc=0xffe52d18, key=0xffe52e74, data=0xffe52e6c, flags=0) at mdb.c:6688
>>     #1 0x0011c8ec in mdb_put (txn=0xdc13f008, dbi=2, key=0xffe52e74, data=0xffe52e6c, flags=0) at mdb.c:8771
>>     #2 0x00b792b8 in LMDB::LMDBContext::createSession (this=0x80ca418, sid=0x8297be4 "sidebf3708f0fd9fcb8e062529da47adf51402dfc4600000000+") at lmdbint.cc:460
>>     #3 0x08053564 in updateLMDB (request=..., forward=@0xffe5315f) at sessionserver.cc:1075
>>     #4 processRequestWithoutResponse (request=..., forward=@0xffe5315f) at sessionserver.cc:1160
>>     #5 0x08056a80 in processRequest (this=0x80b2ac0, fd=33) at sessionserver.cc:1189
>>     #6 readFromSock (this=0x80b2ac0, fd=33) at sessionserver.cc:1342
>>     #7 SockHandler::ioReady (this=0x80b2ac0, fd=33) at sessionserver.cc:1455
>>     #8 0x00ba2592 in runCoreDispatcher (default_t=<value optimized out>, flags=-1) at fds.cc:889
>>     #9 0x00ba2ff7 in DSEvntFds::runDispatcher () at fds.cc:945
>>     #10 0x080559c6 in main () at sessionserver.cc:1739
>>
>> Here it is helpful to examine frames #0 and #1 in detail in relation to the relevant source code /libraries/liblmdb/mdb.c.
>>
>> Frame #1
>>
>> #1  0x0011c8ec in mdb_put (txn=0xdc13f008, dbi=2, key=0xffe52e74, data=0xffe52e6c, flags=0) at mdb.c:8771
>>         mc = {mc_next = 0x0, mc_backup = 0x0, mc_xcursor = 0x0, mc_txn = 0xdc13f008, mc_dbi = 2, mc_db = 0xdc13f088, mc_dbx = 0xe1215038, mc_dbflag = 0xdc1cf09a "\033", '\032' <repeats 51 times>, mc_snum = 0, mc_top = 0, mc_flags = 0, mc_pg = {0x0, 0x80b2ac0, 0xffe52d68, 0x28afce, 0xdc13f008, 0x80ce9b0, 0xffe52dc8, 0x4c5ff4, 0x2b, 0x80b2ac0, 0xffe52d98, 0x49209a, 0x2b, 0x4c5ff4, 0x2121cb, 0x21576c, 0x8297d34, 0x3497af, 0x21576c, 0x21316a, 0x8297d34, 0x80ac7b0, 0x1e, 0x46eed6, 0x2b, 0x0, 0x80ce9b0, 0x4c5ff4, 0x80ac7b0, 0x8297d34, 0xffe52df8, 0x46fb96}, mc_ki = {0, 2089, 51120, 2058, 30, 0, 16796, 17, 11760, 65509, 3, 0, 32040, 2089, 30, 0, 0, 0, 0, 0, 3, 0, 24564, 76, 51120, 2058, 10944, 2059, 11800, 65509, 64758, 70}} <--------
>> 	
>>         mx = {mx_cursor = {mc_next = 0x38, mc_backup = 0x3, mc_xcursor = 0x0, mc_txn = 0x0, mc_dbi = 0, mc_db = 0x28, mc_dbx = 0x0, mc_dbflag = 0x0, mc_snum = 3, mc_top = 0, mc_flags = 17, mc_pg = {0x3a93d8, 0x10, 0x0, 0x18, 0x3a93a0, 0x0, 0x3a93d0, 0x289abe, 0x3a93a0, 0xffe52d3f, 0xffe52c68, 0x28afce, 0xffe52dbc, 0xffe52db4, 0x3, 0x4c5ff4, 0x11, 0x36c99b, 0x6e, 0x77, 0x7c, 0x5b, 0x38, 0x289abe, 0x0, 0x0, 0x0, 0x28, 0x0, 0x0, 0x3, 0x10}, mc_ki = {37848, 58, 51611, 54, 110, 0, 119, 0, 124, 0, 91, 0, 58, 0, 39614, 40, 0, 0, 0, 0, 0, 0, 160, 0, 0, 0, 2, 0, 18, 0, 136, 0}}, mx_db = {md_pad = 3838936, md_flags = 51611, md_depth = 54, md_branch_pages = 110, md_leaf_pages = 119, md_overflow_pages = 124, md_entries = 91, md_root = 56}, mx_dbx = {md_name = {mv_size = 6, mv_data = 0x0}, md_cmp = 0, md_dcmp = 0, md_rel = 0x40, md_relctx = 0x0}, mx_dbflag = 0 '\000'}
>>         rc = 0 
>>
>> And source code for mdb_put() is from mdb.c
>>
>> int
>> mdb_put(MDB_txn *txn, MDB_dbi dbi,
>>     MDB_val *key, MDB_val *data, unsigned int flags)
>> {
>> 	MDB_cursor mc;
>> 	MDB_xcursor mx;
>> 	int rc;
>>
>> 	if (!key || !data || !TXN_DBI_EXIST(txn, dbi, DB_USRVALID))
>> 		return EINVAL;
>>
>> 	if (flags & ~(MDB_NOOVERWRITE|MDB_NODUPDATA|MDB_RESERVE|MDB_APPEND|MDB_APPENDDUP))
>> 		return EINVAL;
>>
>> 	if (txn->mt_flags & (MDB_TXN_RDONLY|MDB_TXN_BLOCKED))
>> 		return (txn->mt_flags & MDB_TXN_RDONLY) ? EACCES : MDB_BAD_TXN;
>>
>> 	mdb_cursor_init(&mc, txn, dbi, &mx); <-------- see source code below
>> 	mc.mc_next = txn->mt_cursors[dbi];
>> 	txn->mt_cursors[dbi] = &mc;
>> 	rc = mdb_cursor_put(&mc, key, data, flags); <-------- see Frame #0
>> 	txn->mt_cursors[dbi] = mc.mc_next;
>> 	return rc;
>> }
>>
>>
>> Source Code For mdb_cursor_init() From mdb.c
>> /** Initialize a cursor for a given transaction and database. */
>> static void
>> mdb_cursor_init(MDB_cursor *mc, MDB_txn *txn, MDB_dbi dbi, MDB_xcursor *mx)
>> {
>> 	mc->mc_next = NULL;
>> 	mc->mc_backup = NULL;
>> 	mc->mc_dbi = dbi;
>> 	mc->mc_txn = txn;
>> 	mc->mc_db = &txn->mt_dbs[dbi];
>> 	mc->mc_dbx = &txn->mt_dbxs[dbi];
>> 	mc->mc_dbflag = &txn->mt_dbflags[dbi];
>> 	mc->mc_snum = 0; <-------- 
>> 	mc->mc_top = 0; <--------  
>> 	mc->mc_pg[0] = 0; <-------- 
>> 	mc->mc_ki[0] = 0;
>> 	mc->mc_flags = 0;
>> 	if (txn->mt_dbs[dbi].md_flags & MDB_DUPSORT) {
>> 		mdb_tassert(txn, mx != NULL);
>> 		mc->mc_xcursor = mx;
>> 		mdb_xcursor_init0(mc);
>> 	} else {
>> 		mc->mc_xcursor = NULL;
>> 	}
>> 	if (*mc->mc_dbflag & DB_STALE) { <-------- false *mc->mc_dbflag = 27, DB_STALE = 0x02		/**< Named-DB record is older than txnID */
> 
> 27 = 0x1B, this DB is stale and this test is true, not false.
> 
>> 		mdb_page_search(mc, NULL, MDB_PS_ROOTONLY); 
>> 	}
>> }
>>
>> From this we know that when mdb_cursor_put() is called, the following values remain assigned:
>>
>>               mc->mc_snum = 0; <--------
>> 	mc->mc_top = 0; <--------  
>> 	mc->mc_pg[0] = 0; <-------- 
>>
>> and its noted these two values still have their initial values at the time of the segmentation fault.
>>
>> (gdb) fr 0
>> #0 0x0011b62e in mdb_cursor_put (mc=0xffe52d18, key=0xffe52e74, data=0xffe52e6c, flags=0) at mdb.c:6688
>> 6688 in mdb.c
>> (gdb) p mc->mc_top
>> $1 = 0
>> (gdb) p mc->mc_pg[mc->mc_top]
>> $19 = (MDB_page *) 0x0
>>
>> Let us consider how the path taken through mdb_cursor_put() could account for these null variable values which result in a segmentation fault.
>>
>> Frame #0
>>
>> #0  0x0011b62e in mdb_cursor_put (mc=0xffe52d18, key=0xffe52e74, data=0xffe52e6c, flags=0) at mdb.c:6688 <-------- flags = 0
>>         env = 0x80ce9b0
>>         leaf = <value optimized out>
>>         fp = <value optimized out>
>>         mp = 0x38
>>         sub_root = 0x0 
>>         fp_flags = <value optimized out>
>>         xdata = {mv_size = 3838880, mv_data = 0xdc1cf09a}
>>         rdata = 0xffe52e6c
>>         dkey = {mv_size = 0, mv_data = 0x3a93cc}
>>         olddata = {mv_size = 22, mv_data = 0x36c820}
>>         dummy = {md_pad = 22, md_flags = 11128, md_depth = 65509, md_branch_pages = 16, md_leaf_pages = 1507329, md_overflow_pages = 0, md_entries = 3838928, md_root = 3833844}
>>         do_sub = 0
>>         insert_key = -30798 <-------- #define MDB_NOTFOUND	(-30798)
>>         insert_data = -30798 <-------- #define MDB_NOTFOUND	(-30798)
>>         mcount = 0
>>         dcount = 0
>>         nospill = 0 <--------
>>         nsize = <value optimized out>
>>         rc = 0 
>>         rc2 = <value optimized out>
>>         nflags = 0 <--------
>>         __func__ = "mdb_cursor_put"	
>>
>> We know that mc->mc_top and mc->mc_pg[0] were initialized to 0 in mdb_cursor_init(). As my annotations of the source code below suggest, careful analysis reveals that the paths actually taken through mdb_cursor_put() do not modify these assignments.
>>
>> Excerpted Source Code For mdb_cursor_put() From mdb.c
>>
>> int
>> mdb_cursor_put(MDB_cursor *mc, MDB_val *key, MDB_val *data,
>>     unsigned int flags)
>> {
>> 	MDB_env		*env;
>> 	MDB_node	*leaf = NULL;
>> 	MDB_page	*fp, *mp, *sub_root = NULL;
>> 	uint16_t	fp_flags;
>> 	MDB_val		xdata, *rdata, dkey, olddata;
>> 	MDB_db dummy;
>> 	int do_sub = 0, insert_key, insert_data;
>> 	unsigned int mcount = 0, dcount = 0, nospill;
>> 	size_t nsize;
>> 	int rc, rc2;
>> 	unsigned int nflags;
>> 	DKBUF;
>>
>> 	if (mc == NULL || key == NULL)
>> 		return EINVAL;
>>
>> 	env = mc->mc_txn->mt_env;
>>
>> 	/* Check this first so counter will always be zero on any
>> 	 * early failures.
>> 	 */
>> 	if (flags & MDB_MULTIPLE) {
>> 		dcount = data[1].mv_size;
>> 		data[1].mv_size = 0;
>> 		if (!F_ISSET(mc->mc_db->md_flags, MDB_DUPFIXED))
>> 			return MDB_INCOMPATIBLE;
>> 	}
>>
>> 	nospill = flags & MDB_NOSPILL;
>> 	flags &= ~MDB_NOSPILL;
>>
>> 	if (mc->mc_txn->mt_flags & (MDB_TXN_RDONLY|MDB_TXN_BLOCKED))
>> 		return (mc->mc_txn->mt_flags & MDB_TXN_RDONLY) ? EACCES : MDB_BAD_TXN;
>>
>> 	if (key->mv_size-1 >= ENV_MAXKEY(env))
>> 		return MDB_BAD_VALSIZE;
>>
>> #if SIZE_MAX > MAXDATASIZE
>> 	if (data->mv_size > ((mc->mc_db->md_flags & MDB_DUPSORT) ? ENV_MAXKEY(env) : MAXDATASIZE))
>> 		return MDB_BAD_VALSIZE;
>> #else
>> 	if ((mc->mc_db->md_flags & MDB_DUPSORT) && data->mv_size > ENV_MAXKEY(env))
>> 		return MDB_BAD_VALSIZE;
>> #endif
>>
>> 	DPRINTF(("==> put db %d key [%s], size %"Z"u, data size %"Z"u",
>> 		DDBI(mc), DKEY(key), key ? key->mv_size : 0, data->mv_size));
>>
>> 	dkey.mv_size = 0;
>>
>> 	if (flags == MDB_CURRENT) { <-------- false flags = 0
>> 		if (!(mc->mc_flags & C_INITIALIZED))
>> 			return EINVAL;
>> 		rc = MDB_SUCCESS;
>> 	} else if (mc->mc_db->md_root == P_INVALID) { <-------- false, mc->mc_db->md_root = 68
> 
> I might have missed it, did you show the entire contents of *mc->mc_db somewhere?
> 
>> 		/* new database, cursor has nothing to point to */
>> 		mc->mc_snum = 0;
>> 		mc->mc_top = 0;
>> 		mc->mc_flags &= ~C_INITIALIZED;
>> 		rc = MDB_NO_ROOT;  
>> 	} else { <-------- executed
>> 		int exact = 0;
>> 		MDB_val d2;
>> 		if (flags & MDB_APPEND) { <-------- false flags = 0
>> 			MDB_val k2;
>> 			rc = mdb_cursor_last(mc, &k2, &d2);
>> 			if (rc == 0) {
>> 				rc = mc->mc_dbx->md_cmp(key, &k2);
>> 				if (rc > 0) {
>> 					rc = MDB_NOTFOUND;
>> 					mc->mc_ki[mc->mc_top]++;
>> 				} else {
>> 					/* new key is <= last key */
>> 					rc = MDB_KEYEXIST;
>> 				}
>> 			}
>> 		} else {<-------- executed
>> 			rc = mdb_cursor_set(mc, key, &d2, MDB_SET, &exact); <-------- returns -30798 which is MDB_NOTFOUND
> 
> This is an important step. mdb_cursor_set only returns MDB_NOTFOUND when mc->mc_pg[mc->mc_top] != NULL.
> There are no code paths that can return this error code without setting the rootpage pointer mc->mc_pg[0] non-NULL.
> The only path that can allow this is if mc->mc_db->md_root == P_INVALID (the tree is empty) and we already know
> that's not true in this case. Nothing in LMDB will change these values between the time mdb_cursor_set
> returned and the time that mdb_cursor_put tries to reference mc->mc_pg[]. As such this sounds like a
> memory overwrite corruption from some other thread, unrelated to LMDB.
> 
You can verify this by adding an assert after the mdb_cursor_set() call:
		mdb_cassert(mc, rc != MDB_NOTFOUND || mc->mc_pg != NULL);

At this point I would check to see if your threads' stack sizes are large enough. Since this
cursor came from mdb_put, it lives on the stack, and a stack overrun is the most likely culprit.

-- 
  -- Howard Chu
  CTO, Symas Corp.           http://www.symas.com
  Director, Highland Sun     http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/

Comment 27 mdufour 2024-03-15 20:15:34 UTC
We are also seeing rare instances of this crash since we released a version of our product which uses LMDB. Specifically, call stack is:

mdb_cursor_put(MDB_cursor * mc, MDB_val * key, MDB_val * data, unsigned int flags) Line 7998
mdb_put(MDB_txn * txn, unsigned int dbi, MDB_val * key, MDB_val * data, unsigned int flags) Line 10107

where line 8183 is

nsize = IS_LEAF2(mc->mc_pg[mc->mc_top]) ? key->mv_size : mdb_leaf_size(env, key, rdata);

and

mc->mc_top == 0
mc->mc_pg[0] == NULL
rc == -30798

Although we do not have a reproduction case, we do have a full crash dump with heap of an unoptimized debug build of our application. There is no evidence of stack corruption (in fact, mc->mc_pg[1] is still 0xcccccccccccccccc as per the msvc run-time check initialization).

Unfortunately we do not have the matching LMDB file.

Anything we can provide to help narrow down the issue?
Comment 28 mdufour 2024-03-15 20:17:35 UTC
Apologies, in the last message, the provide line of code is indeed 7998, the crash location (and not 8183 as written). It is slightly different from the official mdb.c due to some unrelated local changes earlier in the file.
Comment 29 Howard Chu 2024-03-15 21:19:12 UTC
(In reply to mdufour from comment #28)
> Apologies, in the last message, the provide line of code is indeed 7998, the
> crash location (and not 8183 as written). It is slightly different from the
> official mdb.c due to some unrelated local changes earlier in the file.

You didn't specify which version of LMDB you're using.
Comment 30 mdufour 2024-03-15 23:38:58 UTC
We're on revision ce200dca of the main openldap repo from Aug 27, 2023.
Comment 31 mdufour 2024-03-18 16:43:28 UTC
I am able to reproduce the crash in a scenario with two processes accessing the same LMDB file, where:

- process1 opens a named database.
- process2 drops this named database.
- process1 writes to the initial named database (using the dbi it was holding on to) -> this is where we crash.

It seems that mdb_page_search returns MDB_NOTFOUND because the named database is gone, leaving mc->mc_pg[0] NULL.
Comment 32 Howard Chu 2024-03-22 16:13:33 UTC
Created attachment 1015 [details]
Attempted reproduction
Comment 33 Howard Chu 2024-03-22 16:16:10 UTC
(In reply to mdufour from comment #31)
> I am able to reproduce the crash in a scenario with two processes accessing
> the same LMDB file, where:
> 
> - process1 opens a named database.
> - process2 drops this named database.
> - process1 writes to the initial named database (using the dbi it was
> holding on to) -> this is where we crash.
> 
> It seems that mdb_page_search returns MDB_NOTFOUND because the named
> database is gone, leaving mc->mc_pg[0] NULL.

Thanks for that info. Unfortunately I still can't reproduce that crash.
I've attached the test code I wrote based on your info.

It forks off a child to do the process2 actions. You must press RETURN when you're ready for process 1 to proceed. I just get more MDB_NOTFOUND results when process1 tries to write again.
Comment 34 mdufour 2024-03-24 04:12:55 UTC
Thanks to the test application, I was able to identify a key missing step in my description: process2 creates a named database (under a different name) after dropping the initial one. I can reproduce the crash by inserting the following lines @ 104:

	E(mdb_txn_begin(env, NULL, 0, &txn));
	E(mdb_dbi_open(txn, "id2", MDB_CREATE, &dbi));
	E(mdb_txn_commit(txn));
Comment 35 Howard Chu 2024-03-25 19:43:05 UTC
(In reply to mdufour from comment #34)
> Thanks to the test application, I was able to identify a key missing step in
> my description: process2 creates a named database (under a different name)
> after dropping the initial one. I can reproduce the crash by inserting the
> following lines @ 104:
> 
> 	E(mdb_txn_begin(env, NULL, 0, &txn));
> 	E(mdb_dbi_open(txn, "id2", MDB_CREATE, &dbi));
> 	E(mdb_txn_commit(txn));

OK, that reproduces it. This patch should fix it, please test, thanks:

diff --git a/libraries/liblmdb/mdb.c b/libraries/liblmdb/mdb.c
index 13d1aea39e..f0a65d97ab 100644
--- a/libraries/liblmdb/mdb.c
+++ b/libraries/liblmdb/mdb.c
@@ -6670,7 +6670,7 @@ mdb_page_search(MDB_cursor *mc, MDB_val *key, int flags)
 					MDB_node *leaf = mdb_node_search(&mc2,
 						&mc->mc_dbx->md_name, &exact);
 					if (!exact)
-						return MDB_NOTFOUND;
+						return MDB_BAD_DBI;
 					if ((leaf->mn_flags & (F_DUPDATA|F_SUBDATA)) != F_SUBDATA)
 						return MDB_INCOMPATIBLE; /* not a named DB */
 					rc = mdb_node_read(&mc2, leaf, &data);
Comment 36 mdufour 2024-03-26 12:03:06 UTC
This patch does fix the crash in my application repro case as well. We'll integrate it in our next minor release. Thanks much!
Comment 37 Howard Chu 2024-03-26 14:53:33 UTC
Fixed in git
Comment 38 Quanah Gibson-Mount 2024-03-26 16:27:24 UTC
RE0.9:

  • 83dc42c5 
by Howard Chu at 2024-03-26T14:52:42+00:00 
ITS#9037 mdb_page_search: fix error code when DBI record is missing