Re: (ITS#9037) observing crash in mdb_cursor_put()



Robins George wrote:
> Hello Howard,
> We are not able to reproduce the issue consistently, so test code is not available at this time. I believe this issue is present in the latest code as well, since Myk also reported the same. Attaching analysis from our side; hope it helps.

Thanks. Notes below...
> 
> The call stack we have seen is:
> 
>     #0 0x0011b62e in mdb_cursor_put (mc=0xffe52d18, key=0xffe52e74, data=0xffe52e6c, flags=0) at mdb.c:6688
>     #1 0x0011c8ec in mdb_put (txn=0xdc13f008, dbi=2, key=0xffe52e74, data=0xffe52e6c, flags=0) at mdb.c:8771
>     #2 0x00b792b8 in LMDB::LMDBContext::createSession (this=0x80ca418, sid=0x8297be4 "sidebf3708f0fd9fcb8e062529da47adf51402dfc4600000000+") at lmdbint.cc:460
>     #3 0x08053564 in updateLMDB (request=..., forward=@0xffe5315f) at sessionserver.cc:1075
>     #4 processRequestWithoutResponse (request=..., forward=@0xffe5315f) at sessionserver.cc:1160
>     #5 0x08056a80 in processRequest (this=0x80b2ac0, fd=33) at sessionserver.cc:1189
>     #6 readFromSock (this=0x80b2ac0, fd=33) at sessionserver.cc:1342
>     #7 SockHandler::ioReady (this=0x80b2ac0, fd=33) at sessionserver.cc:1455
>     #8 0x00ba2592 in runCoreDispatcher (default_t=<value optimized out>, flags=-1) at fds.cc:889
>     #9 0x00ba2ff7 in DSEvntFds::runDispatcher () at fds.cc:945
>     #10 0x080559c6 in main () at sessionserver.cc:1739
> 
> Here it is helpful to examine frames #0 and #1 in detail against the relevant source code in /libraries/liblmdb/mdb.c.
> 
> Frame #1
> 
> #1  0x0011c8ec in mdb_put (txn=0xdc13f008, dbi=2, key=0xffe52e74, data=0xffe52e6c, flags=0) at mdb.c:8771
>         mc = {mc_next = 0x0, mc_backup = 0x0, mc_xcursor = 0x0, mc_txn = 0xdc13f008, mc_dbi = 2, mc_db = 0xdc13f088, mc_dbx = 0xe1215038, mc_dbflag = 0xdc1cf09a "\033", '\032' <repeats 51 times>, mc_snum = 0, mc_top = 0, mc_flags = 0, mc_pg = {0x0, 0x80b2ac0, 0xffe52d68, 0x28afce, 0xdc13f008, 0x80ce9b0, 0xffe52dc8, 0x4c5ff4, 0x2b, 0x80b2ac0, 0xffe52d98, 0x49209a, 0x2b, 0x4c5ff4, 0x2121cb, 0x21576c, 0x8297d34, 0x3497af, 0x21576c, 0x21316a, 0x8297d34, 0x80ac7b0, 0x1e, 0x46eed6, 0x2b, 0x0, 0x80ce9b0, 0x4c5ff4, 0x80ac7b0, 0x8297d34, 0xffe52df8, 0x46fb96}, mc_ki = {0, 2089, 51120, 2058, 30, 0, 16796, 17, 11760, 65509, 3, 0, 32040, 2089, 30, 0, 0, 0, 0, 0, 3, 0, 24564, 76, 51120, 2058, 10944, 2059, 11800, 65509, 64758, 70}} <--------
> 	
>         mx = {mx_cursor = {mc_next = 0x38, mc_backup = 0x3, mc_xcursor = 0x0, mc_txn = 0x0, mc_dbi = 0, mc_db = 0x28, mc_dbx = 0x0, mc_dbflag = 0x0, mc_snum = 3, mc_top = 0, mc_flags = 17, mc_pg = {0x3a93d8, 0x10, 0x0, 0x18, 0x3a93a0, 0x0, 0x3a93d0, 0x289abe, 0x3a93a0, 0xffe52d3f, 0xffe52c68, 0x28afce, 0xffe52dbc, 0xffe52db4, 0x3, 0x4c5ff4, 0x11, 0x36c99b, 0x6e, 0x77, 0x7c, 0x5b, 0x38, 0x289abe, 0x0, 0x0, 0x0, 0x28, 0x0, 0x0, 0x3, 0x10}, mc_ki = {37848, 58, 51611, 54, 110, 0, 119, 0, 124, 0, 91, 0, 58, 0, 39614, 40, 0, 0, 0, 0, 0, 0, 160, 0, 0, 0, 2, 0, 18, 0, 136, 0}}, mx_db = {md_pad = 3838936, md_flags = 51611, md_depth = 54, md_branch_pages = 110, md_leaf_pages = 119, md_overflow_pages = 124, md_entries = 91, md_root = 56}, mx_dbx = {md_name = {mv_size = 6, mv_data = 0x0}, md_cmp = 0, md_dcmp = 0, md_rel = 0x40, md_relctx = 0x0}, mx_dbflag = 0 '\000'}
>         rc = 0 
> 
> The source code for mdb_put(), from mdb.c:
> 
> int
> mdb_put(MDB_txn *txn, MDB_dbi dbi,
>     MDB_val *key, MDB_val *data, unsigned int flags)
> {
> 	MDB_cursor mc;
> 	MDB_xcursor mx;
> 	int rc;
> 
> 	if (!key || !data || !TXN_DBI_EXIST(txn, dbi, DB_USRVALID))
> 		return EINVAL;
> 
> 	if (flags & ~(MDB_NOOVERWRITE|MDB_NODUPDATA|MDB_RESERVE|MDB_APPEND|MDB_APPENDDUP))
> 		return EINVAL;
> 
> 	if (txn->mt_flags & (MDB_TXN_RDONLY|MDB_TXN_BLOCKED))
> 		return (txn->mt_flags & MDB_TXN_RDONLY) ? EACCES : MDB_BAD_TXN;
> 
> 	mdb_cursor_init(&mc, txn, dbi, &mx); <-------- see source code below
> 	mc.mc_next = txn->mt_cursors[dbi];
> 	txn->mt_cursors[dbi] = &mc;
> 	rc = mdb_cursor_put(&mc, key, data, flags); <-------- see Frame #0
> 	txn->mt_cursors[dbi] = mc.mc_next;
> 	return rc;
> }
> 
> 
> Source code for mdb_cursor_init(), from mdb.c:
> /** Initialize a cursor for a given transaction and database. */
> static void
> mdb_cursor_init(MDB_cursor *mc, MDB_txn *txn, MDB_dbi dbi, MDB_xcursor *mx)
> {
> 	mc->mc_next = NULL;
> 	mc->mc_backup = NULL;
> 	mc->mc_dbi = dbi;
> 	mc->mc_txn = txn;
> 	mc->mc_db = &txn->mt_dbs[dbi];
> 	mc->mc_dbx = &txn->mt_dbxs[dbi];
> 	mc->mc_dbflag = &txn->mt_dbflags[dbi];
> 	mc->mc_snum = 0; <-------- 
> 	mc->mc_top = 0; <--------  
> 	mc->mc_pg[0] = 0; <-------- 
> 	mc->mc_ki[0] = 0;
> 	mc->mc_flags = 0;
> 	if (txn->mt_dbs[dbi].md_flags & MDB_DUPSORT) {
> 		mdb_tassert(txn, mx != NULL);
> 		mc->mc_xcursor = mx;
> 		mdb_xcursor_init0(mc);
> 	} else {
> 		mc->mc_xcursor = NULL;
> 	}
> 	if (*mc->mc_dbflag & DB_STALE) { <-------- false *mc->mc_dbflag = 27, DB_STALE = 0x02		/**< Named-DB record is older than txnID */

27 = 0x1b, which has the 0x02 bit set, so the DB_STALE test is true, not false. mdb_page_search ought to have been invoked to set the root of the cursor.
> 		mdb_page_search(mc, NULL, MDB_PS_ROOTONLY); 
> 	}
> }
> 
> From this we know that when mdb_cursor_put() is called, the following values remain assigned:
> 
>               mc->mc_snum = 0; <--------
> 	mc->mc_top = 0; <--------  
> 	mc->mc_pg[0] = 0; <-------- 
> 
> and it is noted that these values still hold their initial values at the time of the segmentation fault.

There is no issue with mdb_cursor_put then. The question is why mdb_page_search didn't find the DBI's root node.
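
As a quick check of the flag arithmetic noted above, here is a minimal C sketch. It is not LMDB code: db_flag_is_stale is a hypothetical helper name used only for illustration, and the only value taken from the quoted source is DB_STALE = 0x02. The gdb dump in frame #1 prints the flag byte as "\033", which is octal for 27 (0x1b).

```c
/* Value quoted in the analysis above; per the mdb.c comment it marks a
 * named-DB record that is older than the txnID. */
#define DB_STALE 0x02

/* Hypothetical helper (not part of LMDB): returns 1 if the DB_STALE bit
 * is set in a per-DBI flag byte, 0 otherwise. This mirrors the test
 * `*mc->mc_dbflag & DB_STALE` in the quoted mdb_cursor_init(). */
int db_flag_is_stale(unsigned char dbflag)
{
	return (dbflag & DB_STALE) != 0;
}
```

With the value from the dump, db_flag_is_stale(27) returns 1, since 27 = 0x1b = 11011 in binary and the 0x02 bit is set. So the branch that calls mdb_page_search should have been taken.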

-- 
  -- Howard Chu
  CTO, Symas Corp.           http://www.symas.com
  Director, Highland Sun     http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/