[Date Prev][Date Next] [Chronological] [Thread] [Top]

(ITS#7970) LMDB: Critical Heisenbug - Inconsistent reading & SIGSEGV due to the race condition.



Full_Name: Leonid Yuriev
Version: 2.4.40
OS: RHEL7
URL: ftp://ftp.openldap.org/incoming/
Submission from: (NULL) (31.130.36.33)


In continue of ITS#7969.

Because of a race condition between readers and writer's activity for FreeDB
reclaiming in case it is nearly empty, some pages that used by reader, may be 
reused for a new txn. Thereby reader may get an unpredictable rubbish, throw an
assertion failure or SIGSEGV.

I am sure, the race condition is present, exactly between reader's mr_txnid and
last transaction mti_txnid, which is updated by the writer.

Let see to line 2525 of mdb_txn_renew0() in stable 2.4.40
http://www.openldap.org/devel/gitweb.cgi?p=openldap.git;a=blob;f=libraries/liblmdb/mdb.c;h=6cc343326efeb8543db073671e92625723daf938;hb=e648bdc22cd654d371a224a88973f986f67ba499#l2525

The code is:
txn->mt_txnid = r-EmEmr_txnid = ti->mti_txnid;

>From a CPU's point of view it is:
step #1, reg = ti->mti_txnid /* load the last txn-id from environment to
register */
step #2, r->mr_txnid = reg /* write txn-id into the reader table */
step #3, txn->mt_txnid = reg

The r->mr_txnid is used by mdb_find_oldest() for lookup a txn-id that is
available for reclaim from FreeDB by mdb_page_alloc().


When the reader's thread doing these steps, the writer can commit a several
transactions and the ti->mti_txnid may be changed between steps 1 and 2. Before
the step 2, a writer is free to reclaim any records from FreeDB (except the
last).
Thereby, the writer can commit several new transactions and reclaim several
records from FreeDB, include the txn which the reader has begun using at the
step1.
In this case the reader may be read a pages that is not contain the such txn any
longer, but the reclaimed pages, which may contain the _anything_ from a new
transaction.

It is hard to reproduce the problem, but we can change the code without altering
its significance, for instance: 
diff --git a/libraries/liblmdb/mdb.c b/libraries/liblmdb/mdb.c
index 6cc3433..c501a2e 100644
--- a/libraries/liblmdb/mdb.c
+++ b/libraries/liblmdb/mdb.c
@@ -2018,6 +2018,8 @@ mdb_page_alloc(MDB_cursor *mc, int num, MDB_page **mp)
 		if (oldest <= last) {
 			if (!found_old) {
 				oldest = mdb_find_oldest(txn);
+				/* LY: catch heisenbug. */
+				mdb_tassert(txn, oldest >= env->me_pgoldest);
 				env->me_pgoldest = oldest;
 				found_old = 1;
 			}
@@ -2034,6 +2036,8 @@ mdb_page_alloc(MDB_cursor *mc, int num, MDB_page **mp)
 		if (oldest <= last) {
 			if (!found_old) {
 				oldest = mdb_find_oldest(txn);
+				/* LY: catch heisenbug. */
+				mdb_tassert(txn, oldest >= env->me_pgoldest);