OpenLDAP
Up to top level
Build   Contrib   Development   Documentation   Historical   Incoming   Software Bugs   Software Enhancements   Web  

Logged in as guest

Viewing Software Bugs/7969
Full headers

From: leo@yuriev.ru
Subject: LMDB: Globally shared fields of meta-data are not 'volatile'.
Compose comment
Download message
State:
0 replies:
13 followups: 1 2 3 4 5 6 7 8 9 10 11 12 13

Major security issue: yes  no

Notes:

Notification:


Date: Fri, 17 Oct 2014 22:19:21 +0000
From: leo@yuriev.ru
To: openldap-its@OpenLDAP.org
Subject: LMDB: Globally shared fields of meta-data are not 'volatile'.
Full_Name: Leonid Yuriev
Version: 2.4.40
OS: RHEL7
URL: ftp://ftp.openldap.org/incoming/
Submission from: (NULL) (31.130.36.33)


Several fields from the meta-data that shared globally across active writer and
all readers are not 'volatile'. This allow compiler to cache its values in
registers and reorder writes.

More over, in some cases compiler is free to eliminate the calls of function
which seems to be 'const' or 'pure', pls see http://lwn.net/Articles/285332/. 

Thereby a series of  Heisenbugs may be induced, for instance:

1) The mdb_find_oldest() calls may be potentially eliminated up to public (not
static) functions due to interprocedural optimization 'unit at once'.

2) Reordering of CPU instructions which updates the txn in a meta-page. Without
"volatile" or memory-barrier compiler may reorder instructions for update the
"mm_txnid" field in meta-page in "writemap" mode.
    
Thereby,  from the reader's point of view this cause a short time interval when
the transaction is corrupted.

On some architectures few actions are required for cache coherence, for instance
- GCC's __synchronize() must be called.

3) A really critical bugs will be described later in separate ITS.

Followup 1

Download message
Date: Sat, 18 Oct 2014 02:22:31 +0400
Subject: Re: (ITS#7969) LMDB: Globally shared fields of meta-data are not 'volatile'.
From: =?UTF-8?B?0JvQtdC+0L3QuNC0INCu0YDRjNC10LI=?= <leo@yuriev.ru>
To: openldap-its@openldap.org
The attached files is derived from OpenLDAP Software. All of the modifications
to OpenLDAP Software represented in the following patch(es) were developed by
Peter-Service LLC, Moscow, Russia. Peter-Service LLC has not assigned rights
and/or interest in this work to any party. I, Leonid Yuriev am authorized by
Peter-Service LLC, my employer, to release this work under the following terms.

Peter-Service LLC hereby places the following modifications to OpenLDAP Software
(and only these modifications) into the public domain. Hence, these
modifications may be freely used and/or redistributed for any purpose with or
without attribution and/or other notice.


commit 5e83ea8a76f81dac8d5a9dcd74d5b10a01330102
Author: Leo Yuriev <leo@yuriev.ru>
Date:   2014-09-12 01:32:13 +0400

    ITS#7969 for LMDB: volatile & __synchronize().

diff --git a/libraries/liblmdb/mdb.c b/libraries/liblmdb/mdb.c
index 6cc3433..3286ffb 100644
--- a/libraries/liblmdb/mdb.c
+++ b/libraries/liblmdb/mdb.c
@@ -580,11 +580,11 @@ typedef struct MDB_rxbody {
  * started from so we can avoid overwriting any data used in that
  * particular version.
  */
- txnid_t mrb_txnid;
+ volatile txnid_t mrb_txnid;
  /** The process ID of the process owning this reader txn. */
- MDB_PID_T mrb_pid;
+ volatile MDB_PID_T mrb_pid;
  /** The thread ID of the thread owning this txn. */
- MDB_THR_T mrb_tid;
+ volatile MDB_THR_T mrb_tid;
 } MDB_rxbody;

  /** The actual reader record, with cacheline padding. */
@@ -632,12 +632,12 @@ typedef struct MDB_txbody {
  * This is recorded here only for convenience; the value can always
  * be determined by reading the main database meta pages.
  */
- txnid_t mtb_txnid;
+ volatile txnid_t mtb_txnid;
  /** The number of slots that have been used in the reader table.
  * This always records the maximum count, it is not decremented
  * when readers release their slots.
  */
- unsigned mtb_numreaders;
+ volatile unsigned mtb_numreaders;
 } MDB_txbody;

  /** The actual reader table definition. */
@@ -908,7 +908,7 @@ typedef struct MDB_meta {
  /** Any persistent environment flags. @ref mdb_env */
 #define mm_flags mm_dbs[0].md_flags
  pgno_t mm_last_pg; /**< last used page in file */
- txnid_t mm_txnid; /**< txnid that committed this page */
+ volatile txnid_t mm_txnid; /**< txnid that committed this page */
 } MDB_meta;

  /** Buffer for a stack-allocated meta page.
@@ -3559,6 +3559,10 @@ mdb_env_write_meta(MDB_txn *txn)
  mp->mm_dbs[0] = txn->mt_dbs[0];
  mp->mm_dbs[1] = txn->mt_dbs[1];
  mp->mm_last_pg = txn->mt_next_pgno - 1;
+#if !(defined(_MSC_VER) || defined(__i386__) || defined(__x86_64__))
+ /* LY: issue a memory barrier, if not x86. */
+ __sync_synchronize();
+#endif
  mp->mm_txnid = txn->mt_txnid;
  if (!(env->me_flags & (MDB_NOMETASYNC|MDB_NOSYNC))) {
  unsigned meta_size = env->me_psize;



Followup 2

Download message
Date: Sat, 18 Oct 2014 02:51:36 +0400
Subject: Re: (ITS#7969) LMDB: Globally shared fields of meta-data are not 'volatile'.
From: =?UTF-8?B?0JvQtdC+0L3QuNC0INCu0YDRjNC10LI=?= <leo@yuriev.ru>
To: openldap-its@openldap.org
The attached files is derived from OpenLDAP Software. All of the modifications
to OpenLDAP Software represented in the following patch(es) were developed by
Peter-Service LLC, Moscow, Russia. Peter-Service LLC has not assigned rights
and/or interest in this work to any party. I, Leonid Yuriev am authorized by
Peter-Service LLC, my employer, to release this work under the following terms.

Peter-Service LLC hereby places the following modifications to OpenLDAP Software
(and only these modifications) into the public domain. Hence, these
modifications may be freely used and/or redistributed for any purpose with or
without attribution and/or other notice.

https://gist.github.com/leo-yuriev/dcedd5e5b2a09c605108



Followup 3

Download message
Date: Sat, 18 Oct 2014 07:13:38 +0100
From: Howard Chu <hyc@symas.com>
To: leo@yuriev.ru, openldap-its@OpenLDAP.org
Subject: Re: (ITS#7969) LMDB: Globally shared fields of meta-data are not
 'volatile'.
leo@yuriev.ru wrote:
> The attached files is derived from OpenLDAP Software. All of the
modifications
> to OpenLDAP Software represented in the following patch(es) were developed
by
> Peter-Service LLC, Moscow, Russia. Peter-Service LLC has not assigned
rights
> and/or interest in this work to any party. I, Leonid Yuriev am authorized
by
> Peter-Service LLC, my employer, to release this work under the following
terms.
>
> Peter-Service LLC hereby places the following modifications to OpenLDAP
Software
> (and only these modifications) into the public domain. Hence, these
> modifications may be freely used and/or redistributed for any purpose with
or
> without attribution and/or other notice.

Committed to mdb.master. The whitespace is wrong in all of the patches you 
submitted, had to apply them manually.

>
> commit 5e83ea8a76f81dac8d5a9dcd74d5b10a01330102
> Author: Leo Yuriev <leo@yuriev.ru>
> Date:   2014-09-12 01:32:13 +0400
>
>      ITS#7969 for LMDB: volatile & __synchronize().
>
> diff --git a/libraries/liblmdb/mdb.c b/libraries/liblmdb/mdb.c
> index 6cc3433..3286ffb 100644
> --- a/libraries/liblmdb/mdb.c
> +++ b/libraries/liblmdb/mdb.c
> @@ -580,11 +580,11 @@ typedef struct MDB_rxbody {
>    * started from so we can avoid overwriting any data used in that
>    * particular version.
>    */
> - txnid_t mrb_txnid;
> + volatile txnid_t mrb_txnid;
>    /** The process ID of the process owning this reader txn. */
> - MDB_PID_T mrb_pid;
> + volatile MDB_PID_T mrb_pid;
>    /** The thread ID of the thread owning this txn. */
> - MDB_THR_T mrb_tid;
> + volatile MDB_THR_T mrb_tid;
>   } MDB_rxbody;
>
>    /** The actual reader record, with cacheline padding. */
> @@ -632,12 +632,12 @@ typedef struct MDB_txbody {
>    * This is recorded here only for convenience; the value can always
>    * be determined by reading the main database meta pages.
>    */
> - txnid_t mtb_txnid;
> + volatile txnid_t mtb_txnid;
>    /** The number of slots that have been used in the reader table.
>    * This always records the maximum count, it is not decremented
>    * when readers release their slots.
>    */
> - unsigned mtb_numreaders;
> + volatile unsigned mtb_numreaders;
>   } MDB_txbody;
>
>    /** The actual reader table definition. */
> @@ -908,7 +908,7 @@ typedef struct MDB_meta {
>    /** Any persistent environment flags. @ref mdb_env */
>   #define mm_flags mm_dbs[0].md_flags
>    pgno_t mm_last_pg; /**< last used page in file */
> - txnid_t mm_txnid; /**< txnid that committed this page */
> + volatile txnid_t mm_txnid; /**< txnid that committed this page */
>   } MDB_meta;
>
>    /** Buffer for a stack-allocated meta page.
> @@ -3559,6 +3559,10 @@ mdb_env_write_meta(MDB_txn *txn)
>    mp->mm_dbs[0] = txn->mt_dbs[0];
>    mp->mm_dbs[1] = txn->mt_dbs[1];
>    mp->mm_last_pg = txn->mt_next_pgno - 1;
> +#if !(defined(_MSC_VER) || defined(__i386__) || defined(__x86_64__))
> + /* LY: issue a memory barrier, if not x86. */
> + __sync_synchronize();
> +#endif
>    mp->mm_txnid = txn->mt_txnid;
>    if (!(env->me_flags & (MDB_NOMETASYNC|MDB_NOSYNC))) {
>    unsigned meta_size = env->me_psize;
>
>
>
>


-- 
   -- Howard Chu
   CTO, Symas Corp.           http://www.symas.com
   Director, Highland Sun     http://highlandsun.com/hyc/
   Chief Architect, OpenLDAP  http://www.openldap.org/project/



Followup 4

Download message
Date: Sat, 18 Oct 2014 12:11:27 +0400
Subject: Re: (ITS#7969) LMDB: Globally shared fields of meta-data are not 'volatile'.
From: =?UTF-8?B?0JvQtdC+0L3QuNC0INCu0YDRjNC10LI=?= <leo@yuriev.ru>
To: Howard Chu <hyc@symas.com>
Cc: openldap-its@openldap.org
Hi, Howard.

I also noticed a problem with the whilespace, so that is also posted
patches on github's gist and gave references in each of three ITS.

https://gist.github.com/leo-yuriev/dcedd5e5b2a09c605108
https://gist.github.com/leo-yuriev/fc00313b8035cddba6a7
https://gist.github.com/leo-yuriev/6f1e1444c9e29bd94885

>
> Committed to mdb.master. The whitespace is wrong in all of the patches you
> submitted, had to apply them manually.
>
> --
>   -- Howard Chu
>   CTO, Symas Corp.           http://www.symas.com
>   Director, Highland Sun     http://highlandsun.com/hyc/
>   Chief Architect, OpenLDAP  http://www.openldap.org/project/



Followup 5

Download message
Date: Sat, 15 Nov 2014 22:04:27 +0100
From: Hallvard Breien Furuseth <h.b.furuseth@usit.uio.no>
To: leo@yuriev.ru
CC: openldap-its@OpenLDAP.org
Subject: Re: (ITS#7969) LMDB: Globally shared fields of meta-data are not
 'volatile'.
__sync_synchronize() needs an #ifdef around it.
#ifdef __GNUC__, or something more general?

Not sure what to do for the #else.  We could mostly
avoid using mm_txnid when the lockfile is initialized
and use mti_txnid instead (e.g. in pick_meta).  And
penalize MDB_NOLOCK with some sync call in the #else.

-- 
Hallvard




Followup 6

Download message
Date: Mon, 12 Jan 2015 22:15:06 +0400
Subject: Re: (ITS#7969) LMDB: Globally shared fields of meta-data are not 'volatile'.
From: =?UTF-8?B?0JvQtdC+0L3QuNC0INCu0YDRjNC10LI=?= <leo@yuriev.ru>
To: Hallvard Breien Furuseth <h.b.furuseth@usit.uio.no>
Cc: openldap-its@openldap.org
2014-11-16 0:04 GMT+03:00 Hallvard Breien Furuseth
<h.b.furuseth@usit.uio.no>:
> __sync_synchronize() needs an #ifdef around it.
> #ifdef __GNUC__, or something more general?
>
> Not sure what to do for the #else.  We could mostly
> avoid using mm_txnid when the lockfile is initialized
> and use mti_txnid instead (e.g. in pick_meta).  And
> penalize MDB_NOLOCK with some sync call in the #else.
>
> --
> Hallvard
>

Unfortunately IMDB programmed without accounting possibility of memory
reordering and cache incoherency (as a proof - the 'volatile' was
missed before ITS#7969).
So, currently seems LMDB is durable only on x86 and may be on MIPS.

I think memory-ordering (e.g. barriers) functions should be defined in
separate h-file, which will select implementation depends on cpu-arch
& compiler.
see http://en.wikipedia.org/wiki/Memory_ordering

The minimum seems to be required a memory_barrier_coherence() - to
make cache coherence and suppress dangerously load/store reordering,
e.g. flush/invalidate cache on some cpu-arch, but do nothing on x86
and so on.
Then __sync_synchronize() and MIPS's cache invalidate should be
replaced with memory_barrier_coherence().

But instead the original version recreate cleanly on the basis of
1Hippeus engine (manager of shared memory and queues for zero-copy
IPC, google it).

Leonid



Followup 7

Download message
Date: Tue, 13 Jan 2015 13:44:36 +0400
Subject: Re: (ITS#7969) LMDB: Globally shared fields of meta-data are not 'volatile'.
From: =?UTF-8?B?0JvQtdC+0L3QuNC0INCu0YDRjNC10LI=?= <leo@yuriev.ru>
To: Hallvard Breien Furuseth <h.b.furuseth@usit.uio.no>
Cc: openldap-its@openldap.org
2014-11-16 0:04 GMT+03:00 Hallvard Breien Furuseth
<h.b.furuseth@usit.uio.no>:
> __sync_synchronize() needs an #ifdef around it.
> #ifdef __GNUC__, or something more general?

I have implemented 'portable' memory barriers for a set of compilers.
https://github.com/ReOpen/ReOpenLDAP/commit/a1ec3a4a8290c84846b4e98ff3abf38839381866
Feedback welcomed.

Leonid.



Followup 8

Download message
Date: Tue, 13 Jan 2015 11:50:34 +0100
From: Hallvard Breien Furuseth <h.b.furuseth@usit.uio.no>
To: leo@yuriev.ru
CC: openldap-its@OpenLDAP.org
Subject: Re: (ITS#7969) LMDB: Globally shared fields of meta-data are not
 'volatile'.
On 01/12/2015 07:15 PM, leo@yuriev.ru wrote:
> Unfortunately IMDB programmed without accounting possibility of memory
> reordering and cache incoherency (as a proof - the 'volatile' was
> missed before ITS#7969).
> So, currently seems LMDB is durable only on x86 and may be on MIPS.

I gather "volatile" in thread code can be read as "warning: dubious
code" or at best "please access this sooner rather than later".

comp.programming.threads's advise is that volatile is useless for thread
sync:  You need sync primitives, and then you don't need volatile since
the compiler/hardware may not move the memory access past the primitive.

Unless the primitive's spec says so, I guess.  E.g. with an "asm"
statement where the compiler does not know what the assembly code does
and must be told not to mess with ordering.

The best fix is likely a format change which reduces the need for sync
primitives, for LDMBv2.  Checksum the MDB_meta, and/or replace the
variable part with a single word which refers to the meta.
Don't know if we can get rid of unportable sync code though.

Leaving the rest to Howard, I'm too busy just now to do more than blow
hot air:-)

-- 
Hallvard



Followup 9

Download message
Date: Tue, 13 Jan 2015 20:53:56 +0400
Subject: Re: (ITS#7969) LMDB: Globally shared fields of meta-data are not 'volatile'.
From: =?UTF-8?B?0JvQtdC+0L3QuNC0INCu0YDRjNC10LI=?= <leo@yuriev.ru>
To: Hallvard Breien Furuseth <h.b.furuseth@usit.uio.no>
Cc: openldap-its@openldap.org
2015-01-13 13:50 GMT+03:00 Hallvard Breien Furuseth
<h.b.furuseth@usit.uio.no>:
> On 01/12/2015 07:15 PM, leo@yuriev.ru wrote:
>>
>> Unfortunately IMDB programmed without accounting possibility of memory
>> reordering and cache incoherency (as a proof - the 'volatile' was
>> missed before ITS#7969).
>> So, currently seems LMDB is durable only on x86 and may be on MIPS.
>
>
> I gather "volatile" in thread code can be read as "warning: dubious
> code" or at best "please access this sooner rather than later".
>
> comp.programming.threads's advise is that volatile is useless for thread
> sync:  You need sync primitives, and then you don't need volatile since
> the compiler/hardware may not move the memory access past the primitive.

Excuse me, but seems you are misunderstand comp.programming.threads in
case of blocking/locking multithreading and non-blocking
multithreading (aka lock-free/wait-free).

Yes, `volatile` don't needed, when a locking (e.g. mutex) is used for
access to any variables that shared between threads.

But LMDB don't uses locking primitives for interaction between writers
and readers.
LMDB properties declared as:
- readers don't block writers and writers don't block readers;
- read transactions are extremely cheap, and can be performed using no
mallocs or any other _blocking_ calls;

THEREFORE 'volatile' and memory-ordering ISSUES MUST BE HANDLED VERY
CAREFULLY AND ANXIOUSLY.

> Unless the primitive's spec says so, I guess.  E.g. with an "asm"
> statement where the compiler does not know what the assembly code does
> and must be told not to mess with ordering.

Please study GCC's manual about __asm __volalile (:::"memory"), this
is just a "full compiler fence".

> The best fix is likely a format change which reduces the need for sync
> primitives, for LDMBv2.  Checksum the MDB_meta, and/or replace the
> variable part with a single word which refers to the meta.
> Don't know if we can get rid of unportable sync code though.
Yes it is possible, but requires C++, see
http://en.cppreference.com/w/cpp/atomic/memory_order
By using just C, include C99 - not possible, see
http://stackoverflow.com/questions/27301371/memory-order-consistency-model-and-c99

> Leaving the rest to Howard, I'm too busy just now to do more than blow
> hot air:-)

Ok ;)

I created gist for public review, please share it.
https://gist.github.com/leo-yuriev/ba186a6bf5cf3a27bae7

Leonid.



Followup 10

Download message
Date: Tue, 13 Jan 2015 19:23:33 +0100
From: Hallvard Breien Furuseth <h.b.furuseth@usit.uio.no>
To: leo@yuriev.ru
CC: openldap-its@OpenLDAP.org
Subject: Re: (ITS#7969) LMDB: Globally shared fields of meta-data are not
 'volatile'.
On 13/01/15 17:54, leo@yuriev.ru wrote:
> 2015-01-13 13:50 GMT+03:00 Hallvard Breien Furuseth
<h.b.furuseth@usit.uio.no>:
>> comp.programming.threads's advise is that volatile is useless for
thread
>> sync:  You need sync primitives, and then you don't need volatile since
>> the compiler/hardware may not move the memory access past the
primitive.
>
> Excuse me, but seems you are misunderstand comp.programming.threads in
> case of blocking/locking multithreading and non-blocking
> multithreading (aka lock-free/wait-free).

Yes, that didn't come out right.  I don't mind inserting the
volatile, but I don't know that it helps either.  As far as I
understand if it was broken without volatile, then it's still
broken with it - just hopefully less likely to break.  And LMDB
couldn't be releaseed with your original unportable sync code
along with the volatile.

> (...)
>> Unless the primitive's spec says so, I guess.  E.g. with an "asm"
>> statement where the compiler does not know what the assembly code does
>> and must be told not to mess with ordering.
>
> Please study GCC's manual about __asm __volalile (:::"memory"), this
> is just a "full compiler fence".

Thanks, looks good.  Will read the rest later.

-- 
Hallvard




Followup 11

Download message
Date: Wed, 14 Jan 2015 06:49:39 +0100
From: Hallvard Breien Furuseth <h.b.furuseth@usit.uio.no>
To: leo@yuriev.ru
CC: openldap-its@OpenLDAP.org
Subject: Re: (ITS#7969) LMDB: Globally shared fields of meta-data are not
 'volatile'.
On 13/01/15 19:23, Hallvard Breien Furuseth wrote:
> Yes, that didn't come out right.  I don't mind inserting the
> volatile, but I don't know that it helps either.  As far as I
> understand if it was broken without volatile, then it's still
> broken with it - just hopefully less likely to break.  And LMDB
> couldn't be releaseed with your original unportable sync code
> along with the volatile.

Sorry, nevermind.  Of course when the writer does sync well enough,
volatile + the txn_renew loop will have to do for a sync primitive in
the reader.  I suppose this requires that sync in the writer thread
will shake other threads as well, it won't be private to the writer.

-- 
Hallvard




Followup 12

Download message
Date: Wed, 14 Jan 2015 23:31:30 +0400
Subject: Re: (ITS#7969) LMDB: Globally shared fields of meta-data are not 'volatile'.
From: =?UTF-8?B?0JvQtdC+0L3QuNC0INCu0YDRjNC10LI=?= <leo@yuriev.ru>
To: Hallvard Breien Furuseth <h.b.furuseth@usit.uio.no>
Cc: openldap-its@openldap.org, Howard Chu <hyc@openldap.org>
2015-01-14 8:49 GMT+03:00 Hallvard Breien Furuseth
<h.b.furuseth@usit.uio.no>:
> On 13/01/15 19:23, Hallvard Breien Furuseth wrote:
>>
>> Yes, that didn't come out right.  I don't mind inserting the
>> volatile, but I don't know that it helps either.  As far as I
>> understand if it was broken without volatile, then it's still
>> broken with it - just hopefully less likely to break.  And LMDB
>> couldn't be releaseed with your original unportable sync code
>> along with the volatile.
>
>
> Sorry, nevermind.  Of course when the writer does sync well enough,
> volatile + the txn_renew loop will have to do for a sync primitive in
> the reader.

> I suppose this requires that sync in the writer thread
> will shake other threads as well, it won't be private to the writer.

In general this is wrong, more precisely depend on 'volatile' on
shared variables and usage of barriers/fences by the readers.

Sync-ops due lock/release a mutex by writer issue a memory-barrier for
its own thread.
With this compiler must update all modified variables, which shaded in
the CPU registers.
Next a hardware write-barrier (aka release) in the mutex-release code
enforce all changes to be visible for other threads (e.g. flush the
cache).
But 'be visible' here mean 'publish' and other threads can access
these changes, but if want this.

In general, to see changes made by the writer, all other threads
should issue a read-barrier (aka acquire).
On most arches such barrier just inform compiler that memory was
changed and variables which cached in the registers must be reloaded.
But in some cases (like Itanium) this barrier will be taken in account
for instruction scheduling.
For 'volatile' compiler should generate barriers each time on read or
write such variables.

More general, memory-barriers are very important to HPC, distributed
computing and for super-computers.
For example read-barriers may pull changes from internode-bus or other
nodes, and write-barriers - publish the local changes.

So, the one way to avoid a race bugs - thinking in terms of
publish/pull changes.



Followup 13

Download message
Date: Fri, 16 Jan 2015 13:26:52 +0000
From: Howard Chu <hyc@openldap.org>
To: =?UTF-8?B?0JvQtdC+0L3QuNC0INCu0YDRjNC10LI=?= <leo@yuriev.ru>,
 Hallvard Breien Furuseth <h.b.furuseth@usit.uio.no>
CC: openldap-its@openldap.org,
 "OpenLDAP-devel@openldap.org" <OpenLDAP-devel@openldap.org>
Subject: Re: (ITS#7969) LMDB: Globally shared fields of meta-data are
 not 'volatile'.
This discussion should be moved to the openldap-devel list as there's 
more being discussed now than the patch itself.

............ .......... wrote:
> 2015-01-14 8:49 GMT+03:00 Hallvard Breien Furuseth
<h.b.furuseth@usit.uio.no>:
>> On 13/01/15 19:23, Hallvard Breien Furuseth wrote:
>>>
>>> Yes, that didn't come out right.  I don't mind inserting the
>>> volatile, but I don't know that it helps either.  As far as I
>>> understand if it was broken without volatile, then it's still
>>> broken with it - just hopefully less likely to break.  And LMDB
>>> couldn't be releaseed with your original unportable sync code
>>> along with the volatile.
>>
>>
>> Sorry, nevermind.  Of course when the writer does sync well enough,
>> volatile + the txn_renew loop will have to do for a sync primitive in
>> the reader.
>
>> I suppose this requires that sync in the writer thread
>> will shake other threads as well, it won't be private to the writer.
>
> In general this is wrong, more precisely depend on 'volatile' on
> shared variables and usage of barriers/fences by the readers.

Actually 'volatile' is meaningless at the hardware level. It only serves 
to prevent the compiler from reordering or eliminating accesses. In the 
case of LMDB the compiler cannot eliminate accesses because they are to 
global memory, and the compiler cannot determine anything about the 
liveness or side-effects of global memory accesses.

> Sync-ops due lock/release a mutex by writer issue a memory-barrier for
> its own thread.
> With this compiler must update all modified variables, which shaded in
> the CPU registers.
> Next a hardware write-barrier (aka release) in the mutex-release code
> enforce all changes to be visible for other threads (e.g. flush the
> cache).
> But 'be visible' here mean 'publish' and other threads can access
> these changes, but if want this.

> In general, to see changes made by the writer, all other threads
> should issue a read-barrier (aka acquire).
> On most arches such barrier just inform compiler that memory was
> changed and variables which cached in the registers must be reloaded.
> But in some cases (like Itanium) this barrier will be taken in account
> for instruction scheduling.
> For 'volatile' compiler should generate barriers each time on read or
> write such variables.

No. volatile doesn't generate hardware barriers.

> More general, memory-barriers are very important to HPC, distributed
> computing and for super-computers.
> For example read-barriers may pull changes from internode-bus or other
> nodes, and write-barriers - publish the local changes.

> So, the one way to avoid a race bugs - thinking in terms of
> publish/pull changes.

"race bug" by definition occurs when multiple writers may modify a 
particular memory object, causing its value to be indeterminate to an 
observer. By definition, no such bugs can occur in LMDB because only 
single writers can ever modify any memory object.

In the case of arbitrary readers viewing writes, these are the possible 
cases:
    1) reader is on same CPU as writer, writes cached
       there is no issue, the reader sees what the writer wrote.
    2) reader is on same CPU as writer, writes not cached
       there is no issue, the reader must fetch data from global memory
    3) reader is on different CPU, writes not cached
       there is no issue, the reader's CPU must fetch the data from 
global memory - same as (2)
    4) reader is on different CPU, writes are cached
       the reader may see the cached/stale data, or the CPU may fetch 
the new data from global memory

Only case (4) has any ambiguity, and LMDB's reader table is specifically 
designed not to care about the ambiguity. I.e., whether fresh or stale 
data is seen is irrelevant, LMDB will operate correctly because it does 
not fresh data. Correct processing of the reader table only depends on 
the oldest data in the table, so staleness is an asset here.

Correct processing of changes to the meta page requires exclusion from 
other writers, so the write mutex is used. This also guarantees that all 
changes to the meta page are flushed to global memory before the next 
writer begins.

In ITS#7970 you discuss a "heisenbug" which can theoretically occur if 
multiple write txns complete in the span of 2 memory accesses by a 
reader. In practice such a bug cannot occur because the act of 
completing a write txn involves multiple blocking calls (I/O system 
calls, mutex acquisition/release, etc.) which will force writers to 
pause relative to any readers in question. In contrast the reader 
performs no blocking calls at all, and in the window of vulnerability it 
is performing only 2 single-word memory accesses.

Demonstrating the bug by manually inserting yield()s only proves the 
point - without those 

Message of length 5974 truncated

Up to top level
Build   Contrib   Development   Documentation   Historical   Incoming   Software Bugs   Software Enhancements   Web  

Logged in as guest


The OpenLDAP Issue Tracking System uses a hacked version of JitterBug

______________
© Copyright 2013, OpenLDAP Foundation, info@OpenLDAP.org