Issue 7974 - LDBM's "laggard reader" flaw still present, in continue of ITS#7904
Status: UNCONFIRMED
Alias: None
Product: LMDB
Classification: Unclassified
Component: liblmdb
Version: unspecified
Hardware: All All
Importance: --- normal
Target Milestone: ---
Assignee: OpenLDAP project
URL:
Keywords:
Duplicates: 7830
Depends on:
Blocks:
 
Reported: 2014-10-23 04:24 UTC by Leonid Yuriev
Modified: 2020-03-20 18:45 UTC (History)

See Also:


Attachments
0001-lmdb-ITS-7974-oomkiller-feature.patch (4.89 KB, patch)
2014-10-23 05:13 UTC, Leonid Yuriev
0002-slapd-ITS-7974-oomkiller-feature.patch (3.92 KB, patch)
2014-10-23 05:13 UTC, Leonid Yuriev
0001-lmdb-ITS-7974-a-reading-lag-for-dreamcatcher.patch (2.08 KB, patch)
2014-10-23 05:26 UTC, Leonid Yuriev
0002-slapd-ITS-7974-dreamcatcher-feature.patch (5.41 KB, patch)
2014-10-23 05:26 UTC, Leonid Yuriev

Description Leonid Yuriev 2014-10-23 04:24:12 UTC
Full_Name: Leonid Yuriev
Version: 2.4.40
OS: RHEL7
URL: ftp://ftp.openldap.org/incoming/
Submission from: (NULL) (31.130.36.33)


Currently there is a flaw that prevents using OpenLDAP + LMDB in projects
with a high rate of updates (add/modify/delete). The root of the problem is
that LMDB cannot reclaim freed pages in the presence of a "laggard reader", in
other words while they are still referenced by an active read transaction.

It should be noted that, while reclaiming is withheld, a high update rate
burns through free pages very quickly. The fix for ITS#7904 significantly
improves the situation, but does not solve the problem completely.
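
To make the reclaim rule concrete, here is a minimal toy model in C. It is not
LMDB code: the toy_* names and the array-based reader table are assumptions
made only for illustration. It assumes that a page freed by write txn N may be
reused only once N is older than the oldest snapshot any reader still holds.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model, not LMDB code: each slot holds the snapshot (txn id) a reader
 * is pinned to, or 0 when the slot is idle. */
typedef struct {
    size_t txnid;
} toy_reader;

/* Oldest snapshot still in use; falls back to last_txn when nobody reads. */
static size_t toy_oldest_reader(const toy_reader *readers, size_t n,
                                size_t last_txn)
{
    size_t oldest = last_txn;
    for (size_t i = 0; i < n; i++) {
        if (readers[i].txnid != 0 && readers[i].txnid < oldest)
            oldest = readers[i].txnid;
    }
    return oldest;
}

/* freed_by[i] is the id of the write txn that freed page i.
 * Counts the pages that were freed before the oldest live snapshot. */
static size_t toy_reclaimable(const size_t *freed_by, size_t npages,
                              const toy_reader *readers, size_t nreaders,
                              size_t last_txn)
{
    assert(freed_by != NULL || npages == 0);
    size_t oldest = toy_oldest_reader(readers, nreaders, last_txn);
    size_t count = 0;
    for (size_t i = 0; i < npages; i++) {
        if (freed_by[i] < oldest)
            count++;
    }
    return count;
}
```

In this model a single reader pinned at txn 5 keeps every page freed by txn 5
or later out of reach, no matter how many updates have been committed since.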

First, a seemingly innocuous command such as "mdb_stat -efff | less" can
lead to MDB_MAP_FULL and paralyze updates.

Second, ITS#7904 affects syncrepl only partially. Approximately half of the
"long read" operations occur without sending data to the network. Therefore,
in many cases MDB_MAP_FULL is still reached easily enough. This leads to a
chain of problems and in some cases makes replication impossible.

To solve these problems, I made two simple improvements.

1) OOMKiller feature - just a fuse, like the Linux kernel oomkiller.

In general, in case of MDB_MAP_FULL it sends SIGKILL to the "laggard
reader", but never to itself. On success it retries the reclaim and continues.
Enabled by "envflags oomkill".

2) Dreamcatcher feature - really, it has caught our nightmares with
syncrepl & MDB_MAP_FULL and made them vanish ;)

Based on the ITS#7904 fix. In general, it renews the read txn when its lag
behind the last txn is greater than a configured threshold and the percentage
of pages allocated is greater than the configured value. Enabled by
"dreamcatcher lag percentage".
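
The Dreamcatcher rule in item 2 can be sketched as a pure predicate. This is a
minimal sketch and not the code from the attached patch: dc_state and
dc_should_renew are hypothetical names, and the comparison is assumed to be
strict ("greater than") on both thresholds.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical inputs. In real code they could plausibly be derived from
 * mdb_env_info() (me_last_txnid, me_last_pgno, me_mapsize) and
 * mdb_env_stat() (ms_psize); that wiring is an assumption and is not shown. */
typedef struct {
    size_t reader_txnid;  /* snapshot the long-running read is pinned to */
    size_t last_txnid;    /* most recently committed txn */
    size_t used_pages;    /* pages allocated so far */
    size_t max_pages;     /* map size divided by page size */
} dc_state;

/* Returns 1 when both the lag and the fill percentage exceed their limits,
 * i.e. when the reader should give up its snapshot and renew. */
static int dc_should_renew(const dc_state *s, size_t lag_limit,
                           unsigned percent_limit)
{
    assert(s != NULL && s->max_pages > 0);
    size_t lag = s->last_txnid > s->reader_txnid
                     ? s->last_txnid - s->reader_txnid
                     : 0;
    double percent = 100.0 * (double)s->used_pages / (double)s->max_pages;
    return lag > lag_limit && percent > (double)percent_limit;
}
```

A long-running read would evaluate such a predicate at safe points and, when
it returns 1, call mdb_txn_reset() followed by mdb_txn_renew() before resuming.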

Two patchsets will be attached soon.
Comment 1 Leonid Yuriev 2014-10-23 05:13:21 UTC
The attached files is derived from OpenLDAP Software. All of the modifications
to OpenLDAP Software represented in the following patch(es) were developed by
Peter-Service LLC, Moscow, Russia. Peter-Service LLC has not assigned rights
and/or interest in this work to any party. I, Leonid Yuriev am authorized by
Peter-Service LLC, my employer, to release this work under the following terms.

Peter-Service LLC hereby places the following modifications to OpenLDAP Software
(and only these modifications) into the public domain. Hence, these
modifications may be freely used and/or redistributed for any purpose with or
without attribution and/or other notice.

Comment 2 Leonid Yuriev 2014-10-23 05:26:05 UTC
The attached files is derived from OpenLDAP Software. All of the modifications
to OpenLDAP Software represented in the following patch(es) were developed by
Peter-Service LLC, Moscow, Russia. Peter-Service LLC has not assigned rights
and/or interest in this work to any party. I, Leonid Yuriev am authorized by
Peter-Service LLC, my employer, to release this work under the following terms.

Peter-Service LLC hereby places the following modifications to OpenLDAP Software
(and only these modifications) into the public domain. Hence, these
modifications may be freely used and/or redistributed for any purpose with or
without attribution and/or other notice.


Comment 3 Leonid Yuriev 2014-10-23 05:42:26 UTC
I assume ITS#7830 is the same issue.

Comment 4 Hallvard Furuseth 2014-12-02 11:20:04 UTC
On 10/23/2014 07:13 AM, leo@yuriev.ru wrote:
> Subject: [PATCH 1/2] lmdb: ITS#7974 oomkiller feature.
> (...)
> +typedef int (MDB_oomkiller_func)(MDB_env *env, int pid, void* thread_id, size_t txn);

Some thoughts about this:

Instead of trusting the return value, it seems safer to re-check
with mdb_reader_pid().  Like mdb_reader_check0() does.  Maybe
except on Windows, where file locks from dead processes may
linger for a while until the OS reclaims them.

Don't call it OOMkiller just because that's how you use it.
Others might do something else, like sending a reader a signal
which it interprets as "please wake up and finish your txn".
Or it might decide this process is the one which should give up.
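
A minimal sketch of the two points above, under stated assumptions. The
prototype is the one quoted from the patch. Everything else is hypothetical:
reader_is_gone() uses kill(pid, 0) as a stand-in for the lock-file check that
mdb_reader_pid() performs inside mdb.c, nudge_reader() is an example handler
that asks instead of killing, and the meaning of the handler's return value is
not defined by anything in this thread.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

typedef struct MDB_env MDB_env;  /* opaque stand-in; lmdb.h is not needed here */

/* Prototype quoted from the patch under review. */
typedef int (MDB_oomkiller_func)(MDB_env *env, int pid, void *thread_id,
                                 size_t txn);

/* Stand-in liveness probe: kill(pid, 0) delivers no signal and only reports
 * whether the process exists. */
static int reader_is_gone(int pid)
{
    assert(pid > 0);
    return kill((pid_t)pid, 0) == -1 && errno == ESRCH;
}

/* Example handler: a polite nudge rather than SIGKILL. It refuses to signal
 * its own process, so the caller has to give up in that case. */
static int nudge_reader(MDB_env *env, int pid, void *thread_id, size_t txn)
{
    (void)env; (void)thread_id; (void)txn;
    if (pid == (int)getpid())
        return -1;
    return kill((pid_t)pid, SIGUSR1);
}

/* Writer side: run the handler, then decide from a fresh check of the reader
 * instead of trusting what the handler returned. */
static int slot_can_be_cleared(MDB_oomkiller_func *handler, MDB_env *env,
                               int pid, void *thread_id, size_t txn)
{
    (void)handler(env, pid, thread_id, txn);
    return reader_is_gone(pid);
}
```

The reader would need its own SIGUSR1 handler that finishes or resets the read
txn; without one, the default action for SIGUSR1 terminates the process.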

This feature could make it interesting to let readers and writers
tell each other things: Reserve some unused space in the reader
table slots for stuff the reader's caller could put there, and
some space for an impatient writer to leave a note.  Could go
in an independent commit if there is any demand for it though.

-- 
Hallvard

Comment 5 Leonid Yuriev 2014-12-04 10:31:06 UTC
Hallvard, thanks for your comments.

2014-12-02 14:20 GMT+03:00 Hallvard Breien Furuseth <h.b.furuseth@usit.uio.no>:
> On 10/23/2014 07:13 AM, leo@yuriev.ru wrote:
>>
>> Subject: [PATCH 1/2] lmdb: ITS#7974 oomkiller feature.
>> (...)
>> +typedef int (MDB_oomkiller_func)(MDB_env *env, int pid, void* thread_id,
>> size_t txn);
>
>
> Some thoughts about this:
>
> Instead of trusting the return value, it seems safer to re-check
> with mdb_reader_pid().  Like mdb_reader_check0() does.  Maybe
> except on Windows, where file locks from dead processes may
> linger for a while until the OS reclaims them.

I agree that using mdb_reader_pid() is a better way.

> Don't call it OOMkiller just because that's how you use it.
> Others might do something else, like sending a reader a signal
> which it interprets as "please wake up and finish your txn".
> Or it might decide this process is the one which should give up.

Could you suggest a name other than "oomkiller"?
Note that the "dreamcatcher" feature has a critical bug, which I
found and fixed while working on ITS#7968 & ITS#7987.
We are currently testing the new code heavily.
So I plan to update both patches in about a week.

> This feature could make it interesting to let readers and writers
> tell each other things: Reserve some unused space in the reader
> table slots for stuff the reader's caller could put there, and
> some space for an impatient writer to leave a note.  Could go
> in an independent commit if there is any demand for it though.

Communication between readers and writers may be interesting, but I
think it is over-engineering in the LMDB context.
IMHO LMDB's code has a lot of technical debt, so it would be more
useful to re-implement all of it from scratch, under the rules of a
perfectly clean code style.
Maybe I will do this, but on the basis of 1Hippeus and after its release;
it is an extreme-performance engine for zero-copy messaging in
shared memory, partially like Intel DPDK.

Leonid.

Comment 6 Hallvard Furuseth 2014-12-04 16:03:32 UTC
On 12/04/2014 11:31 AM, leo@yuriev.ru wrote:
> Could you suggest something other instead of "oomkiller"?

Don't have a particularly good idea.  oom_func, maybe.

>> This feature could make it interesting to let readers and writers
>> tell each other things: Reserve some unused space in the reader
>> table slots for stuff the reader's caller could put there, and
>> some space for an impatient writer to leave a note.  Could go
>> in an independent commit if there is any demand for it though.
>
> Communications between readers and writers may be interesting, but I
> think it is over-engineering in the LMDB context.

Yes... I guess I was thinking mostly of the prototype, in case we
want to add something like it later.  Might be useful to add a void*
argument which would be NULL now but could be used later, if needed.
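
As an illustration only, such a prototype might look like the sketch below.
The name MDB_oom_func follows the "oom_func" suggestion above; the parameter
name "reserved" and the example handler are hypothetical, and nothing here is
taken from the attached patches.

```c
#include <assert.h>
#include <stddef.h>

typedef struct MDB_env MDB_env;  /* opaque stand-in; lmdb.h is not needed here */

/* Hypothetical shape: the original four arguments plus a trailing pointer
 * that is always NULL for now and is kept for later extensions. */
typedef int (MDB_oom_func)(MDB_env *env, int pid, void *thread_id,
                           size_t txn, void *reserved);

/* Example handler that only counts how often it was invoked. */
static int count_calls(MDB_env *env, int pid, void *thread_id,
                       size_t txn, void *reserved)
{
    static int calls;
    (void)env; (void)pid; (void)thread_id; (void)txn;
    assert(reserved == NULL);  /* nothing is defined for this argument yet */
    return ++calls;
}
```

Handlers written against this shape would keep compiling if the library later
starts passing something through the last argument.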

-- 
Hallvard

Comment 7 Quanah Gibson-Mount 2020-03-20 18:44:14 UTC
*** Issue 7830 has been marked as a duplicate of this issue. ***