Re: (ITS#7841) high disk utilization



leo@yuriev.ru wrote:
> The attached patch file is derived from OpenLDAP Software. All of the
> modifications to OpenLDAP Software represented in the following
> patch(es) were developed by Leonid Yuriev <leo@yuriev.ru>. I have not
> assigned rights and/or interest in this work to any party.
>
> The attached modifications to OpenLDAP Software are subject to the
> following notice:
>
> Copyright 2014 Leonid Yuriev.
> Copyright 2014 Peter-Service LLC, Moscow, Russia.
> Redistribution and use in source and binary forms, with or without
> modification, are permitted only as authorized by the OpenLDAP Public
> License.
>
> https://github.com/leo-yuriev/openldap-lmdb-challenge/pull/1
> or
> https://github.com/leo-yuriev/openldap-lmdb-challenge/ branch master-devel
>
> commit 841059330fd44769e93eb4b937c3ce42654fad6f
> Author: Leo Yuriev <leo@yuriev.ru>
> Date:   2014-09-20 07:16:15 +0400
>
>       BUGFIX - lmdb: lock meta-pages in writemap-mode to avoid unexpected write,
>                 before the data pages would be synchronized.
>
>       Without locking, the meta-pages may be written by the OS before other data;
>       in that case the database would be inconsistent.

Seems unnecessary. Won't happen by default; could happen with MDB_NOSYNC but 
that risk is already documented.
>
> commit 6240c3350e8bd86337c7e41722cf6a38881f15e7
> Author: Leo Yuriev <leo@yuriev.ru>
> Date:   2014-09-12 01:32:13 +0400
>
>       BUGFIX - lmdb: reordering of instructions which update the txn in
> a meta-page.
>
>       Without "volatile" or a memory barrier, the compiler may reorder the
>       instructions that update the "mm_txnid" field in a meta-page in "writemap" mode.
>
>       From the reader's point of view, this causes a short
>       time interval during which the transaction is corrupted.

OK.
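
To illustrate the ordering concern in that commit, here is a minimal sketch. It is not the patch itself: the struct and function names are hypothetical, and it swaps in a C11 release store where the patch talks about "volatile" or a memory barrier. The point is only that the transaction ID is published after the fields it describes.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for an LMDB meta page. */
typedef struct {
    size_t         root_page;  /* root of the main B-tree */
    size_t         last_page;  /* last used page number */
    _Atomic size_t txnid;      /* published last; readers key off this */
} sketch_meta;

/* Publish a new snapshot: payload fields first, transaction ID last. */
static void sketch_meta_publish(sketch_meta *m, size_t root,
                                size_t last, size_t txnid)
{
    m->root_page = root;
    m->last_page = last;
    /* Release store: the two stores above cannot be reordered after
     * this one, so a thread that acquire-loads txnid and sees the new
     * value also sees the new root_page and last_page. */
    atomic_store_explicit(&m->txnid, txnid, memory_order_release);
}
```

A reader would pair this with an acquire load of `txnid` before looking at the other fields.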
>
> commit accef62de7fe5660f870f4c5da319a2a8098b2fb
> Author: Leo Yuriev <leo@yuriev.ru>
> Date:   2014-09-21 02:29:50 +0400
>
>       BUGFIX - lmdb: 'volatile' to important fields, which
>                 may be updated by readers asynchronously.
>
>       Without 'volatile' the compiler may eliminate mdb_find_oldest() calls.

OK.
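
For context, a rough sketch of the kind of scan involved. The types and names are hypothetical, not LMDB's actual reader table. The fields are marked volatile so that the compiler must actually read them on each call rather than reuse a value it loaded earlier; volatile alone gives no inter-thread ordering guarantees.

```c
#include <stddef.h>

/* Hypothetical reader slot; both fields may be changed at any time
 * by another thread or process sharing the lock table. */
typedef struct {
    volatile size_t txnid;   /* snapshot the reader is using */
    volatile int    in_use;  /* nonzero while the slot is occupied */
} sketch_reader;

/* Return the oldest txnid still in use by any reader,
 * or `newest` if no slot is occupied. */
static size_t sketch_find_oldest(const sketch_reader *r, size_t n,
                                 size_t newest)
{
    size_t oldest = newest;
    for (size_t i = 0; i < n; i++) {
        if (r[i].in_use) {
            size_t id = r[i].txnid;  /* volatile read, never cached */
            if (id < oldest)
                oldest = id;
        }
    }
    return oldest;
}
```

The same scan shows why the smallest ID found is the "oldest" reader: pages freed by transactions newer than that ID cannot be reused yet.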
>
> commit bb83e03cf1b8bceee64550229c3becbdd5400680
> Author: Leo Yuriev <leo@yuriev.ru>
> Date:   2014-09-19 20:18:17 +0400
>
>       FEATURE - lmdb-backend: support config for 'lifo' and 'coalesce' envflags.
>
> commit 0c168d0e63ed78d13df3fc8a42f3667335678639
> Author: Leo Yuriev <leo@yuriev.ru>
> Date:   2014-09-20 10:13:28 +0400
>
>       FEATURE - lmdb: MDB_LIFORECLAIM & MDB_COALESCE modes.
>
>       Reclaim FreeDB in LIFO order - this is a main feature.
>       Also aim to coalesce small FreeDB records.

I will need to spend more time looking at this more closely.
>
> commit 8ddd63161aeb2689822d1a8d27385d62e4e341ae
> Author: Leo Yuriev <leo@yuriev.ru>
> Date:   2014-09-19 22:47:19 +0400
>
>       BUGFIX - lmdb: properly sync meta-pages in mdb_sync_env().
>
>       Meta-pages may be updated during data-syncing in mdb_sync_env(),
>       in this case database would be inconsistent.
>
>       Check-and-retry if lead txn-id changed during flushing data in
> mdb_sync_env().

This could probably be simplified: just obtain the write mutex unconditionally, 
and then there's no need to loop or retry. This also depends on MDB_NOLOCK: 
if that's set, do no locking at all.
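
Roughly what I have in mind, as a sketch only. The env struct, the flag constant and the flush helper here are hypothetical stand-ins, not the real lmdb internals:

```c
#include <pthread.h>

#define SKETCH_NOLOCK 0x1u  /* hypothetical; stands in for MDB_NOLOCK */

typedef struct {
    unsigned        flags;
    pthread_mutex_t wmutex;      /* stands in for the writer mutex */
    int             sync_count;  /* flushes performed, for illustration */
} sketch_env;

/* Placeholder for flushing the data pages and then the meta page. */
static int sketch_flush(sketch_env *env)
{
    env->sync_count++;
    return 0;
}

/* Sync with writers excluded, so the meta page cannot change while
 * the flush is in progress and no retry loop is needed. */
static int sketch_env_sync(sketch_env *env)
{
    int rc;

    if (env->flags & SKETCH_NOLOCK)
        return sketch_flush(env);  /* caller manages its own locking */

    rc = pthread_mutex_lock(&env->wmutex);
    if (rc != 0)
        return rc;
    rc = sketch_flush(env);
    pthread_mutex_unlock(&env->wmutex);
    return rc;
}
```

The cost is that a sync blocks writers for its duration, which is the tradeoff against the check-and-retry approach.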

> commit 908677f989588d06b9f00620576dea3c5c8675d7
> Author: Leo Yuriev <leo@yuriev.ru>
> Date:   2014-09-04 16:10:05 +0400
>
>       FEATURE - lmdb-backend: support for "checkpoint kbytes" config-option.

OK if the lmdb implementation is OK.
>
> commit 147f41a8110f28456bc32123bde86d47183f9c0a
> Author: Leo Yuriev <leo@yuriev.ru>
> Date:   2014-09-04 16:01:15 +0400
>
>       FEATURE - lmdb: implementation of "checkpoint kbytes".
>
>       Force flush when volume of the changes reached a configurable threshold.

Probably OK. Needs some typographical cleanup. Not sure "syncbytes" is a good 
name.
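
The behaviour described in that commit amounts to something like the following sketch. All names are hypothetical and this is not the patch's code; it only shows the accounting: accumulate the volume written since the last flush and force a flush once it reaches the configured threshold.

```c
#include <stddef.h>

typedef struct {
    size_t   threshold_bytes;  /* 0 disables the size-based checkpoint */
    size_t   unsynced_bytes;   /* bytes written since the last flush */
    unsigned flushes;          /* flush count, for illustration only */
} sketch_checkpoint;

/* Record a committed write of nbytes.
 * Returns 1 if this write triggered a flush, 0 otherwise. */
static int sketch_note_write(sketch_checkpoint *cp, size_t nbytes)
{
    cp->unsynced_bytes += nbytes;
    if (cp->threshold_bytes == 0 ||
        cp->unsynced_bytes < cp->threshold_bytes)
        return 0;

    /* A real implementation would force a sync of the environment
     * here, e.g. via mdb_env_sync(). */
    cp->flushes++;
    cp->unsynced_bytes = 0;
    return 1;
}
```

With a 4 KB threshold, for example, a 3000-byte commit does not flush, and a following 2000-byte commit does.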
>
> commit fb82a0b688f4c31313d0790415feda8aaa18651c
> Author: Leo Yuriev <leo@yuriev.ru>
> Date:   2014-09-04 15:18:16 +0400
>
>       CHANGE - lmdb-backend: checkpoint-interval in seconds instead of minutes.

Gratuitous change. We used minutes since the BDB backend uses minutes, and the 
intention was to maintain parallel functionality. What's the justification for 
this change?
>
> commit fc409d89e0d9dde20f612e34c2a463c8a81ea000
> Author: Leo Yuriev <leo@yuriev.ru>
> Date:   2014-09-20 06:51:04 +0400
>
>       EXTENSION - lmdb: more useful info from mdb_stat tool.

A bit ambiguous. me_tail_txnid is actually the ID of the oldest reader, not 
the "last" reader. I'm not convinced of the value of this patch, since you can 
already view the readers list.

> commit ccc7da690ffbff440643295b945fdf7886f48c97
> Author: Leo Yuriev <leo@yuriev.ru>
> Date:   2014-09-05 00:19:16 +0400
>
>       TRIVIA - lmdb: clean testdb-dir while "make test".

OK.


-- 
   -- Howard Chu
   CTO, Symas Corp.           http://www.symas.com
   Director, Highland Sun     http://highlandsun.com/hyc/
   Chief Architect, OpenLDAP  http://www.openldap.org/project/