
Re: [LMDB] What pointer is returned with combination of MDB_WRITEMAP and MDB_RESERVE?



By accident. Note that LMDB 1.0 supports page checksums, and what you're doing here
will break those checksums.

Will the checksums be mandatory or optional? For the InMemory/NoSync case they could add overhead, and you already have many options for trading safety against performance (WRITEMAP, NOMETASYNC, etc.).

In-place updates in write transactions are 2x faster and work from several threads: the write transaction is just a lock, and cumulative performance over N threads behaves the same as with a shared lock.

For some reason, in-place updates from read transactions work for the NoSync case (with 1M 16-byte values in my tests) and give another 2x in performance. Still, I understand how ludicrous such attempts are.

Basically you're abusing MDB_WRITEMAP, whose only purpose is to (potentially) optimize
normal write transactions. I've no interest in even speculating on intentional misuse
of the API.

I successfully use the overflow-pages-do-not-move hack and already rely on it (since it was confirmed in this thread that, at least for a particular version, large pages do not move by design). LMDB acts like a shared memory allocator/pool that moves buffers off-heap and makes them accessible from different processes and containers. Checksums will break this case as well, but I'm fine staying with 0.9x, because it's too cool to have a fast, granular, random-access 500G NVM memory pool with only 16G of RAM on the smallest AWS i3 instance, and in similar cases. Still, optional checksums would be nice; I already use checksums at the application level.


On Thu, Aug 16, 2018 at 7:45 PM, Howard Chu <hyc@symas.com> wrote:
Victor Baybekov wrote:
Hello,

Working with overflow pages directly via pointers outside write transactions works great and it helps that they do not move "by design" in current versions as discussed in this thread.

I have two related scenarios that will give a substantial performance boost in my case.

/The first one/ is updating a value in-place via a pointer from an aborted write transaction. If I

1) use MDB_WRITEMAP,
2) from a **write** transaction, find a record (which is small and not in an overflow page),
3) modify a part of its value (for duplicates this part is not used in the compare function) directly via the MDB_val data pointer (e.g. interlocked_increment or compare_and_swap),
4) and **abort** the transaction,

then readers see the updated value via normal read transactions later. Since I do the direct updates from inside a write transaction, all other writers should be blocked until I exit the transaction (abort in this case), and no pages should move since the transaction is aborted. Is this correct? Does this work "by design" or "by accident" currently?

By accident. Note that LMDB 1.0 supports page checksums, and what you're doing here
will break those checksums.
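
Why an in-place write breaks a page checksum can be shown with a small stand-alone sketch. It makes several assumptions: the page is a plain 4096-byte buffer, FNV-1a stands in for whatever checksum LMDB 1.0 actually uses, and all names (`DEMO_PAGE_SIZE`, `page_checksum`, `page_is_valid`, `inplace_update_breaks_checksum`) are hypothetical. The point is only that a checksum recorded at commit time no longer matches once bytes are patched through a raw pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_PAGE_SIZE 4096u  /* hypothetical page size */

/* FNV-1a, used here only as a stand-in checksum. */
static uint64_t page_checksum(const unsigned char *page, size_t len)
{
    assert(page != NULL);
    uint64_t h = 14695981039346656037ULL;  /* FNV offset basis */
    for (size_t i = 0; i < len; i++) {
        h ^= page[i];
        h *= 1099511628211ULL;             /* FNV prime */
    }
    return h;
}

/* Returns 1 if the stored checksum still matches the page contents. */
static int page_is_valid(const unsigned char *page, size_t len, uint64_t stored)
{
    return page_checksum(page, len) == stored;
}

/* The checksum is taken at "commit"; a later raw write invalidates it. */
static int inplace_update_breaks_checksum(void)
{
    static unsigned char page[DEMO_PAGE_SIZE];
    memset(page, 0, sizeof page);
    memcpy(page + 128, "counter=0001", 12);

    uint64_t stored = page_checksum(page, sizeof page);
    int valid_before = page_is_valid(page, sizeof page, stored);

    page[128 + 11] = '2';  /* in-place write that bypasses any commit */
    int valid_after = page_is_valid(page, sizeof page, stored);

    return valid_before && !valid_after;
}
```

Calling `inplace_update_breaks_checksum()` returns 1: the page validates right after the simulated commit and fails validation after the single-byte patch.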

/The second one/ is about updating values in-place from read transactions. If I

Updating any value in a read transaction is ludicrous.

Basically you're abusing MDB_WRITEMAP, whose only purpose is to (potentially) optimize
normal write transactions. I've no interest in even speculating on intentional misuse
of the API.


--
  -- Howard Chu
  CTO, Symas Corp.           http://www.symas.com
  Director, Highland Sun     http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/