
Re: LMDB sequential read from a ">>RAM" database caffe use case



Brian wrote:
Hi, all

I am a caffe user. In my use case, I am reading from a ~300GB lmdb
sequentially: each element is read once and not accessed again until I
have read every other element in the db and looped around. It seems that
lmdb will page cache every element. This becomes a problem as the dataset
is read in fairly fast and it takes 1 hour before nearly 60% of the RAM
is devoted to my lmdb page caches. Then the system runs out of unmapped
memory to use, so it starts kicking out the page frames of other users'
processes, many of which have not been accessed in the last hour. So the
system will prefer to kick out those page frames instead of page frames
mapped at the beginning of the run. This behavior is entirely
understandable, but it causes extremely severe thrashing and an
unresponsive system, as those page frames of other users come back into
use very soon. Does my diagnosis of the situation seem reasonable?

If you're on Linux, you need to set /proc/sys/vm/swappiness to zero.
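
For reference, a minimal sketch of checking the current value from a script before changing it. The helper name read_swappiness is hypothetical; the only thing taken from the advice above is the /proc path. Changing the value is an administrator action (for example, writing to the same file as root, or `sysctl vm.swappiness=0`) and is deliberately left out of the sketch.

```python
from pathlib import Path

# Path named in the reply above; present on Linux only.
SWAPPINESS_PATH = "/proc/sys/vm/swappiness"


def read_swappiness(path=SWAPPINESS_PATH):
    """Return the integer stored in `path`, or None if it cannot be read.

    `path` is a parameter only so the helper can be exercised against an
    ordinary file on systems that have no /proc.
    """
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None


if __name__ == "__main__":
    print("vm.swappiness =", read_swappiness())
```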

As many caffe use cases involve a similar sequential read of a dataset
much greater than available RAM, many other caffe users besides me
report a similar issue. They all report their system becoming
unresponsive, presumably due to the same thrashing.

"training is freezing for multiple hours"
https://github.com/BVLC/caffe/issues/1412

Nothing here seems relevant to LMDB.

"Caffe memory increases with time(iterations?)"
https://github.com/BVLC/caffe/issues/1377

The poster says his problem occurred with both LMDB and LevelDB.

"Random freezes"
https://github.com/BVLC/caffe/issues/1807

Comment says the same problem occurred with LevelDB.

All of those issues appear to be Caffe-specific, not LMDB-specific.

Is there some option that I missed that can inform lmdb that a certain
read-only transaction is going to be purely sequential, so it shouldn't
bother to cache the already read elements? If not, is there a plan to
include such a feature?

Not at present.

Or is there an option with which I can limit the maximum memory a single
lmdb transaction is going to use for caching?

No, nor will there ever be.

Or is there some other possible solution to this problem?

Read up on how to tune your OS's memory subsystem. There will never be any cache-tuning options in LMDB itself. LMDB relies entirely on the OS cache and it's your responsibility to know how to configure your OS.

-----

I have been using a hack based on this fork

https://github.com/raingo/lmdb-fork/commit/091ff5e8be35c2f2336e37c0db4c392fa9c0bdcf

to avoid this issue. However, I would love to know if there is any less
hacky way to solve this problem.

That's ridiculously bad. Using MAP_PRIVATE means LMDB pages will be backed by swap space - this will consume double the resources that it would normally use. There's a reason we only use MAP_SHARED.
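
To illustrate the distinction, here is a small sketch using Python's mmap module on a scratch file. It is not LMDB code, and the helper name write_through_mapping is made up for the example. A page modified through a MAP_PRIVATE mapping becomes a copy-on-write copy that never reaches the file, so the kernel can only evict it to swap; a page in a MAP_SHARED mapping stays backed by the file itself. Unix only.

```python
import mmap
import os
import tempfile


def write_through_mapping(flags):
    """Write b"X" at offset 0 through a mapping created with `flags`,
    then return the first byte of the file as read back with read()."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"a" * mmap.PAGESIZE)
        m = mmap.mmap(fd, mmap.PAGESIZE, flags=flags,
                      prot=mmap.PROT_READ | mmap.PROT_WRITE)
        m[0:1] = b"X"           # dirties the first page of the mapping
        if flags == mmap.MAP_SHARED:
            m.flush()           # msync: push the shared page to the file
        m.close()
        with open(path, "rb") as f:
            return f.read(1)
    finally:
        os.close(fd)
        os.unlink(path)


if __name__ == "__main__":
    # Shared mapping: the write is visible in the file.
    print("MAP_SHARED :", write_through_mapping(mmap.MAP_SHARED))
    # Private mapping: the write stayed in an anonymous copy of the page.
    print("MAP_PRIVATE:", write_through_mapping(mmap.MAP_PRIVATE))
```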


-----

I have seen this thread

http://www.openldap.org/lists/openldap-devel/201502/msg00052.html

but it looks like it is for multiple readers.

It would apply to the single-reader case as well. The main point of that thread is to tell the kernel to use a larger readahead value. Again, the free-behind behavior is automatic if your OS is properly tuned, and the kernel's default read-ahead is already 64KB so I don't see much benefit there.
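
As a rough sketch of what "telling the kernel" can look like from an application, the example below maps a file read-only and applies madvise(MADV_SEQUENTIAL) before scanning it. This is an assumption for illustration only: it is not necessarily what the linked thread proposes, it is not an LMDB option, and the helper name sequential_checksum is hypothetical. On Linux the hint asks for more aggressive readahead and allows pages to be freed soon after they are accessed; whether it makes any measurable difference for this workload is untested here. It needs Python 3.8+ for mmap.madvise and silently skips the hint where it is unavailable.

```python
import mmap
import os


def sequential_checksum(path, chunk=1 << 20):
    """Sum every byte of `path` through a read-only shared mapping,
    hinting to the kernel that access will be sequential."""
    size = os.path.getsize(path)
    if size == 0:
        return 0  # an empty file cannot be mapped
    with open(path, "rb") as f:
        m = mmap.mmap(f.fileno(), 0, flags=mmap.MAP_SHARED,
                      prot=mmap.PROT_READ)
        try:
            # The hint is advisory; skip it on platforms that lack it.
            if hasattr(m, "madvise") and hasattr(mmap, "MADV_SEQUENTIAL"):
                m.madvise(mmap.MADV_SEQUENTIAL)
            total = 0
            for off in range(0, size, chunk):
                total += sum(m[off:off + chunk])
            return total
        finally:
            m.close()
```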

-----

Any advice on this will be much appreciated!


--
  -- Howard Chu
  CTO, Symas Corp.           http://www.symas.com
  Director, Highland Sun     http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/