
Re: zero copy jni lib for lmdb



Kristoffer Sjögren wrote:
Hi

I'm experimenting a bit with developing a JNI library for LMDB that utilizes
zero-copy buffers. As you may know, there is already a project called lmdbjni [1],
but it only does buffer copies.

I must admit that my C and JNI skills are not really that great, so please
forgive any stupidity on my part.

Can't offer much advice on Java or JNI. But your test doesn't really leverage LMDB's zero-copy writes. To do that you have to call mdb_put with the MDB_RESERVE write flag and then generate the output value directly into the buffer it returns. Otherwise there's still a copy from the mdb_put argument into the actual DB.

I don't know what other buffer copies occur before you finally reach mdb_put, but I don't see why you need to do anything special to pass a user value into mdb_put. Zero-copy really only has significant benefit for readers, and that's where you have to play games with non-GC'd pointers.

Perhaps other Java users on this list can offer more advice.

Anyway, I'm taking the sun.misc.Unsafe (no bounds checking) approach to
memory allocation and passing memory addresses through JNI to LMDB for both
writes and reads. At the moment I have implemented put, get, begin and commit [2].

I suspect that something is wrong because the performance difference between
buffer copy and zero copy isn't really that big. lmdbjni is even faster sometimes;
write commits specifically are almost twice as fast.

The test writes 1M entries, each with a 4-byte key (1-1M) and a 128-byte value
(random), committed in one go. LMDB is configured identically for both
implementations, with a 4GB map size and MDB_WRITEMAP. I run Linux 3.2.0 on an
Intel(R) Core(TM)2 Quad CPU Q6600  @ 2.40GHz.

JProfiler shows no signs of bottlenecks in Java in my implementation; most time
is spent in the native methods put and commit. The opposite is true for
lmdbjni, where most time is spent creating and writing Java byte buffers, while
native put and commit are just a fraction of that time.

Not sure exactly how significant the gcc compiler flags are, but here is how I build it.

gcc -g -Wall -O2 -Wbad-function-cast -Wno-write-strings -fPIC -shared
-I$JAVA_HOME/include -I$JAVA_HOME/include/linux -Isrc/main/native
src/main/native/jlmdb.c src/main/native/liblmdb.a -o target/linux64/libjlmdb.so

- What could be the reason for "slow(er)" commits?
- How much faster can I expect properly implemented zero copying to be?
- Maybe lmdbjni has de facto standard optimizations that I, as a C newbie,
might have overlooked?
- Are there any performance counters, tracepoints or similar that might be of
interest for finding where the time is spent?

Grateful for any tips or pointers on how to track the problem down.

Cheers,
-Kristoffer

[1] https://github.com/chirino/lmdbjni

[2] JNI

MDB_env *mdb_env;
MDB_dbi dbi;

JNIEXPORT jlong JNICALL Java_NativeLmdb_put (JNIEnv * env, jobject obj, jlong
tx, jlong keyAddress, jlong keySize, jlong valAddress, jlong valSize) {
     MDB_val mdb_key, mdb_val;

     mdb_key.mv_data = (void *)(intptr_t) keyAddress;
     mdb_key.mv_size = (size_t) keySize;

     mdb_val.mv_data = (void *)(intptr_t) valAddress;
     mdb_val.mv_size = (size_t) valSize;
     int rc = mdb_put((MDB_txn *) (intptr_t) tx, dbi, &mdb_key, &mdb_val, 0);
     return rc;
}

JNIEXPORT jlong JNICALL Java_NativeLmdb_get (JNIEnv *env, jobject o, jlong tx,
jlong a, jlong s) {
     MDB_val mdb_key, mdb_val;
     mdb_key.mv_data = (void *)(intptr_t) a;
     mdb_key.mv_size = (size_t) s;
     int rc = mdb_get((MDB_txn *) (intptr_t) tx, dbi, &mdb_key, &mdb_val);
     if (rc == 0) {
         return (intptr_t) mdb_val.mv_data;
     }
     return -1;
}

JNIEXPORT void JNICALL
Java_org_deephacks_lmdb_NativeLmdb_mdb_1txn_1begin(JNIEnv *env, jobject obj,
jlongArray array) {
      jlong *nArray = (*env)->GetLongArrayElements(env, array, NULL);
      MDB_txn *txn;
      mdb_txn_begin(mdb_env, NULL, 0, &txn);
      nArray[0] = (jlong) txn;
      (*env)->ReleaseLongArrayElements(env, array, nArray, 0);
}

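JNIEXPORT void JNICALL Java_NativeLmdb_mdb_1txn_1begin(JNIEnv *env, jobject
obj, jlongArray tx) {
      /* Get the elements of the Java long[] so the handle can be stored in it. */
      jlong *nArray = (*env)->GetLongArrayElements(env, tx, NULL);
      MDB_txn *txn;
      mdb_txn_begin(mdb_env, NULL, 0, &txn);
      /* Write through the element pointer; a jlongArray cannot be indexed directly. */
      nArray[0] = (jlong) (intptr_t) txn;
      /* Mode 0 copies the contents back if needed and frees the buffer. */
      (*env)->ReleaseLongArrayElements(env, tx, nArray, 0);
}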

JNIEXPORT jint JNICALL Java_NativeLmdb_mdb_1txn_1commit(JNIEnv *env, jobject
obj, jlong tx) {
     return (jint)mdb_txn_commit((MDB_txn *)(intptr_t)tx);
}

JNIEXPORT void JNICALL Java_NativeLmdb_open (JNIEnv * env, jobject obj) {
     mdb_env_create(&mdb_env);
     mdb_env_set_mapsize(mdb_env, 4294967296);
     mdb_env_open(mdb_env, "/tmp/testdb",  MDB_WRITEMAP, 0664);
     MDB_txn *txn;
     mdb_txn_begin(mdb_env, NULL, 0, &txn);
     mdb_open(txn, NULL, 0, &dbi);
     mdb_txn_commit(txn);
}





--
  -- Howard Chu
  CTO, Symas Corp.           http://www.symas.com
  Director, Highland Sun     http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/