
slapd hangs at 100% cpu in sched_yield (ITS#2030)



Full_Name: Steven Wilton
Version: 2.1.3
OS: Linux (Debian 3.0)
URL: ftp://ftp.openldap.org/incoming/
Submission from: (NULL) (203.24.100.137)


This is a fix for the problem where slapd hangs at 100% CPU, with the busy
thread calling sched_yield() continuously.  I ran slapd with the "-d 1" flag
and got the following output:

=> bdb_back_search
bdb(o=EFTEL): Lock table is out of available locker entries
bdb_dn2entry_rw("o=eftel")
=> bdb_dn2id_matched( "o=eftel" )
====> bdb_cache_find_entry_dn2id("o=eftel"): 1 (1 tries)
bdb(o=EFTEL): Locker does not exist
====> bdb_cache_find_entry_id( 1 ): 1 (busy) 2
locker = 1429
bdb(o=EFTEL): Locker does not exist
====> bdb_cache_find_entry_id( 1 ): 1 (busy) 2
locker = 1429


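The last three lines then repeat endlessly.  The problem is that the locker was
never allocated in the first place: under load, the initial lock_id call fails
("Lock table is out of available locker entries"), so every later lock attempt
fails with "Locker does not exist".  I came up with the following patch, which
retries the locker allocation, yielding the thread between attempts, until it
succeeds.  It seems to have fixed the problem (I can't get slapd to hang any
more under the same load).  I had thought about inserting a small (100ms) sleep
before the sched_yield, but am not sure what the most portable way of doing
this is.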
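
On the question of a short sleep between retries, here is a minimal sketch of one possible approach, assuming a POSIX system where select() with no file descriptors can be used as a sub-second delay. Everything prefixed sketch_ is hypothetical and not part of OpenLDAP or Berkeley DB; the callback type only stands in for a lock_id-style call that returns 0 on success. Unlike the patch, the loop is bounded, so a permanently full locker table returns an error instead of spinning.

```c
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>

/* Hypothetical helper: sleep for roughly `ms` milliseconds.
 * Assumes POSIX select(); with no descriptors it only waits for the timeout. */
static void sketch_sleep_ms(long ms)
{
    struct timeval tv;
    tv.tv_sec = ms / 1000;
    tv.tv_usec = (ms % 1000) * 1000;
    (void) select(0, NULL, NULL, NULL, &tv);
}

/* Hypothetical stand-in for a lock_id-style allocator: returns 0 on success,
 * non-zero on failure, and stores the new locker id in *locker. */
typedef int (*sketch_alloc_fn)(void *env, unsigned int *locker);

/* Try the allocation up to max_tries times, sleeping delay_ms between
 * failed attempts. Returns 0 on success or the last error code. */
static int sketch_lock_id_retry(sketch_alloc_fn alloc, void *env,
                                unsigned int *locker,
                                int max_tries, long delay_ms)
{
    int rc = -1;
    int i;

    for (i = 0; i < max_tries; i++) {
        rc = alloc(env, locker);
        if (rc == 0)
            break;                      /* locker allocated */
        if (i + 1 < max_tries)
            sketch_sleep_ms(delay_ms);  /* back off before retrying */
    }
    return rc;
}

/* Demonstration allocator: `env` points to an int counting how many
 * failures to simulate before succeeding. */
static int sketch_fake_alloc(void *env, unsigned int *locker)
{
    int *failures_left = env;

    if (*failures_left > 0) {
        (*failures_left)--;
        return 12;                      /* arbitrary non-zero error code */
    }
    *locker = 1429;                     /* locker value seen in the log above */
    return 0;
}
```

With the demonstration allocator set to fail twice, sketch_lock_id_retry(sketch_fake_alloc, &failures, &locker, 5, 100) succeeds on the third attempt after two 100ms pauses. The retry limit is a design choice for this sketch; whether slapd should give up or keep waiting is a separate question.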

--- openldap-2.1.3/servers/slapd/back-bdb/back-bdb.h.orig       Mon Aug 19 13:01:27 2002
+++ openldap-2.1.3/servers/slapd/back-bdb/back-bdb.h    Mon Aug 19 13:01:56 2002
@@ -153,7 +153,7 @@
 #define TXN_COMMIT(txn,f)                      txn_commit((txn), (f))
 #define        TXN_ABORT(txn)                          txn_abort((txn))
 #define TXN_ID(txn)                                    txn_id(txn)
-#define LOCK_ID(env, locker)           lock_id(env, locker)
+#define LOCK_ID(env, locker)           while(lock_id(env, locker)) {ldap_pvt_thread_yield();}
 #define LOCK_ID_FREE(env, locker)      lock_id_free(env, locker)
 #else
 #define LOCK_DETECT(env,f,t,a)         (env)->lock_detect(env, f, t, a)
@@ -165,7 +165,7 @@
 #define TXN_COMMIT(txn,f)                      (txn)->commit((txn), (f))
 #define TXN_ABORT(txn)                         (txn)->abort((txn))
 #define TXN_ID(txn)                                    (txn)->id(txn)
-#define LOCK_ID(env, locker)           (env)->lock_id(env, locker)
+#define LOCK_ID(env, locker)           while((env)->lock_id(env, locker)) {ldap_pvt_thread_yield();}
 #define LOCK_ID_FREE(env, locker)      (env)->lock_id_free(env, locker)
 #endif