Re: (ITS#5664) Deadlocks when writing in parallell (two processes)



tom.bjorkholm@aastra.com wrote:
> Full_Name: Stelios Grigoriadis & Tom Björkholm
> Version: 2.3.39
> OS: Novell SLES 10
> URL: ftp://ftp.openldap.org/incoming/
> Submission from: (NULL) (194.237.142.7)
>
>
> We get a lot of DB_LOCK_DEADLOCK when using client programs that for a period of
> time continuously writes to OpenLDAP.
> Version is 2.3.39.
>
> The information added is of the form:
> ebcmdCustomer=0+ebcmdDir=220xx,ou=AuthCodes,ebcmdVersion=0,ebcmdProduct=ebcmd,dc=example,dc=com
> where xx varies.
>
> Snippet of the output:
> Mar 27 13:03:21 ldapt1 slapd[7589]: => bdb_dn2id_add: subtree
> (ebcmdCustomer=0+ebcmdDir=22037,ou=authcodes,ebcmdVersion=0,ebcmdProduct=ebcmd,dc=example,dc=com)
> put failed: -30995
> Mar 27 13:03:26 ldapt1 slapd[7589]: => bdb_idl_insert_key: c_put id failed:
> DB_LOCK_DEADLOCK: Locker killed to resolve a deadlock (-30995)
> Mar 27 13:03:26 ldapt1 slapd[7589]: => bdb_dn2id_add: parent
> (ou=authcodes,ebcmdVersion=0,ebcmdProduct=ebcmd,dc=example,dc=com) insert
> failed: -30995
> Mar 27 13:03:28 ldapt1 slapd[7589]: => bdb_idl_insert_key: c_put id failed:
> DB_LOCK_DEADLOCK: Locker killed to resolve a deadlock (-30995)
> Mar 27 13:03:28 ldapt1 slapd[7589]: => bdb_dn2id_add: parent
> (ou=authcodes,ebcmdVersion=0,ebcmdProduct=ebcmd,dc=example,dc=com) insert
> failed: -30995
> Mar 27 13:03:36 ldapt1 slapd[7589]: => bdb_idl_insert_key: c_put id failed:
> DB_LOCK_DEADLOCK: Locker killed to resolve a deadlock (-30995)
> Mar 27 13:03:36 ldapt1 slapd[7589]: => bdb_dn2id_add: parent
> (ou=authcodes,ebcmdVersion=0,ebcmdProduct=ebcmd,dc=example,dc=com) insert
> failed: -30995
> Mar 27 13:03:38 ldapt1 slapd[7589]: => bdb_idl_insert_key: c_put id failed:
> DB_LOCK_DEADLOCK: Locker killed to resolve a deadlock (-30995) 
>
>
>   

We have worked around the problem by adding a static mutex that is
taken before any update operation (add, delete, modify, modrdn). This
effectively serializes those operations within slapd. It is only
intended as a temporary solution; we hope the issue will be resolved
in a future release.
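
To illustrate the idea outside of slapd, here is a minimal standalone
sketch of the same pattern using plain pthreads. It is not the patch
itself: the request tags, handle_request(), run_demo() and the
entries_written counter are hypothetical stand-ins for the LDAP_REQ_*
tags and the real backend write. Only update requests take the
process-wide mutex; reads pass through without it.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical request tags; slapd uses the LDAP_REQ_* constants instead. */
enum req_tag { REQ_SEARCH, REQ_ADD, REQ_DELETE, REQ_MODIFY, REQ_MODDN };

#define MAX_THREADS 16

/* One process-wide mutex shared by every update operation. */
static pthread_mutex_t upd_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Stand-in for backend state that concurrent writers contend on. */
static long entries_written = 0;

/* Returns 1 for requests that modify data, 0 otherwise. */
static int is_update(enum req_tag tag)
{
    switch (tag) {
    case REQ_ADD:
    case REQ_DELETE:
    case REQ_MODIFY:
    case REQ_MODDN:
        return 1;
    default:
        return 0;
    }
}

static void handle_request(enum req_tag tag)
{
    int locked = 0;

    /* Take the global mutex only for updates and remember that we did. */
    if (is_update(tag)) {
        pthread_mutex_lock(&upd_mutex);
        locked = 1;
    }

    if (is_update(tag))
        entries_written++;  /* placeholder for the real backend write */

    /* Release on the way out, but only if this request took the lock. */
    if (locked)
        pthread_mutex_unlock(&upd_mutex);
}

struct worker_arg { int count; };

static void *worker(void *p)
{
    const struct worker_arg *arg = p;
    for (int i = 0; i < arg->count; i++) {
        handle_request(REQ_ADD);     /* serialized */
        handle_request(REQ_SEARCH);  /* not serialized */
    }
    return NULL;
}

/* Runs nthreads writers in parallel and returns the number of writes. */
long run_demo(int nthreads, int per_thread)
{
    pthread_t tids[MAX_THREADS];
    struct worker_arg arg = { per_thread };
    int started = 0;

    assert(nthreads > 0 && nthreads <= MAX_THREADS);
    entries_written = 0;
    for (int i = 0; i < nthreads; i++) {
        if (pthread_create(&tids[started], NULL, worker, &arg) != 0)
            break;
        started++;
    }
    for (int i = 0; i < started; i++)
        pthread_join(tids[i], NULL);
    return entries_written;
}
```

Calling run_demo() from a main() and linking with -pthread is enough to
try it. The cost of this approach is that all writers queue behind a
single lock, which is why we regard it as a stopgap only.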

The patch:

# This patch file is derived from OpenLDAP Software. All of the
# modifications to OpenLDAP Software represented in the following
# patch(es) were developed by Stelios Grigoriadis
# stelios.xx.grigoriadis@ericsson.com.
# These modifications are not subject to any license of Ericsson AB.

# I, Stelios Grigoriadis, hereby place the following modifications to
# OpenLDAP Software (and only these modifications) into the public
# domain. Hence, these modifications may be freely used and/or
# redistributed for any purpose with or without attribution and/or
# other notice.

# Bug Fix - This patch fixes the bug ITS#5133.
# The fix works as follows. A periodic check (called do_mastercheck) is
# added to the runqueue. The interval is determined by an optional
# slapd.conf parameter (mastercheckint) in the syncrepl section. If it
# is not specified, the check is not inserted in the runqueue.

--- servers/slapd/connection.c    2007-06-15 01:49:38.000000000 +0200
+++ connection.c    2008-06-10 16:30:08.000000000 +0200
@@ -1052,19 +1052,24 @@
     /* FIXME: returns 0 in case of failure */
     ldap_pvt_mp_add_ulong(slap_counters.sc_ops_initiated, 1);
     ldap_pvt_thread_mutex_unlock( &slap_counters.sc_ops_mutex );
+    static pthread_mutex_t op_upd_mutex = PTHREAD_MUTEX_INITIALIZER;
 
+    int upd_tag=0;
     op->o_threadctx = ctx;
 #ifdef LDAP_DEVEL
     op->o_tid = ldap_pvt_thread_pool_tid( ctx );
 #endif /* LDAP_DEVEL */
 
     switch ( tag ) {
-    case LDAP_REQ_BIND:
-    case LDAP_REQ_UNBIND:
     case LDAP_REQ_ADD:
     case LDAP_REQ_DELETE:
     case LDAP_REQ_MODDN:
     case LDAP_REQ_MODIFY:
+        ldap_pvt_thread_mutex_lock( &op_upd_mutex );
+        upd_tag=1;
+        break;
+    case LDAP_REQ_BIND:
+    case LDAP_REQ_UNBIND:
     case LDAP_REQ_COMPARE:
     case LDAP_REQ_SEARCH:
     case LDAP_REQ_ABANDON:
@@ -1178,6 +1183,10 @@
 
     connection_resched( conn );
     ldap_pvt_thread_mutex_unlock( &conn->c_mutex );
+    if (upd_tag) {
+        ldap_pvt_thread_mutex_unlock( &op_upd_mutex );
+    }
+
     return NULL;
 }