
Re: PATCH for openldap-2.3.1alpha: fix slapd hangs problem under syncrepl



While you are, of course, welcome to discuss patches on
appropriate lists before submission, please note that
only patches submitted through the Issue Tracking System
will be considered for inclusion.  See
http://www.openldap.org/its/ and http://www.openldap.org/devel/contributing.html for details.

Thanks, Kurt

At 09:36 AM 3/11/2005, Hai Zhao wrote:
>  I am not familiar with the OpenLDAP source, and I hope I describe the
>problem & my patch clearly, ;)
>
>Problem: 
>  I use openldap-2.3.1alpha on two machines, one as master provider
>and the other as syncrepl consumer. The syncrepl type is
>refreshAndPersist. The provider slapd runs at
>ldap://192.168.0.218:9014/. The db backend is bdb.
>  When the connection between the consumer and provider was
>established, I tried to use ldappasswd to modify one user's password.
>But ldappasswd hung with no response, and the slapd process on the
>provider used 100% CPU.
>  
>Cause:
>  I read through the code and found that one thread of slapd looped at
>servers/slapd/overlays/syncprov.c:line 1467-1471, in function
>syncprov_op_mod(), which made the slapd service unavailable.
>  
>                mt = avl_find( si->si_mods, &mtdummy, sp_avl_cmp );
>                if ( mt ) {
>                        ldap_pvt_thread_mutex_lock( &mt->mt_mutex );
>                        ldap_pvt_thread_mutex_unlock( &si->si_mods_mutex );
>                        mt->mt_tail->mi_next = mi;
>                        mt->mt_tail = mi;
>                        /* wait for this op to get to head of list */
>line 1467:              while ( mt->mt_mods != mi ) {
>                                ldap_pvt_thread_mutex_unlock( &mt->mt_mutex );
>                                ldap_pvt_thread_yield();
>                                ldap_pvt_thread_mutex_lock( &mt->mt_mutex );
>line 1471:              }
>                        ldap_pvt_thread_mutex_unlock( &mt->mt_mutex );
>                } else {  
>  
>  The problem starts in over_op_func() in servers/slapd/backover.c,
>called from passwd_extop() in servers/slapd/passwd.c:line 163. When
>the operation comes from ldappasswd, backend extensions are given a
>chance to handle the operation themselves. But the bdb backend doesn't
>handle password modification, so passwd_extop() finally does it. In
>this procedure, syncprov_op_mod() was called twice, once by
>over_op_func(), and once by passwd_extop():line 221, the following line:
>                                rs->sr_err = op2.o_bd->be_modify( &op2, rs );
>  So, in the two calls of syncprov_op_mod(), two 'mi' were appended to
>si->si_mods. The first 'mi' was never removed, so the second call
>looped forever at line 1467-1471 in servers/slapd/overlays/syncprov.c.
>
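
The wait-for-head behaviour described above can be modelled outside
slapd. The following is only a minimal sketch under stated
assumptions: the modinst and modtarget types are cut-down,
hypothetical stand-ins for the real syncprov structures, the helper
names are invented for illustration, and locking is omitted. It shows
that a second entry queued behind a stale first entry never becomes
the head, which is the condition the loop at line 1467 waits on.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the syncprov structures. */
typedef struct modinst {
    struct modinst *mi_next;
} modinst;

typedef struct modtarget {
    modinst *mt_mods;   /* head: the op currently allowed to run */
    modinst *mt_tail;   /* last queued op */
} modtarget;

/* Queue an op for this target, as each syncprov_op_mod() call does. */
static void mt_append(modtarget *mt, modinst *mi)
{
    mi->mi_next = NULL;
    if (mt->mt_mods == NULL) {
        mt->mt_mods = mt->mt_tail = mi;
    } else {
        mt->mt_tail->mi_next = mi;
        mt->mt_tail = mi;
    }
}

/* Drop the head entry: the job the cleanup callback is expected to do. */
static void mt_cleanup_head(modtarget *mt)
{
    if (mt->mt_mods != NULL) {
        mt->mt_mods = mt->mt_mods->mi_next;
        if (mt->mt_mods == NULL) {
            mt->mt_tail = NULL;
        }
    }
}

/* The wait loop can exit only when this returns nonzero. */
static int mt_is_head(const modtarget *mt, const modinst *mi)
{
    return mt->mt_mods == mi;
}

/* Returns 1 if the second call could proceed, 0 if it would spin. */
int second_call_can_proceed(int run_cleanup)
{
    modtarget mt = { NULL, NULL };
    modinst first, second;

    mt_append(&mt, &first);             /* call made via over_op_func() */
    if (run_cleanup) {
        mt_cleanup_head(&mt);           /* what the cleanup would have done */
    }
    mt_append(&mt, &second);            /* call made via be_modify() */
    return mt_is_head(&mt, &second);
}
```

In this model, second_call_can_proceed(0) corresponds to the reported
hang and second_call_can_proceed(1) to the behaviour once the first
entry is cleaned up.
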
>My patch:
>  The 'mi' added the first time should have been cleaned up, but it was
>not, because the callback function -- syncprov_op_cleanup() -- was
>never called. The reason is in over_op_func() in
>servers/slapd/backover.c.
>  
>        cb.sc_next = op->o_callback;
>        cb.sc_private = oi;
>        op->o_callback = &cb;
>
>        for (; on; on=on->on_next ) {
>                func = &on->on_bi.bi_op_bind;
>                if ( func[which] ) {
>                        op->o_bd->bd_info = (BackendInfo *)on;
>                        rc = func[which]( op, rs );
>                        if ( rc != SLAP_CB_CONTINUE ) break;
>                }
>        }
>        
>  NOTICE: at this point, op->o_callback = syncprov_op_cleanup.
>
>        func = &oi->oi_orig->bi_op_bind;
>        if ( func[which] && rc == SLAP_CB_CONTINUE ) {
>                op->o_bd->bd_info = oi->oi_orig;
>                rc = func[which]( op, rs );
>        }
>        /* should not fall thru this far without anything happening... */
>        if ( rc == SLAP_CB_CONTINUE ) {
>                rc = op_rc[ which ];
>        }
>        op->o_bd = be;
>        op->o_callback = cb.sc_next;
>
>  NOTICE: syncprov_op_cleanup is gone here!
>  
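
The lost cleanup can be illustrated with a small stand-alone model.
This is a sketch under assumptions, not slapd code: the callback
struct is a hypothetical reduction of slap_callback to two fields,
and run_cleanups() is an invented helper that stops at the saved
chain position rather than walking the whole chain as the patch below
does. It only shows that restoring the saved pointer without a
cleanup pass silently skips sc_cleanup.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for slapd's slap_callback. */
typedef struct callback {
    struct callback *sc_next;
    int (*sc_cleanup)(void *ctx);
} callback;

static int cleanups_run = 0;

/* Stands in for syncprov_op_cleanup(): just counts invocations. */
static int count_cleanup(void *ctx)
{
    (void)ctx;
    cleanups_run++;
    return 0;
}

/* Run sc_cleanup for every callback pushed above 'stop'. */
static void run_cleanups(callback *head, callback *stop, void *ctx)
{
    callback *sc = head;
    while (sc != NULL && sc != stop) {
        callback *next = sc->sc_next;   /* read first: cleanup may free sc */
        if (sc->sc_cleanup != NULL) {
            (void)sc->sc_cleanup(ctx);
        }
        sc = next;
    }
}

/* Returns how many cleanups ran before the saved chain was restored. */
int cleanups_after_restore(int run_pass)
{
    callback saved = { NULL, NULL };              /* chain before the overlay ran */
    callback pushed = { &saved, count_cleanup };  /* pushed by the overlay */
    callback *o_callback = &pushed;

    cleanups_run = 0;
    if (run_pass) {
        run_cleanups(o_callback, &saved, NULL);
    }
    o_callback = &saved;    /* mirrors op->o_callback = cb.sc_next */
    (void)o_callback;
    return cleanups_run;
}
```

Here cleanups_after_restore(0) models the existing over_op_func()
behaviour, and cleanups_after_restore(1) models adding a cleanup pass
before the restore.
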
>  So I copied some code from servers/slapd/result.c to call the
>callback functions. Below is my patch. It works in my environment.
>
>----------------------------------------------------------------
>    
>diff -aur openldap-2.3.1alpha.orig/servers/slapd/backover.c openldap-2.3.1alpha/servers/slapd/backover.c
>--- openldap-2.3.1alpha.orig/servers/slapd/backover.c   2005-02-03 01:32:43.000000000 +0800
>+++ openldap-2.3.1alpha/servers/slapd/backover.c        2005-03-11 22:09:32.000000000 +0800
>@@ -296,6 +296,27 @@
>        if ( rc == SLAP_CB_CONTINUE ) {
>                rc = op_rc[ which ];
>        }
>+
>+       // ADD by Wayne Zhao
>+       if ( op->o_callback ) {
>+               int             first = 1;
>+               slap_callback   *sc = op->o_callback,
>+                               *sc_next = op->o_callback;
>+
>+               for ( sc_next = op->o_callback; sc_next; op->o_callback = sc_next) {
>+                       sc_next = op->o_callback->sc_next;
>+                       if ( op->o_callback->sc_cleanup ) {
>+                               (void)op->o_callback->sc_cleanup( op, rs );
>+                       }
>+                       if ( first && op->o_callback == NULL ) {
>+                               sc = NULL;
>+                       }
>+                       first = 0;
>+               }
>+               op->o_callback = sc;
>+       }
>+       // END of ADD by Wayne Zhao
>+       
>        op->o_bd = be;
>        op->o_callback = cb.sc_next;
>        return rc;
>
>----------------------------------------------------------------
>    
>slapd.conf on the provider side:
>
>include         /usr/local/etc/openldap/schema/core.schema
>loglevel        261
>pidfile         ./var/slapd.pid
>argsfile        ./var/slapd.args
>database        bdb
>suffix          "dc=example,dc=com"
>rootdn          "cn=admin,dc=example,dc=com"
>rootpw          secret
>directory       ./db
>index           objectClass     eq
>overlay         syncprov
>    
>----------------------------------------------------------------
>
>slapd.conf on the consumer side:
>
>include         /usr/local/etc/openldap/schema/core.schema
>pidfile         /var/slapd-agent.pid
>argsfile        /var/slapd-agent.args
>
>database        bdb
>suffix          "dc=example,dc=com"
>rootdn          "cn=replica,dc=example,dc=com"
>rootpw          secret
>directory       /var/ldap
>syncrepl rid=1
>                 provider=ldap://192.168.0.218:9014
>                 bindmethod=simple
>                 binddn="cn=admin,dc=example,dc=com"
>                 credentials=secret
>                 searchbase="dc=example,dc=com"
>                 filter="(objectClass=*)"
>                 attrs="*"
>                 schemachecking=off
>                 scope=sub
>                 type=refreshAndPersist
>                 retry=10,+
>updateref        "ldap://192.168.0.218:9014/"
>overlay         syncprov
>
>----------------------------------------------------------------
>
>ldappasswd command line:
># ldappasswd -x -w secret -H ldap://192.168.0.218:9014/ -D 'cn=admin,dc=example,dc=com' 'uid=test,dc=example,dc=com' -s 123456
>
>----------------------------------------------------------------