
Re: PATCH for openldap-2.3.1alpha: fix slapd hangs problem under syncrepl



Thanks for the information. I have submitted the patch to the ITS. The
issue tracking number is ITS#3596.

On Fri, 11 Mar 2005 09:59:43 -0600, Kurt D. Zeilenga <Kurt@openldap.org> wrote:
> While you are, of course, welcome to discuss patches on
> appropriate lists before submission, please note that
> only patches submitted through the Issue Tracking System
> will be considered for inclusion.  See
> http://www.openldap.org/its/ and http://www.openldap.org/devel/contributing.html for details.
> 
> Thanks, Kurt
> 
> At 09:36 AM 3/11/2005, Hai Zhao wrote:
> >  I am not familiar with the OpenLDAP source, and I hope I describe the
> >problem and my patch clearly. ;)
> >
> >Problem:
> >  I use openldap-2.3.1alpha on two machines, one as master provider
> >and the other as syncrepl consumer. The syncrepl type is
> >refreshAndPersist. The provider slapd runs at
> >ldap://192.168.0.218:9014/. The db backend is bdb.
> >  When the connection between the consumer and provider was
> >established, I tried to use ldappasswd to modify one user's password.
> >But ldappasswd hung with no response, and the slapd process on the
> >provider used 100% CPU.
> >
> >Cause:
> >  I read through the code and found that one thread of slapd looped at
> >servers/slapd/overlays/syncprov.c:line 1467-1471, in function
> >syncprov_op_mod(), which made the slapd service unavailable.
> >
> >                mt = avl_find( si->si_mods, &mtdummy, sp_avl_cmp );
> >                if ( mt ) {
> >                        ldap_pvt_thread_mutex_lock( &mt->mt_mutex );
> >                        ldap_pvt_thread_mutex_unlock( &si->si_mods_mutex );
> >                        mt->mt_tail->mi_next = mi;
> >                        mt->mt_tail = mi;
> >                        /* wait for this op to get to head of list */
> >line 1467:              while ( mt->mt_mods != mi ) {
> >                                ldap_pvt_thread_mutex_unlock( &mt->mt_mutex );
> >                                ldap_pvt_thread_yield();
> >                                ldap_pvt_thread_mutex_lock( &mt->mt_mutex );
> >line 1471:              }
> >                        ldap_pvt_thread_mutex_unlock( &mt->mt_mutex );
> >                } else {
> >
> >  The problem lies in over_op_func() in servers/slapd/backover.c,
> >called from passwd_extop() in servers/slapd/passwd.c:line 163. When
> >the operation is a password modify (ldappasswd), backend extensions
> >are given a chance to handle the operation themselves. But the bdb
> >backend doesn't handle password modification, so passwd_extop()
> >finally does it. In this procedure, syncprov_op_mod() is called
> >twice, once by over_op_func() and once by passwd_extop():line 221,
> >the following line:
> >                                rs->sr_err = op2.o_bd->be_modify( &op2, rs );
> >  So, in the two calls of syncprov_op_mod(), two 'mi' were appended to
> >si->si_mods, which caused the second call to loop at line 1467-1471 in
> >servers/slapd/overlays/syncprov.c.
> >
> >My patch:
> >  The 'mi' added the first time should be cleaned up, but the callback
> >function, syncprov_op_cleanup(), was never called. The reason is
> >in over_op_func() in servers/slapd/backover.c.
> >
> >        cb.sc_next = op->o_callback;
> >        cb.sc_private = oi;
> >        op->o_callback = &cb;
> >
> >        for (; on; on=on->on_next ) {
> >                func = &on->on_bi.bi_op_bind;
> >                if ( func[which] ) {
> >                        op->o_bd->bd_info = (BackendInfo *)on;
> >                        rc = func[which]( op, rs );
> >                        if ( rc != SLAP_CB_CONTINUE ) break;
> >                }
> >        }
> >
> >  NOTICE: at this point, op->o_callback = syncprov_op_cleanup.
> >
> >        func = &oi->oi_orig->bi_op_bind;
> >        if ( func[which] && rc == SLAP_CB_CONTINUE ) {
> >                op->o_bd->bd_info = oi->oi_orig;
> >                rc = func[which]( op, rs );
> >        }
> >        /* should not fall thru this far without anything happening... */
> >        if ( rc == SLAP_CB_CONTINUE ) {
> >                rc = op_rc[ which ];
> >        }
> >        op->o_bd = be;
> >        op->o_callback = cb.sc_next;
> >
> >  NOTICE: syncprov_op_cleanup was gone here!!!!!
> >
> >    So I copied some code from servers/slapd/result.c to call the
> >callback functions. Below is my patch. It works in my environment.
> >
> >----------------------------------------------------------------
> >
> >diff -aur openldap-2.3.1alpha.orig/servers/slapd/backover.c openldap-2.3.1alpha/servers/slapd/backover.c
> >--- openldap-2.3.1alpha.orig/servers/slapd/backover.c   2005-02-03 01:32:43.000000000 +0800
> >+++ openldap-2.3.1alpha/servers/slapd/backover.c        2005-03-11 22:09:32.000000000 +0800
> >@@ -296,6 +296,27 @@
> >        if ( rc == SLAP_CB_CONTINUE ) {
> >                rc = op_rc[ which ];
> >        }
> >+
> >+       // ADD by Wayne Zhao
> >+       if ( op->o_callback ) {
> >+               int             first = 1;
> >+               slap_callback   *sc = op->o_callback,
> >+                               *sc_next = op->o_callback;
> >+
> >+               for ( sc_next = op->o_callback; sc_next; op->o_callback = sc_next) {
> >+                       sc_next = op->o_callback->sc_next;
> >+                       if ( op->o_callback->sc_cleanup ) {
> >+                               (void)op->o_callback->sc_cleanup( op, rs );
> >+                       }
> >+                       if ( first && op->o_callback == NULL ) {
> >+                               sc = NULL;
> >+                       }
> >+                       first = 0;
> >+               }
> >+               op->o_callback = sc;
> >+       }
> >+       // END of ADD by Wayne Zhao
> >+
> >        op->o_bd = be;
> >        op->o_callback = cb.sc_next;
> >        return rc;
> >
> >----------------------------------------------------------------
> >
> >slapd.conf on the provider side:
> >
> >include         /usr/local/etc/openldap/schema/core.schema
> >loglevel        261
> >pidfile         ./var/slapd.pid
> >argsfile        ./var/slapd.args
> >database        bdb
> >suffix          "dc=example,dc=com"
> >rootdn          "cn=admin,dc=example,dc=com"
> >rootpw          secret
> >directory       ./db
> >index           objectClass     eq
> >overlay         syncprov
> >
> >----------------------------------------------------------------
> >
> >slapd.conf on the consumer side:
> >
> >include         /usr/local/etc/openldap/schema/core.schema
> >pidfile         /var/slapd-agent.pid
> >argsfile        /var/slapd-agent.args
> >
> >database        bdb
> >suffix          "dc=example,dc=com"
> >rootdn          "cn=replica,dc=example,dc=com"
> >rootpw          secret
> >directory       /var/ldap
> >syncrepl rid=1
> >                 provider=ldap://192.168.0.218:9014
> >                 bindmethod=simple
> >                 binddn="cn=admin,dc=example,dc=com"
> >                 credentials=secret
> >                 searchbase="dc=example,dc=com"
> >                 filter="(objectClass=*)"
> >                 attrs="*"
> >                 schemachecking=off
> >                 scope=sub
> >                 type=refreshAndPersist
> >                 retry=10,+
> >updateref        "ldap://192.168.0.218:9014/";
> >overlay         syncprov
> >
> >----------------------------------------------------------------
> >
> >ldappasswd command line:
> ># ldappasswd -x -w secret -H ldap://192.168.0.218:9014/ -D 'cn=admin,dc=example,dc=com' 'uid=test,dc=example,dc=com' -s 123456
> >
> >----------------------------------------------------------------
> 
>