Full_Name: Aaron Richton
Version: OPENLDAP_REL_ENG_2_4_40-178-gba9a739
OS: Linux
URL: https://www.nbcs.rutgers.edu/~richton/testrun-test023-fail.tar.bz2
Submission from: (NULL) (128.6.31.135)

Seeing occasional failures on test023 using mdb. Takes a few runs (~30 for me, recently) but seems to be more than a one-off...

--On Monday, March 02, 2015 10:17 PM +0000 richton@nbcs.rutgers.edu wrote:

> Full_Name: Aaron Richton
> Version: OPENLDAP_REL_ENG_2_4_40-178-gba9a739
> OS: Linux
> URL: https://www.nbcs.rutgers.edu/~richton/testrun-test023-fail.tar.bz2
> Submission from: (NULL) (128.6.31.135)
>
> Seeing occasional failures on test023 using mdb. Takes a few runs (~30
> for me, recently) but seems to be more than a one-off...

Doesn't happen for me with 4129e246a85a6d1f0033466ed2c7fb51c6dee508 (Feb 6th), so it is something recently introduced.

--Quanah

--
Quanah Gibson-Mount
Platform Architect
Zimbra, Inc.
--------------------
Zimbra :: the leader in open source messaging and collaboration

--On Monday, March 02, 2015 10:17 PM +0000 richton@nbcs.rutgers.edu wrote:

> Seeing occasional failures on test023 using mdb. Takes a few runs (~30
> for me, recently) but seems to be more than a one-off...

Confirmed with Howard this is a timing dependent script, and it's affected by recent changes to the code base. refint used to do all its work inline, but now it uses a separate thread, so on a slow enough machine, it may not be done by the time the script runs its ldapsearch.

I.e., doesn't appear to be a code bug. The test needs tweaking at some point to add some sleeps to handle this. :P

--Quanah

On Tue, 3 Mar 2015, Quanah Gibson-Mount wrote:

>> Seeing occasional failures on test023 using mdb. Takes a few runs (~30
>> for me, recently) but seems to be more than a one-off...
>
> Confirmed with Howard this is a timing dependent script, and it's
> affected by recent changes to the code base. refint used to do all its
> work inline, but now it uses a separate thread, so on a slow enough
> machine, it may not be done by the time the script runs its ldapsearch.
>
> I.e., doesn't appear to be a code bug. The test needs tweaking at some
> point to add some sleeps to handle this. :P

Is OPENLDAP_REL_ENG_2_4_40-156-g4129e24 early enough (slight inference from your earlier email)? That one failed after 80 runs.

quanah@zimbra.com wrote:

> Confirmed with Howard this is a timing dependent script, and it's affected
> by recent changes to the code base. refint used to do all its work
> inline, but now it uses a separate thread, so on a slow enough machine, it
> may not be done by the time the script runs its ldapsearch.

Hmm, I'm not convinced that it really makes sense to have slapo-refint doing its job asynchronously. I'm very much concerned about data consistency with sync jobs relying on e.g. group membership being immediately updated etc.

I'd rather prefer to have slow write operations than having inconsistent references. Can this be made configurable? Please...!

Also how about impact on transaction handling in upcoming 2.5.x?

Ciao, Michael.

richton@nbcs.rutgers.edu wrote:

> On Tue, 3 Mar 2015, Quanah Gibson-Mount wrote:
>
>>> Seeing occasional failures on test023 using mdb. Takes a few runs (~30
>>> for me, recently) but seems to be more than a one-off...
>>
>> Confirmed with Howard this is a timing dependent script, and it's
>> affected by recent changes to the code base. refint used to do all its
>> work inline, but now it uses a separate thread, so on a slow enough
>> machine, it may not be done by the time the script runs its ldapsearch.
>>
>> I.e., doesn't appear to be a code bug. The test needs tweaking at some
>> point to add some sleeps to handle this. :P
>
> Is OPENLDAP_REL_ENG_2_4_40-156-g4129e24 early enough (slight inference
> from your earlier email)? That one failed after 80 runs.

The refint behavior (using a separate thread) is not a recent change - that was made in 2006. But the test script is even older.

--
-- Howard Chu
CTO, Symas Corp. http://www.symas.com
Director, Highland Sun http://highlandsun.com/hyc/
Chief Architect, OpenLDAP http://www.openldap.org/project/

michael@stroeder.com wrote:

> quanah@zimbra.com wrote:
>> Confirmed with Howard this is a timing dependent script, and it's affected
>> by recent changes to the code base. refint used to do all its work
>> inline, but now it uses a separate thread, so on a slow enough machine, it
>> may not be done by the time the script runs its ldapsearch.
>
> Hmm, I'm not convinced that it really makes sense to have slapo-refint doing
> its job asynchronously. I'm very much concerned about data consistency with
> sync jobs relying on e.g. group membership being immediately updated etc.
>
> I'd rather prefer to have slow write operations than having inconsistent
> references. Can this be made configurable? Please...!
>
> Also how about impact on transaction handling in upcoming 2.5.x?

Good questions for 2.5. accesslog reentrancy used to be a problem, back in 2006 but other changes have been made since then to remove that obstacle. Using LDAP txns internally would also fix the backendDB reentrancy issues.

What sync jobs are you referring to? LDAP replication only promises eventual consistency; relying on instantaneous/continuous consistency here means you've got a design flaw.

> Ciao, Michael.

-- Howard Chu

--On Tuesday, March 03, 2015 9:40 PM -0500 Aaron Richton <richton@nbcs.rutgers.edu> wrote:

> Is OPENLDAP_REL_ENG_2_4_40-156-g4129e24 early enough (slight inference
> from your earlier email)? That one failed after 80 runs.

I think the answer is what Howard noted -- This test can fail on any release from 2006 and forward. :P

--Quanah

test script issue
changed notes moved from Incoming to Build
Maybe something we can fix with reworking the test suite into python or similar.
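
For what it's worth, if the suite did move to Python, the race described above (refint finishing its cleanup in a separate thread after the write returns) could probably be handled by polling with a timeout rather than a fixed sleep. A rough sketch only: `wait_until`, `delete_entry_async` and the `group_members` set are made-up names for illustration, and the "server" here is just a timer thread standing in for slapd, not a real LDAP connection.

```python
import threading
import time


def wait_until(check, timeout=10.0, interval=0.1):
    """Poll check() until it returns a truthy value or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        if check():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Hypothetical stand-in for the directory: a group that still references
# an entry for a short while after that entry has been deleted.
group_members = {"cn=alice", "cn=bob"}


def delete_entry_async(dn, delay=0.3):
    """Simulate a delete whose reference cleanup happens later, in another thread."""
    worker = threading.Timer(delay, group_members.discard, args=(dn,))
    worker.start()
    return worker


worker = delete_entry_async("cn=alice")

# Instead of searching immediately (which may see the stale reference),
# poll until the reference is gone or give up after 5 seconds.
cleaned_up = wait_until(lambda: "cn=alice" not in group_members, timeout=5.0)
worker.join()
print("reference removed:", cleaned_up)
```

In a real test the lambda would be replaced by whatever search the script runs today; the point is only that a slow machine gets more time while a fast one does not wait longer than needed.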