Issue 8070 - test023 failures
Summary: test023 failures
Status: UNCONFIRMED
Alias: None
Product: OpenLDAP
Classification: Unclassified
Component: build
Version: unspecified
Hardware: All All
Importance: --- normal
Target Milestone: 2.7.0
Assignee: OpenLDAP project
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2015-03-02 22:17 UTC by richton@nbcs.rutgers.edu
Modified: 2021-06-23 15:53 UTC
Description richton@nbcs.rutgers.edu 2015-03-02 22:17:12 UTC
Full_Name: Aaron Richton
Version: OPENLDAP_REL_ENG_2_4_40-178-gba9a739
OS: Linux
URL: https://www.nbcs.rutgers.edu/~richton/testrun-test023-fail.tar.bz2
Submission from: (NULL) (128.6.31.135)


Seeing occasional failures on test023 using mdb. Takes a few runs (~30 for me,
recently) but seems to be more than a one-off...
Comment 1 Quanah Gibson-Mount 2015-03-03 16:37:31 UTC
--On Monday, March 02, 2015 10:17 PM +0000 richton@nbcs.rutgers.edu wrote:

> Full_Name: Aaron Richton
> Version: OPENLDAP_REL_ENG_2_4_40-178-gba9a739
> OS: Linux
> URL: https://www.nbcs.rutgers.edu/~richton/testrun-test023-fail.tar.bz2
> Submission from: (NULL) (128.6.31.135)
>
>
> Seeing occasional failures on test023 using mdb. Takes a few runs (~30
> for me, recently) but seems to be more than a one-off...

Doesn't happen for me with 4129e246a85a6d1f0033466ed2c7fb51c6dee508 (Feb 
6th), so it is something recently introduced.

--Quanah

--

Quanah Gibson-Mount
Platform Architect
Zimbra, Inc.
--------------------
Zimbra ::  the leader in open source messaging and collaboration

Comment 2 Quanah Gibson-Mount 2015-03-03 23:14:27 UTC
--On Monday, March 02, 2015 10:17 PM +0000 richton@nbcs.rutgers.edu wrote:

> Full_Name: Aaron Richton
> Version: OPENLDAP_REL_ENG_2_4_40-178-gba9a739
> OS: Linux
> URL: https://www.nbcs.rutgers.edu/~richton/testrun-test023-fail.tar.bz2
> Submission from: (NULL) (128.6.31.135)
>
>
> Seeing occasional failures on test023 using mdb. Takes a few runs (~30
> for me, recently) but seems to be more than a one-off...
>

Confirmed with Howard that this is a timing-dependent script, and it's
affected by recent changes to the code base. refint used to do all its work
inline, but now it uses a separate thread, so on a slow enough machine it
may not be done by the time the script runs its ldapsearch.

I.e., this doesn't appear to be a code bug. The test needs tweaking at some
point to add some sleeps to handle this. :P

--Quanah
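
The race described above can be illustrated with a small Python sketch. Nothing here is OpenLDAP code: `BackgroundUpdater` is a hypothetical stand-in for work that finishes on a separate thread, and `wait_until` is an assumed helper that polls a condition rather than relying on one fixed sleep. The delay and timeout values are arbitrary.

```python
import threading
import time


class BackgroundUpdater:
    """Hypothetical stand-in for work that completes on a separate thread."""

    def __init__(self, delay):
        self.delay = delay      # seconds the background work takes
        self.done = False
        self._thread = None

    def start(self):
        # Returns immediately; the update completes later on another thread.
        self._thread = threading.Thread(target=self._work)
        self._thread.start()

    def _work(self):
        time.sleep(self.delay)
        self.done = True

    def join(self):
        self._thread.join()


def wait_until(predicate, timeout=5.0, interval=0.05):
    """Poll predicate() until it is true or `timeout` seconds have passed."""
    deadline = time.monotonic() + timeout
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


updater = BackgroundUpdater(delay=0.2)
updater.start()
checked_too_early = updater.done            # an immediate check races with the thread
settled = wait_until(lambda: updater.done)  # polling tolerates a slow machine
updater.join()
```

Polling with a deadline is one possible alternative to a fixed sleep: it returns as soon as the condition holds on a fast machine and still gives a slow machine time to catch up.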


Comment 3 richton@nbcs.rutgers.edu 2015-03-04 02:40:07 UTC
On Tue, 3 Mar 2015, Quanah Gibson-Mount wrote:

>> Seeing occasional failures on test023 using mdb. Takes a few runs (~30 
>> for me, recently) but seems to be more than a one-off...
>> 
>
> Confirmed with Howard this is a timing dependent script, and it's 
> affected by recent changes to the code base.  refint used to do all its 
> work inline, but now it uses a separate thread, so on a slow enough 
> machine, it may not be done by the time the script runs its ldapsearch.
>
> I.e., doesn't appear to be a code bug.  The test needs tweaking at some 
> point to add some sleeps to handle this. :P

Is OPENLDAP_REL_ENG_2_4_40-156-g4129e24 early enough (slight inference 
from your earlier email)? That one failed after 80 runs.


Comment 4 Michael Ströder 2015-03-04 09:24:03 UTC
quanah@zimbra.com wrote:
> Confirmed with Howard this is a timing dependent script, and it's affected 
> by recent changes to the code base.   refint used to do all its work 
> inline, but now it uses a separate thread, so on a slow enough machine, it 
> may not be done by the time the script runs its ldapsearch.

Hmm, I'm not convinced that it really makes sense to have slapo-refint doing
its job asynchronously.  I'm very much concerned about data consistency with
sync jobs relying on e.g. group membership being immediately updated etc.

I'd rather have slow write operations than inconsistent
references.  Can this be made configurable?  Please...!

Also, what about the impact on transaction handling in the upcoming 2.5.x?

Ciao, Michael.

Comment 5 Howard Chu 2015-03-04 11:19:10 UTC
richton@nbcs.rutgers.edu wrote:
> On Tue, 3 Mar 2015, Quanah Gibson-Mount wrote:
>
>>> Seeing occasional failures on test023 using mdb. Takes a few runs (~30
>>> for me, recently) but seems to be more than a one-off...
>>>
>>
>> Confirmed with Howard this is a timing dependent script, and it's
>> affected by recent changes to the code base.  refint used to do all its
>> work inline, but now it uses a separate thread, so on a slow enough
>> machine, it may not be done by the time the script runs its ldapsearch.
>>
>> I.e., doesn't appear to be a code bug.  The test needs tweaking at some
>> point to add some sleeps to handle this. :P
>
> Is OPENLDAP_REL_ENG_2_4_40-156-g4129e24 early enough (slight inference
> from your earlier email)? That one failed after 80 runs.

The refint behavior (using a separate thread) is not a recent change; it was introduced in 2006. But the test script is even older.

-- 
   -- Howard Chu
   CTO, Symas Corp.           http://www.symas.com
   Director, Highland Sun     http://highlandsun.com/hyc/
   Chief Architect, OpenLDAP  http://www.openldap.org/project/

Comment 6 Howard Chu 2015-03-04 13:49:33 UTC
michael@stroeder.com wrote:
> quanah@zimbra.com wrote:
>> Confirmed with Howard this is a timing dependent script, and it's affected
>> by recent changes to the code base.   refint used to do all its work
>> inline, but now it uses a separate thread, so on a slow enough machine, it
>> may not be done by the time the script runs its ldapsearch.
>
> Hmm, I'm not convinced that it really makes sense to have slapo-refint doing
> its job asynchronously.  I'm very much concerned about data consistency with
> sync jobs relying on e.g. group membership being immediately updated etc.
>
> I'd rather prefer to have slow write operations than having inconsistent
> references.  Can this be made configurable?  Please...!
>
> Also how about impact on transaction handling in upcoming 2.5.x?

Good questions for 2.5.

accesslog reentrancy used to be a problem back in 2006, but other changes have been made since then to remove that obstacle. Using LDAP txns internally would also fix the backendDB reentrancy issues.

What sync jobs are you referring to? LDAP replication only promises eventual consistency; relying on instantaneous/continuous consistency here means you've got a design flaw.

> Ciao, Michael.



Comment 7 Quanah Gibson-Mount 2015-03-09 18:02:19 UTC
--On Tuesday, March 03, 2015 9:40 PM -0500 Aaron Richton 
<richton@nbcs.rutgers.edu> wrote:

> Is OPENLDAP_REL_ENG_2_4_40-156-g4129e24 early enough (slight inference
> from your earlier email)? That one failed after 80 runs.

I think the answer is what Howard noted: this test can fail on any
release from 2006 onward. :P

--Quanah




Comment 8 OpenLDAP project 2015-03-09 18:02:54 UTC
test script issue
Comment 9 Quanah Gibson-Mount 2015-03-09 18:02:54 UTC
changed notes
moved from Incoming to Build
Comment 10 Quanah Gibson-Mount 2020-03-20 22:40:30 UTC
Maybe something we can fix by reworking the test suite in Python or similar.
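
As a rough idea of what such a Python test step might look like, the sketch below repeats a search until its output matches the expected result. It is not taken from the OpenLDAP test suite: `run_ldapsearch` and `search_until_match` are hypothetical helpers, and the URI, base DN, and filter in the usage comment are placeholders. The only assumption about the real tooling is that the `ldapsearch` client is on `PATH`.

```python
import subprocess
import time


def run_ldapsearch(uri, base, ldap_filter, attrs):
    """Run the ldapsearch CLI and return its stdout (assumes it is on PATH)."""
    cmd = ["ldapsearch", "-x", "-LLL", "-H", uri, "-b", base, ldap_filter, *attrs]
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout


def search_until_match(search, expected, attempts=10, delay=0.5):
    """Call search() until its output equals `expected`, at most `attempts` times.

    Returns (matched, last_output) so a failing test can report what it saw.
    """
    last = None
    for attempt in range(attempts):
        last = search()
        if last == expected:
            return True, last
        if attempt < attempts - 1:
            time.sleep(delay)
    return False, last


# Hypothetical usage inside a test (all values are placeholders):
#   ok, output = search_until_match(
#       lambda: run_ldapsearch("ldap://localhost:9011", "o=refint",
#                              "(objectClass=groupOfNames)", ["member"]),
#       expected=expected_ldif,
#   )
```

Passing the search as a callable keeps the retry logic independent of the LDAP client, so it can be exercised with a fake search function without a running slapd.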