8623 – test022-ppolicy can fail due to timing dependency between lockout and expiration tests

Issue 8623 - test022-ppolicy can fail due to timing dependency between lockout and expiration tests

Summary: test022-ppolicy can fail due to timing dependency between lockout and expirat...

Status:	UNCONFIRMED

Alias:	None

Product:	OpenLDAP
Classification:	Unclassified
Component:	slapd (show other issues)
Version:	2.4.44
Hardware:	All All

Importance:	--- normal
Target Milestone:	---
Assignee:	OpenLDAP project

URL:
Keywords:

Depends on:
Blocks:

Reported:	2017-03-22 18:44 UTC by subbarao@computer.org
Modified:	2021-10-11 20:40 UTC (History)
CC List:	0 users

See Also:	9718

Attachments
Add an attachment (proposed patch, testcase, etc.)

Note You need to log in before you can comment on or make changes to this issue.

Description subbarao@computer.org 2017-03-22 18:44:00 UTC

Full_Name: Kartik Subbarao
Version: 2.4.44
OS: Linux
URL: ftp://ftp.openldap.org/incoming/
Submission from: (NULL) (108.52.86.252)


A recent 2.4.44 package build failed on a VM due to the following timing
dependency between the lockout and expiration tests in test022-ppolicy.

The problem is that the password expiration time is 30 seconds, and on slower
VMs, lines 67-91 take more than 30 seconds to execute. This causes the password
to expire by the time the ldapsearch command on line 93 is executed. That
ldapsearch command eats up one of the grace logins that's not supposed to be
consumed until line 106. This causes the grep count on line 124 to be off,
failing the entire script.

A simple improvement would be to change line 91 to "sleep 10" instead of the
current "sleep 20". This would buy 10 more seconds of leeway on slower VMs,
while guaranteeing sufficient delay between lines 67-90 (e.g. 16 seconds, 1
second more than the 15 second lockout time) on fast VMs. A more robust
improvement would be to check how much time has actually elapsed between lines
67-90 and change line 91 to only sleep until 16 total seconds have elapsed since
line 67.

Another suggestion would be to change the "sleep 20" on line 104 to "sleep 15".
That will shave 5 seconds off the build time for everyone, while still
guaranteeing that 31 seconds (e.g. 1 second more than the 30 second expiration
time) will have elapsed between lines 67-106).

A further optimization could be implemented, if desired, by revisiting the test
and reducing all lockout times, expiration times, and other delays to the bare
minimum necessary, which would speed up package build times across the board.

Comment 1 Howard Chu 2017-03-22 19:11:01 UTC

subbarao@computer.org wrote:
> Full_Name: Kartik Subbarao
> Version: 2.4.44
> OS: Linux
> URL: ftp://ftp.openldap.org/incoming/
> Submission from: (NULL) (108.52.86.252)
>
>
> A recent 2.4.44 package build failed on a VM due to the following timing
> dependency between the lockout and expiration tests in test022-ppolicy.

Clocks on VMs are notoriously unstable, which is why we have always used such 
large margins in these tests. Shaving things down to "1 second more than 
needed" will probably break in multiple build environments.
>
> The problem is that the password expiration time is 30 seconds, and on slower
> VMs, lines 67-91 take more than 30 seconds to execute. This causes the password
> to expire by the time the ldapsearch command on line 93 is executed. That
> ldapsearch command eats up one of the grace logins that's not supposed to be
> consumed until line 106. This causes the grep count on line 124 to be off,
> failing the entire script.
>
> A simple improvement would be to change line 91 to "sleep 10" instead of the
> current "sleep 20". This would buy 10 more seconds of leeway on slower VMs,
> while guaranteeing sufficient delay between lines 67-90 (e.g. 16 seconds, 1
> second more than the 15 second lockout time) on fast VMs. A more robust
> improvement would be to check how much time has actually elapsed between lines
> 67-90 and change line 91 to only sleep until 16 total seconds have elapsed since
> line 67.
>
> Another suggestion would be to change the "sleep 20" on line 104 to "sleep 15".
> That will shave 5 seconds off the build time for everyone, while still
> guaranteeing that 31 seconds (e.g. 1 second more than the 30 second expiration
> time) will have elapsed between lines 67-106).
>
> A further optimization could be implemented, if desired, by revisiting the test
> and reducing all lockout times, expiration times, and other delays to the bare
> minimum necessary, which would speed up package build times across the board.
>
>


-- 
   -- Howard Chu
   CTO, Symas Corp.           http://www.symas.com
   Director, Highland Sun     http://highlandsun.com/hyc/
   Chief Architect, OpenLDAP  http://www.openldap.org/project/

Comment 2 Quanah Gibson-Mount 2017-03-22 19:19:17 UTC

--On Wednesday, March 22, 2017 8:11 PM +0000 hyc@symas.com wrote:

> subbarao@computer.org wrote:
>> Full_Name: Kartik Subbarao
>> Version: 2.4.44
>> OS: Linux
>> URL: ftp://ftp.openldap.org/incoming/
>> Submission from: (NULL) (108.52.86.252)
>>
>>
>> A recent 2.4.44 package build failed on a VM due to the following timing
>> dependency between the lockout and expiration tests in test022-ppolicy.
>
> Clocks on VMs are notoriously unstable, which is why we have always used
> such  large margins in these tests. Shaving things down to "1 second more
> than  needed" will probably break in multiple build environments.

Even better is to use the SLEEP variables instead of hard coded values. 
Then they can be overridden in the test environment if necessary for 
specific VMs.

At the moment we have:

quanah@ub16:~/git/openldap/openldap-head/tests/scripts$ grep SLEEP 
defines.sh
SLEEP0=${SLEEP0-1}
SLEEP1=${SLEEP1-7}
SLEEP2=${SLEEP2-15}


It may be useful to add things like SLEEP10, SLEEP20, etc, and rebase tests 
off of that.  1/7/15 seem rather odd increments (no pun intended!).

--Quanah



--

Quanah Gibson-Mount
Product Architect
Symas Corporation
Packaged, certified, and supported LDAP solutions powered by OpenLDAP:
<http://www.symas.com>

Comment 3 subbarao@computer.org 2017-03-22 20:00:56 UTC

On 03/22/2017 03:11 PM, Howard Chu wrote:
>> A recent 2.4.44 package build failed on a VM due to the following timing
>> dependency between the lockout and expiration tests in test022-ppolicy.
>
> Clocks on VMs are notoriously unstable, which is why we have always 
> used such large margins in these tests. Shaving things down to "1 
> second more than needed" will probably break in multiple build 
> environments.

Just to make sure we're understanding each other clearly -- are you 
saying that you've seen VMs where "sleep 10" won't sleep for at *least* 
10 actual seconds? I haven't come across this, but if you have, then the 
adaptive solution where line 91 checks to see how many seconds have 
elapsed since line 67, and then calculates how much more it needs to 
sleep (perhaps with some more buffer) could be effective. If that 
doesn't work, then another thought might be to widen the difference 
between pwdLockoutDuration and pwdMaxAge to enable greater separation 
between the two tests, and adjust the script accordingly.

Regards,

     -Kartik

Comment 4 subbarao@computer.org 2017-03-22 20:31:08 UTC

On 03/22/2017 03:19 PM, Quanah Gibson-Mount wrote:
> Even better is to use the SLEEP variables instead of hard coded 
> values. Then they can be overridden in the test environment if 
> necessary for specific VMs.
>
> At the moment we have:
>
> quanah@ub16:~/git/openldap/openldap-head/tests/scripts$ grep SLEEP 
> defines.sh
> SLEEP0=${SLEEP0-1}
> SLEEP1=${SLEEP1-7}
> SLEEP2=${SLEEP2-15}
>
> It may be useful to add things like SLEEP10, SLEEP20, etc, and rebase 
> tests off of that.  1/7/15 seem rather odd increments (no pun intended!). 

This could well be a very good idea for broader reasons, but it probably 
would be less desirable as a sole solution for this particular 
situation, at least for my use case. In this case, I'm backporting 
2.4.44 for zesty to 14.04 LTS. Ideally, I want this to be a clean 
rebuild, without any code changes. I'm especially reluctant to maintain 
a customized package for something that doesn't affect the behavior of 
the actual software. So having more variable(s) that need to be tuned, 
and thus changed from the vanilla Ubuntu package, isn't too appealing :-)

Hopefully there's an agreeable solution to this issue that will make it 
work for everyone out of the box. In any case, I appreciate your and 
Howard's responses to this!

Regards,

     -Kartik

Comment 5 subbarao@computer.org 2017-03-22 20:54:13 UTC

On 03/22/2017 04:00 PM, Kartik Subbarao wrote:
> On 03/22/2017 03:11 PM, Howard Chu wrote:
>>> A recent 2.4.44 package build failed on a VM due to the following 
>>> timing
>>> dependency between the lockout and expiration tests in test022-ppolicy.
>>
>> Clocks on VMs are notoriously unstable, which is why we have always 
>> used such large margins in these tests. Shaving things down to "1 
>> second more than needed" will probably break in multiple build 
>> environments.
>
> Just to make sure we're understanding each other clearly -- are you 
> saying that you've seen VMs where "sleep 10" won't sleep for at 
> *least* 10 actual seconds? I haven't come across this, [...]

Hmm, the more I think about this, the harder it is for me to envision 
the kind of failure scenario you seem to be implying. Let me spell out 
my thought process and you can point out where I'm going wrong:

Assumption: An "unstable" clock on a VM is one that doesn't advance at 
one second per second of real-world time, but at least monotonically 
increases over time (ignoring DST exceptions and rare things like that). 
If it's *not* monotonically increasing, then I don't see how all kinds 
of things wouldn't break on the system -- like cron running duplicate 
jobs, etc. OpenLDAP tests would be the least of someone's worries in 
that situation :-)

Given this assumption, then "sleep 10" on a VM should always sleep for 
at *least* 10 seconds as marked by the VM's system clock. It doesn't 
matter how that compares to 10 seconds in the real world, since 
everything running on the VM is calculated relative to the VM clock. In 
particular, both the slapd process and the test shell script are relying 
on the same VM clock. So I don't see how sleeping for "1 second more 
than needed" -- which in this case would be 16 seconds and 31 seconds 
for each section -- could cause anything to break in this particular script.

What am I missing?

Thanks,

     -Kartik

Comment 6 Michael Ströder 2017-03-22 21:03:11 UTC

Disclaimer: I did not dive into the test script. Maybe my suggestion is stupid. ;-)

How about just setting the pwdChangedTime attribute (with the help of relax rules
control) to a value in the (near?) past right before sending the bind request to be tested?

Ciao, Michael.

Comment 7 Howard Chu 2017-03-23 09:38:19 UTC

subbarao@computer.org wrote:
> On 03/22/2017 03:19 PM, Quanah Gibson-Mount wrote:
>> Even better is to use the SLEEP variables instead of hard coded
>> values. Then they can be overridden in the test environment if
>> necessary for specific VMs.
>>
>> At the moment we have:
>>
>> quanah@ub16:~/git/openldap/openldap-head/tests/scripts$ grep SLEEP
>> defines.sh
>> SLEEP0=${SLEEP0-1}
>> SLEEP1=${SLEEP1-7}
>> SLEEP2=${SLEEP2-15}
>>
>> It may be useful to add things like SLEEP10, SLEEP20, etc, and rebase
>> tests off of that.  1/7/15 seem rather odd increments (no pun intended!).
>
> This could well be a very good idea for broader reasons, but it probably
> would be less desirable as a sole solution for this particular
> situation, at least for my use case. In this case, I'm backporting
> 2.4.44 for zesty to 14.04 LTS. Ideally, I want this to be a clean
> rebuild, without any code changes.

Overriding shell variables doesn't require any code changes. It only requires 
that you set their values in whatever build script you use that issues the 
"make test" command. That is the whole reason why we defined these variables.

-- 
   -- Howard Chu
   CTO, Symas Corp.           http://www.symas.com
   Director, Highland Sun     http://highlandsun.com/hyc/
   Chief Architect, OpenLDAP  http://www.openldap.org/project/

Comment 8 subbarao@computer.org 2017-03-23 14:08:54 UTC

On 03/23/2017 05:38 AM, Howard Chu wrote:
> subbarao@computer.org wrote:
>> This could well be a very good idea for broader reasons, but it probably
>> would be less desirable as a sole solution for this particular
>> situation, at least for my use case. In this case, I'm backporting
>> 2.4.44 for zesty to 14.04 LTS. Ideally, I want this to be a clean
>> rebuild, without any code changes.
> Overriding shell variables doesn't require any code changes. It only 
> requires that you set their values in whatever build script you use 
> that issues the "make test" command. That is the whole reason why we 
> defined these variables.

Hmm, I don't think you're hearing what I'm saying in its intended 
context. But let me table this for now. Can we back up to the other part 
of the thread? If you can point out what I'm missing with the VM clock 
reliability, that would help me better appreciate where you're coming from.

For example, if we were talking about tests between multiple VMs, I 
could immediately appreciate the value of being generous with timing 
margins. But as I indicated in my other message, I'm having difficulty 
seeing how those kinds of problems could arise here.

I apologize if there's something obvious I'm missing or if I'm coming 
across as belaboring an undesirable point. I'm just trying to make sure 
we don't inadvertently rule out a simple solution to the problem that 
could work for everyone.

Thanks,

     -Kartik