Issue 8721 - Quarantine in meta backend
Summary: Quarantine in meta backend
Status: VERIFIED FIXED
Alias: None
Product: OpenLDAP
Classification: Unclassified
Component: backends
Version: 2.4.45
Hardware: All All
Importance: --- normal
Target Milestone: 2.5.5
Assignee: Ondřej Kuzník
URL:
Keywords:
Duplicates: 8268
Depends on:
Blocks:
 
Reported: 2017-09-04 08:58 UTC by michele.lugli@unife.it
Modified: 2021-06-23 15:36 UTC

Description michele.lugli@unife.it 2017-09-04 08:58:17 UTC
Full_Name: Michele Lugli
Version: 2.4.45
OS: Linux Ubuntu server 16.04
URL: ftp://ftp.openldap.org/incoming/
Submission from: (NULL) (192.167.209.10)


The quarantine function in back-meta permanently disables a target if a query
is received while the quarantine is in effect.

The same problem was reported in ITS#5592 and marked as "solved", but it still
exists under some circumstances.

To reproduce: use the config below, where the first target works correctly and
the second is unavailable. Send a query and the quarantine will be set. If
further queries with search base "dc=example,dc=com" keep arriving within 20
seconds, the quarantine never lifts.

Tested on latest release available (commit
bb62d9cb732c894023c5c9f5893acf40add7376c - Aug 31 16:53:45 2017).


---- slapd.conf ----

database   meta
suffix     dc=example,dc=com
quarantine 20,+
uri        ldap://itworks.example.com/dc=subtree1,dc=example,dc=com
uri        ldap://afakeaddress/dc=subtree2,dc=example,dc=com
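
For readers unfamiliar with the `quarantine` value used above, here is a minimal sketch of how it can be read, assuming the syntax given in slapd-meta(5): `<interval>,<num>` patterns separated by `;`, where a `num` of `+` means "retry forever". The function name `parse_quarantine` and the use of `None` to represent `+` are illustrative choices and do not come from slapd.

```python
from typing import List, Optional, Tuple


def parse_quarantine(spec: str) -> List[Tuple[int, Optional[int]]]:
    """Split '<interval>,<num>[;<interval>,<num>...]' into (interval, num) pairs.

    num is None when the pattern uses '+', i.e. retry forever.
    """
    patterns = []
    for block in spec.split(";"):
        interval_text, num_text = block.split(",")
        interval = int(interval_text)
        num = None if num_text.strip() == "+" else int(num_text)
        patterns.append((interval, num))
    return patterns


print(parse_quarantine("20,+"))
```

Under that reading, `20,+` means: retry the target no sooner than 20 seconds after the last attempt, and keep retrying indefinitely.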


---- slapd logs ----

59ad1507 conn=1000 op=1 meta_back_quarantine[1]: enter.
59ad1519 conn=1001 op=1: meta_back_getconn[1] quarantined
59ad1527 conn=1002 op=1: meta_back_getconn[1] quarantined
59ad1539 conn=1003 op=1: meta_back_getconn[1] quarantined
59ad154b conn=1004 op=1: meta_back_getconn[1] quarantined
....
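
The timing in the log excerpt above can be checked with a short calculation. This is a sketch that assumes the leading hexadecimal field on each log line is a Unix timestamp in seconds; the variable names are illustrative.

```python
from datetime import datetime, timezone

# Hex timestamps copied from the slapd log lines above.
stamps_hex = ["59ad1507", "59ad1519", "59ad1527", "59ad1539", "59ad154b"]
stamps = [int(s, 16) for s in stamps_hex]

# Seconds elapsed between consecutive log lines.
gaps = [later - earlier for earlier, later in zip(stamps, stamps[1:])]

for s in stamps:
    print(datetime.fromtimestamp(s, tz=timezone.utc).isoformat())
print(gaps)
```

Under that assumption the gaps are 18, 14, 18 and 18 seconds, so every query arrived less than 20 seconds after the previous log line, which matches the reproduction steps.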
Comment 1 OpenLDAP project 2019-04-17 21:04:49 UTC
See also ITS#5592
Comment 2 Quanah Gibson-Mount 2019-04-17 21:04:49 UTC
changed notes
Comment 3 Shawn McKinney 2021-04-12 17:04:02 UTC
Cannot reproduce scenario.  

Details:

$OpenLDAP: slapd 2.5.3

Proxy config:
```
argsfile    "/var/run/openldap/slapd.args"

loglevel    stats sync
threads     8

### modules
modulepath /opt/openldap25/lib/openldap
moduleload back_meta
moduleload back_ldap

### schemas
include /opt/openldap25/etc/openldap/schema/core.schema
include /opt/openldap25/etc/openldap/schema/cosine.schema

sizelimit unlimited
timelimit unlimited

database   meta
suffix      ""
quarantine 20,+
rootdn      "dc=example,dc=com"
rootpw      "F00F1ghters"

uri        ldap://dewey/ou=Groups,dc=example,dc=com
uri        "ldap://louie/ou=bar,dc=example,dc=com"

access to *
     by * read
```

1st search (all three servers running):

```
$ ldapsearch -H ldap://huey -D "dc=example,dc=com" -w F00F1ghters -b "dc=example,dc=com" -s sub  objectclass=*


# Groups, example.com
dn: ou=Groups,dc=example,dc=com
objectClass: organizationalUnit
ou: Groups
description: Group container

# foo, Groups, example.com
dn: cn=foo,ou=Groups,dc=example,dc=com
member: cn=service-user,ou=admin,dc=example,dc=com
cn: foo
objectClass: groupOfNames
objectClass: top

# bar, example.com
dn: ou=bar,dc=example,dc=com
ou: bar
objectClass: organizationalUnit
objectClass: top

# bar, bar, example.com
dn: cn=bar,ou=bar,dc=example,dc=com
cn: bar
sn: bar
objectClass: person
objectClass: top

# search result
search: 2
result: 0 Success

# numResponses: 5
# numEntries: 4
```

2nd search, louie not running, same search op:

```
# Groups, example.com
dn: ou=Groups,dc=example,dc=com
objectClass: organizationalUnit
ou: Groups
description: Group container

# foo, Groups, example.com
dn: cn=foo,ou=Groups,dc=example,dc=com
member: cn=service-user,ou=admin,dc=example,dc=com
cn: foo
objectClass: groupOfNames
objectClass: top

# search result
search: 2
result: 0 Success

# numResponses: 3
# numEntries: 2
```

Proxy server logs show louie has been quarantined:

```
Apr 12 16:59:41 huey slapd[110695]: conn=1004 op=1 SRCH base="dc=example,dc=com" scope=2 deref=0 filter="(objectClass=*)"
Apr 12 16:59:41 huey slapd[110695]: conn=1004 op=1 meta_back_retry[1]: retrying URI="ldap://louie" DN="".
Apr 12 16:59:41 huey slapd[110695]: conn=1004 op=1 meta_back_quarantine[1]: enter.
```

3rd search, louie restarted, after waiting 20 seconds:

proxy server, louie exits quarantine:

```
Apr 12 17:02:01 huey slapd[110695]: conn=1005 op=1 SRCH base="dc=example,dc=com" scope=2 deref=0 filter="(objectClass=*)"
Apr 12 17:02:01 huey slapd[110695]: conn=1005 op=1 meta_back_init_one_conn[1]: quarantine retry block #0 try #0.
Apr 12 17:02:01 huey slapd[110695]: conn=1005 op=1 SEARCH RESULT tag=101 err=0 qtime=0.000020 etime=0.004103 nentries=4 text=
Apr 12 17:02:01 huey slapd[110695]: conn=1005 op=1 meta_back_quarantine[1]: exit.
```

search returns all results as expected:


```
# Groups, example.com
dn: ou=Groups,dc=example,dc=com
objectClass: organizationalUnit
ou: Groups
description: Group container

# foo, Groups, example.com
dn: cn=foo,ou=Groups,dc=example,dc=com
member: cn=service-user,ou=admin,dc=example,dc=com
cn: foo
objectClass: groupOfNames
objectClass: top

# bar, example.com
dn: ou=bar,dc=example,dc=com
ou: bar
objectClass: organizationalUnit
objectClass: top

# bar, bar, example.com
dn: cn=bar,ou=bar,dc=example,dc=com
cn: bar
sn: bar
objectClass: person
objectClass: top

# search result
search: 2
result: 0 Success

# numResponses: 5
# numEntries: 4
```
Comment 4 Shawn McKinney 2021-04-12 17:23:30 UTC
Can reproduce the proxy not lifting the quarantine if searches continue during the quarantine period.

Once the operations stop, the quarantine will lift after the specified elapsed time.

For example:

Three servers:
a. huey (proxy)
b. dewey (backend server)
c. louie (backend server)

1. Stop louie
2. Issue search to huey, only dewey's entries in result set (as expected)
3. Start louie
4. Issue searches before louie's quarantine expires (e.g. 20 seconds)
5. Louie will remain in quarantine while the search ops continue
6. Stop search ops
7. Wait 20 seconds
8. Louie's quarantine expires
9. Start searches again, all of dewey's and louie's entries are now in the result set
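
The steps above can be modelled with a toy timer. This is only a sketch of the observed behaviour, not slapd's implementation: the class `ToyQuarantine`, its fields, and the 5-second search cadence are all assumptions made for illustration.

```python
class ToyQuarantine:
    """Toy model of the observed behaviour; not slapd's implementation."""

    def __init__(self, interval: int) -> None:
        self.interval = interval
        self.quarantined = False
        self.last = 0  # time the quarantine timer was last (re)started

    def fail(self, now: int) -> None:
        self.quarantined = True
        self.last = now

    def search(self, now: int, target_up: bool) -> bool:
        """Return True if the target was contacted and answered."""
        if self.quarantined:
            if now < self.last + self.interval:
                # Observed behaviour: the search itself restarts the timer.
                self.last = now
                return False
            # Interval elapsed: allow one retry.
            if target_up:
                self.quarantined = False
                return True
            self.fail(now)
            return False
        if not target_up:
            self.fail(now)
            return False
        return True


q = ToyQuarantine(interval=20)
results = [q.search(0, target_up=False)]                           # steps 1-2
results += [q.search(t, target_up=True) for t in range(5, 65, 5)]  # steps 3-5
after_pause = q.search(60 + 20, target_up=True)                    # steps 6-9
print(results, after_pause)
```

In this model every search between t=5 and t=60 misses louie even though it is back up, and the first search after a 20-second pause succeeds.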
Comment 5 Shawn McKinney 2021-04-14 15:28:43 UTC
The quarantine timer is reset on each search, even after the backend server has been restarted.

Another scenario

1. stop louie
2. perform search, 20 second quarantine starts
3. start louie
4. perform another search before quarantine lifts, it resets to another 20 seconds
Comment 7 Quanah Gibson-Mount 2021-05-10 15:10:41 UTC
Commits: 
  • 180092fe 
by Ondřej Kuzník at 2021-05-07T19:26:19+00:00 
ITS#8721 Regression test


  • de0caafe 
by Ondřej Kuzník at 2021-05-07T19:26:19+00:00 
ITS#8721 Do not update ri_last unless we're actually retrying
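
As an illustration of the idea in the second commit subject, here is a minimal sketch in which the timestamp is only written when a real connection attempt fails. It is not the patch itself: only the name `ri_last` comes from the commit message, while `ToyTarget`, `search`, and the use of `None` for "not quarantined" are simplifying assumptions.

```python
from typing import List, Optional


class ToyTarget:
    """Toy model of the policy named in the commit subject; not slapd code."""

    def __init__(self, interval: int) -> None:
        self.interval = interval
        # Time of the last real connection attempt that failed.
        # None stands in for "not quarantined" (a simplification).
        self.ri_last: Optional[int] = None

    def search(self, now: int, target_up: bool) -> bool:
        """Return True if the target was contacted and answered."""
        if self.ri_last is not None and now < self.ri_last + self.interval:
            # Quarantined and not yet due for a retry:
            # skip the target and leave ri_last untouched.
            return False
        if target_up:
            self.ri_last = None  # first contact or retry succeeded
            return True
        self.ri_last = now  # a real attempt failed: record when
        return False


target = ToyTarget(interval=20)
times = [0, 5, 10, 15, 20, 25]
# The backend is down at t=0 and back up from t=5 onwards.
outcomes: List[bool] = [target.search(t, target_up=(t >= 5)) for t in times]
print(outcomes)
```

With searches arriving every 5 seconds, the target is skipped until t=20 and then retried successfully, because the intermediate searches no longer push the timer forward.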
Comment 8 Quanah Gibson-Mount 2021-06-23 15:36:52 UTC
*** Issue 8268 has been marked as a duplicate of this issue. ***