Issue 8668 - pcache failing to update after entry moved
Summary: pcache failing to update after entry moved
Status: VERIFIED INVALID
Alias: None
Product: OpenLDAP
Classification: Unclassified
Component: overlays
Version: unspecified
Hardware: All All
Importance: --- normal
Target Milestone: ---
Assignee: Howard Chu
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2017-06-05 17:08 UTC by acrow@integrafin.co.uk
Modified: 2021-03-22 23:13 UTC
0 users

See Also:


Description acrow@integrafin.co.uk 2017-06-05 17:08:10 UTC
Full_Name: Alex Crow
Version: 2.4.40-13.el7
OS: Centos 7.3
URL: ftp://ftp.openldap.org/incoming/
Submission from: (NULL) (95.172.237.70)


I'm using OpenLDAP with the caching overlay as a proxy to AD, mostly for use
with Postfix and Dovecot.

I have been experiencing a strange issue whereby, when a user is moved to a
different OU in AD, the caching server initially returns only the original OU
until the cache entry expires. However, after this time, it returns both the
entry in the original OU and the entry in the new OU. This does not seem to
change even after the next expiry time has elapsed. I can only seem to clear out
the "old" result by wiping the cache's database.

Here is my slapd.conf:

### Schema includes ###########################################################
include                 /etc/openldap/schema/core.schema
include                 /etc/openldap/schema/cosine.schema
include                 /etc/openldap/schema/inetorgperson.schema
include                 /etc/openldap/schema/misc.schema
#include                 /etc/openldap/schema/nis.schema
include                 /etc/openldap/schema/custom.schema
include                 /etc/openldap/schema/adstuff.schema

## Module paths ##############################################################
modulepath              /usr/lib64/openldap/
moduleload              back_ldap
moduleload              pcache
#moduleload             rwm

# Main settings ###############################################################
TLSCACertificateFile /etc/openldap/cacerts/cacertchain.pem
TLSCertificateFile /etc/openldap/cacerts/certkey.pem
TLSCertificateKeyFile /etc/openldap/cacerts/certkey.pem
TLSVerifyClient   never

pidfile                 /var/run/openldap/slapd.pid
argsfile                /var/run/openldap/slapd.args
allow                   bind_v2

database config
rootdn "cn=admin,cn=config"
rootpw {SSHA}blahblahblah

### Database definition (Proxy to AD) #########################################
database                ldap
readonly                yes
protocol-version        3
rebind-as-user
uri                     "ldap://foo ldap://bar ..."
suffix                  "dc=foo,dc=bar,dc=net"
rootdn                  "dc=foo,dc=bar,dc=net"
timelimit               5


overlay pcache
pcache    bdb 100000 1 1000 100
pcacheAttrset 0 mail x-mailHost x-mailStore unixHomeDirectory
pcacheTemplate (sn=) 0 3600 0 0 1800
pcacheTemplate (cn=) 0 3600 0 0 1800
pcacheTemplate (mail=) 0 3600 0 0 1800
pcacheTemplate (&(objectClass=)(mail=)) 0 3600 0 0 1800
pcacheTemplate (&(objectClass=)(mail=*)) 0 3600 0 0 1800

cachesize 10000
directory /var/lib/ldap
index       objectClass eq
index       cn,sn,uid,mail  pres,eq,sub

### Logging ###################################################################
loglevel                0


Here is an example of a search returning two results from the cache:

# extended LDIF
#
# LDAPv3
# base <OU=baz,DC=foo,DC=bar,DC=net> with scope subtree
# filter: mail=test_ajc@integrafin.co.uk
# requesting: x-mailHost
#

# test_ajc, DMD, COPS, ...
dn: cn=test_ajc,ou=DMD,ou=COPS, ...
 dc=bar,dc=net
x-mailHost: imap.bar.net

# test_ajc, SysAdmin, ITDIV, ...
dn: cn=test_ajc,ou=SysAdmin,ou=ITDIV, ...
 dc=bar,dc=net
x-mailHost: imap.bar.net

# search result
search: 2
result: 0 Success

# numResponses: 3
# numEntries: 2

The newer, correct entry is the lower one.

We also occasionally suffer segfaults, e.g.:

[8432930.512516] slapd[19550]: segfault at 108 ip 00007f4204c401de sp
00007f41c1ff94d0 error 6 in libldap_r-2.4.so.2.10.3[7f4204c18000+56000]
[8434338.469945] slapd[30666]: segfault at 108 ip 00007f102a5c41de sp
00007f1014c744d0 error 6 in libldap_r-2.4.so.2.10.3[7f102a59c000+56000]
[8951331.245103] slapd[9653]: segfault at 11d8 ip 00007f01c523d1de sp
00007f01abffd4d0 error 6 in libldap_r-2.4.so.2.10.3[7f01c5215000+56000]
[10140511.797794] slapd[10247]: segfault at 108 ip 00007fbc84de01de sp
00007fbc477fc4d0 error 6 in libldap_r-2.4.so.2.10.3[7fbc84db8000+56000]

I've not determined what, if anything specific, triggers these.

Any insights much appreciated.

Alex 
Comment 1 Quanah Gibson-Mount 2017-06-06 15:42:47 UTC
--On Monday, June 05, 2017 6:08 PM +0000 acrow@integrafin.co.uk wrote:

> Full_Name: Alex Crow
> Version: 2.4.40-13.el7
> OS: Centos 7.3
> URL: ftp://ftp.openldap.org/incoming/
> Submission from: (NULL) (95.172.237.70)
>
>
> I'm using OpenLDAP with the caching overlay as a proxy to AD, mostly for
> use with Postfix and Dovecot.
>
> I have been experiencing a strange issue whereby, when a user is moved to
> a different OU in AD, the caching server initially returns only the
> original OU until the cache entry expires. However, after this time, it
> returns both the entry in the original OU and the entry in the new OU.
> This does not seem to change even after the next expiry time has elapsed.
> I can only seem to clear out the "old" result by wiping the cache's
> database.

Hi Alex,

The first thing to do would be to upgrade to OpenLDAP 2.4.44 or 2.4.45 and 
confirm you can reproduce the issue in a current release.  If you can, then 
you need to provide a full backtrace, where debug symbols are enabled (the 
"-g" flag for CFLAGS for gcc), and the slapd binary is not stripped (or if 
using packaged RPMs, the debuginfo etc bits are installed).

You can grab pre-compiled packages for OpenLDAP 2.4.44 from the LTB project 
at <http://ltb-project.org/wiki/download#openldap>.  I expect they'll have 
2.4.45 packages available soon as well.

Thanks,
Quanah


--

Quanah Gibson-Mount
Product Architect
Symas Corporation
Packaged, certified, and supported LDAP solutions powered by OpenLDAP:
<http://www.symas.com>


Comment 2 acrow@integrafin.co.uk 2017-06-19 13:36:43 UTC
On 06/06/17 16:42, quanah@symas.com wrote:
> --On Monday, June 05, 2017 6:08 PM +0000 acrow@integrafin.co.uk wrote:
>
>> Full_Name: Alex Crow
>> Version: 2.4.40-13.el7
>> OS: Centos 7.3
>> URL: ftp://ftp.openldap.org/incoming/
>> Submission from: (NULL) (95.172.237.70)
>>
>>
>> I'm using OpenLDAP with the caching overlay as a proxy to AD, mostly for
>> use with Postfix and Dovecot.
>>
>> I have been experiencing a strange issue whereby, when a user is moved to
>> a different OU in AD, the caching server initially returns only the
>> original OU until the cache entry expires. However, after this time, it
>> returns both the entry in the original OU and the entry in the new OU.
>> This does not seem to change even after the next expiry time has elapsed.
>> I can only seem to clear out the "old" result by wiping the cache's
>> database.
> Hi Alex,
>
> The first thing to do would be to upgrade to OpenLDAP 2.4.44 or 2.4.45 and
> confirm you can reproduce the issue in a current release.  If you can, then
> you need to provide a full backtrace, where debug symbols are enabled (the
> "-g" flag for CFLAGS for gcc), and the slapd binary is not stripped (or if
> using packaged RPMs, the debuginfo etc bits are installed).
>
> You can grab pre-compiled packages for OpenLDAP 2.4.44 from the LTB project
> at <http://ltb-project.org/wiki/download#openldap>.  I expect they'll have
> 2.4.45 packages available soon as well.
>
> Thanks,
> Quanah
>
Hi Quanah,

I've installed a new VM with the current LTB release and the crashes 
don't seem to be happening. However, we're still seeing problems with 
the cache overlay not working as expected. We had a user account moved 
between OUs this morning shortly after 9am, and now, after 2pm, the 
cache is still returning both the entry from the old OU and the new one 
(Postfix can only deal with a single result in this case).

Given my configuration I'd expect the cache to at least return the 
correct result after an hour at the most - or have I made a mistake?

pcacheAttrset 0 mail x-mailHost x-mailStore unixHomeDirectory displayName
pcacheTemplate (sn=) 0 3600 0 0 1800
pcacheTemplate (cn=) 0 3600 0 0 1800
pcacheTemplate (displayName=) 0 3600 0 0 1800
pcacheTemplate (mail=) 0 3600 0 0 1800
pcacheTemplate (&(objectClass=)(mail=)) 0 3600 0 0 1800
pcacheTemplate (&(objectClass=)(mail=*)) 0 3600 0 0 1800
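
For readers following along, the numeric fields in these templates can be read
as follows, going by the directive syntax given in slapo-pcache(5). This is an
annotated copy of one line from the configuration above, not a change to it,
and the field descriptions are a summary that should be checked against the
man page for the release in use:

```
# pcacheTemplate <template_string> <attrset_index> <ttl> [<negttl> [<limitttl> [<ttr>]]]
#   attrset_index  0     -> uses the attributes listed in pcacheAttrset 0
#   ttl            3600  -> a cached query stays valid for one hour
#   negttl         0     -> queries that return no entries are not cached
#   limitttl       0     -> results that hit a sizelimit are not cached
#   ttr            1800  -> cached entries are refreshed after 30 minutes
pcacheTemplate (mail=) 0 3600 0 0 1800
```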

Postfix is set up as below:

ldaptransport_server_host = ldap://ldapcache2
                   ldap://ldapcache
ldaptransport_timeout = 2
ldaptransport_search_base = OU=foo,DC=bar,DC=baz,DC=net
ldaptransport_query_filter = (mail=%s)
ldaptransport_result_attribute = x-mailHost
ldaptransport_result_filter = relay:%s
ldaptransport_expansion_limit = 1

Best regards

Alex

Comment 3 Howard Chu 2021-03-22 19:12:49 UTC
I was able to reproduce the behavior, but there is no bug here.

You're seeing the old entry because it's left over from a previous run of slapd.
The pcache overlay forgets what it knows about the cache contents on a restart
unless you configure pcachePersist. When you enable this setting, it remembers
its cached queries across restarts and will expire old entries instead of
returning them in new search requests.
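
For reference, a minimal sketch of enabling this in slapd.conf, reusing the
overlay lines from the original report. Placing the directive directly after
the pcache line is an assumption made for readability. slapo-pcache(5) also
cautions that the pcacheAttrset and pcacheTemplate directives should not change
across restarts while this setting is on, so check the man page for the
release in use:

```
overlay pcache
pcache    bdb 100000 1 1000 100
# Keep the record of cached queries across restarts (the default is FALSE)
pcachePersist TRUE
pcacheAttrset 0 mail x-mailHost x-mailStore unixHomeDirectory
pcacheTemplate (mail=) 0 3600 0 0 1800
```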
Comment 4 Howard Chu 2021-03-22 19:17:37 UTC
(In reply to Howard Chu from comment #3)
> I was able to reproduce the behavior, but there is no bug here.
> 
> You're seeing the old entry because it's left over from a previous run of
> slapd.
> The pcache overlay will forget what it knows about the cache contents on a
> restart, unless you configure pcachePersist. When you enable this setting,
> then it remembers its cached queries across restarts and will expire old
> entries instead of returning them in new search requests.

To be clear: I could only reproduce this behavior if slapd was stopped and restarted between queries:

1: perform a cacheable search
2: shut down slapd and restart it before the cached query expires
3: move entry on remote server
4: repeat the cacheable search
5: repeat the cacheable search again

On step 1, pcache will query the remote and store a copy of the returned entry.
On step 2, the restart will make pcache forget it stored anything.
On step 4, pcache will query the remote server because it thinks its cache is empty, and store and return the new result (only).
On step 5, pcache will serve the request entirely from cache, and return both the old and new entries.
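
The sequence above can be sketched as a toy model. This is only an illustration
under stated assumptions, not pcache's actual implementation: it assumes the
cached entries live in a store that survives restarts, that the record of which
queries are cached lives in memory, and that expiry removes only the entries
tagged with the expiring query. The names Remote, ToyCache and run, and the
example DNs, are hypothetical.

```python
import itertools

TTL = 3600  # seconds, same value as the ttl in the templates in this report


class Remote:
    """Stand-in for the remote (AD) server: maps DN -> mail value."""

    def __init__(self):
        self.entries = {}

    def search(self, mail):
        return sorted(dn for dn, m in self.entries.items() if m == mail)


class ToyCache:
    """Toy proxy cache: persistent entry store plus an in-memory query index."""

    def __init__(self, remote, persist=False):
        self.remote = remote
        self.persist = persist
        self.store = {}   # survives restarts: dn -> (mail, query_id)
        self.index = {}   # in memory: mail filter -> (query_id, expiry)
        # Query ids are never reused, so entries from a forgotten query
        # cannot be matched by a later one.
        self._ids = itertools.count(1)

    def restart(self):
        # Without persistence the index is lost; the stored entries remain.
        if not self.persist:
            self.index = {}

    def _expire(self, now):
        for mail, (qid, expiry) in list(self.index.items()):
            if expiry <= now:
                del self.index[mail]
                # Only entries tagged with the expiring query are removed.
                self.store = {dn: v for dn, v in self.store.items() if v[1] != qid}

    def search(self, mail, now):
        self._expire(now)
        if mail not in self.index:
            # Cache miss: ask the remote, store the result, return it as-is.
            qid = next(self._ids)
            result = self.remote.search(mail)
            for dn in result:
                self.store[dn] = (mail, qid)
            self.index[mail] = (qid, now + TTL)
            return result
        # Cache hit: answer from whatever matching entries are in the store.
        return sorted(dn for dn, (m, _) in self.store.items() if m == mail)


def run(persist):
    old = "cn=test,ou=old,dc=example"
    new = "cn=test,ou=new,dc=example"
    mail = "test@example.com"
    remote = Remote()
    remote.entries[old] = mail
    cache = ToyCache(remote, persist)

    cache.search(mail, now=0)              # step 1: cacheable search
    cache.restart()                        # step 2: restart before expiry
    del remote.entries[old]                # step 3: move entry on the remote
    remote.entries[new] = mail
    step4 = cache.search(mail, now=10)     # step 4: repeat the search
    step5 = cache.search(mail, now=20)     # step 5: repeat it again
    # Well after expiry: one search to refetch, one served from cache.
    cache.search(mail, now=2 * TTL + 100)
    later = cache.search(mail, now=2 * TTL + 110)
    return step4, step5, later


if __name__ == "__main__":
    for persist in (False, True):
        print("persist =", persist, run(persist))
```

In this model, with persist=False the search in step 5 returns both DNs and
keeps doing so after later expiries, because the old entry is tagged with a
query the cache no longer knows about. With persist=True the old entry keeps
being served until its query expires and is then dropped, leaving only the new
DN.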