Issue 6243 - back-monitor fails to report entry cache usage
Status: RESOLVED PARTIAL
Alias: None
Product: OpenLDAP
Classification: Unclassified
Component: slapd
Version: 2.4.17
Hardware: All All
Importance: --- normal
Target Milestone: ---
Assignee: OpenLDAP project
Reported: 2009-08-05 17:35 UTC by Quanah Gibson-Mount
Modified: 2014-08-01 21:05 UTC

Description Quanah Gibson-Mount 2009-08-05 17:35:57 UTC
Full_Name: Quanah Gibson-Mount
Version: 2.4.17
OS: Linux 2.6
URL: ftp://ftp.openldap.org/incoming/
Submission from: (NULL) (69.235.224.133)


Querying back-monitor for the values of the various caches reports the
entry cache as empty:

# Frontend, Databases, Monitor
dn: cn=Frontend,cn=Databases,cn=Monitor
structuralObjectClass: monitoredObject
creatorsName: cn=config
modifiersName: cn=config
createTimestamp: 20090805044038Z
modifyTimestamp: 20090805044038Z
monitoredInfo: frontend
monitorIsShadow: FALSE
namingContexts:
readOnly: FALSE
olmBDBEntryCache: 0
olmBDBDNCache: 592
olmBDBIDLCache: 502
olmDbDirectory: /opt/zimbra/data/ldap/hdb/db/
entryDN: cn=Frontend,cn=Databases,cn=Monitor
subschemaSubentry: cn=Subschema
hasSubordinates: FALSE
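
A minimal sketch of how the counters in an entry like the one above could be checked mechanically. It assumes the monitor output has already been saved as plain LDIF text; the helper names (parse_ldif_entry, cache_counters) are hypothetical, and the parser deliberately ignores base64 values and folded lines.

```python
def parse_ldif_entry(text):
    """Parse one simple LDIF entry into {attribute: [values]}.

    Only plain "attr: value" lines are handled; base64 ("::") values
    and folded continuation lines are out of scope for this sketch.
    """
    entry = {}
    for line in text.splitlines():
        line = line.rstrip()
        if not line or line.startswith("#"):
            continue
        attr, _, value = line.partition(":")
        entry.setdefault(attr.strip(), []).append(value.strip())
    return entry


# Cache counter attributes as they appear in the monitor entry above.
CACHE_ATTRS = ("olmBDBEntryCache", "olmBDBDNCache", "olmBDBIDLCache")


def cache_counters(entry):
    """Return the cache counters present in an entry as integers."""
    return {a: int(entry[a][0]) for a in CACHE_ATTRS if a in entry}


# Abbreviated copy of the Frontend entry from the report.
frontend = parse_ldif_entry("""\
# Frontend, Databases, Monitor
dn: cn=Frontend,cn=Databases,cn=Monitor
monitoredInfo: frontend
olmBDBEntryCache: 0
olmBDBDNCache: 592
olmBDBIDLCache: 502
olmDbDirectory: /opt/zimbra/data/ldap/hdb/db/
""")

counters = cache_counters(frontend)
empty = [name for name, value in counters.items() if value == 0]
print(counters)
print("empty caches:", empty)
```

Only the entry cache is flagged here, which matches the symptom described in the report.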

Comment 1 Quanah Gibson-Mount 2009-08-05 18:03:35 UTC


This may be specific to glued databases (databases rooted at "").  The
caches are reported under the Frontend database instead of under the
actual BDB database, which is Database 3 and shows no cache
information:

# Database 3, Databases, Monitor
dn: cn=Database 3,cn=Databases,cn=Monitor
structuralObjectClass: monitoredObject
creatorsName: cn=config
modifiersName: cn=config
createTimestamp: 20090805044038Z
modifyTimestamp: 20090805044038Z
monitoredInfo: hdb
monitorIsShadow: FALSE
namingContexts:
readOnly: FALSE
monitorOverlay: accesslog
monitorOverlay: syncprov
entryDN: cn=Database 3,cn=Databases,cn=Monitor
subschemaSubentry: cn=Subschema
hasSubordinates: TRUE

I see the same behavior on the replica, where it is Database 2 instead of 
database 3, as it has no accesslog DB.  On the master, the accesslog DB 
caches are correct:

# Database 2, Databases, Monitor
dn: cn=Database 2,cn=Databases,cn=Monitor
structuralObjectClass: monitoredObject
creatorsName: cn=config
modifiersName: cn=config
createTimestamp: 20090805044038Z
modifyTimestamp: 20090805044038Z
monitoredInfo: hdb
monitorIsShadow: FALSE
namingContexts: cn=accesslog
readOnly: FALSE
monitorOverlay: syncprov
olmBDBEntryCache: 297
olmBDBDNCache: 517
olmBDBIDLCache: 9
olmDbDirectory: /opt/zimbra/data/ldap/accesslog/db/
entryDN: cn=Database 2,cn=Databases,cn=Monitor
subschemaSubentry: cn=Subschema
hasSubordinates: TRUE
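
The comparison between Database 2 and Database 3 can be sketched as a small check over several monitor entries. This is only an illustration under the assumption that the entries are available as blank-line-separated LDIF text; split_entries and missing_cache_stats are hypothetical helper names, and the LDIF below is abbreviated from the entries in this comment.

```python
def split_entries(ldif):
    """Split blank-line-separated LDIF text into a list of entry dicts."""
    entries = []
    for block in ldif.strip().split("\n\n"):
        entry = {}
        for line in block.splitlines():
            if line.startswith("#") or ":" not in line:
                continue
            attr, _, value = line.partition(":")
            entry.setdefault(attr.strip(), []).append(value.strip())
        if entry:
            entries.append(entry)
    return entries


def missing_cache_stats(entries):
    """DNs of bdb/hdb databases that expose no olmBDBEntryCache value."""
    return [
        e["dn"][0]
        for e in entries
        if e.get("monitoredInfo", [""])[0] in ("bdb", "hdb")
        and "olmBDBEntryCache" not in e
    ]


MONITOR_LDIF = """\
dn: cn=Database 2,cn=Databases,cn=Monitor
monitoredInfo: hdb
namingContexts: cn=accesslog
olmBDBEntryCache: 297
olmBDBDNCache: 517
olmBDBIDLCache: 9

dn: cn=Database 3,cn=Databases,cn=Monitor
monitoredInfo: hdb
namingContexts:
monitorOverlay: accesslog
monitorOverlay: syncprov
"""

without_stats = missing_cache_stats(split_entries(MONITOR_LDIF))
print(without_stats)
```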

--Quanah


--

Quanah Gibson-Mount
Principal Software Engineer
Zimbra, Inc
--------------------
Zimbra ::  the leader in open source messaging and collaboration

Comment 2 ando@openldap.org 2009-08-05 19:17:27 UTC
> This may be specific to glued databases (databases rooted at "").

The problem could be partially addressed by having back-bdb (which
maintains this data in the monitor backend) check its subordinates as
soon as it discovers it's a glue instance.  However, this raises two
separate questions:

- is the aggregate information resulting from adding all glued databases
cache usage still useful?

- what happens if heterogeneous databases are glued?  Significantly, what
if the superior database is not bdb/hdb?

Probably, the monitor database should also present subordinate databases
as separate entries.

p.
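
To make the two questions above concrete, here is a rough sketch of what aggregating cache usage over glued databases might look like. It is not slapd code: the Database class, the suffixes, and the numbers are all made up for illustration. It shows that a single sum hides the per-database figures, and that a non-bdb/hdb database has nothing to contribute.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Database:
    suffix: str
    backend: str                       # e.g. "hdb", "bdb", "ldap"
    entry_cache: Optional[int] = None  # None: backend keeps no such counter


def aggregate_entry_cache(databases):
    """Sum entry-cache usage over bdb/hdb databases only.

    Returns (total, skipped_suffixes); databases without a comparable
    counter are listed separately instead of being counted as zero.
    """
    total = 0
    skipped = []
    for db in databases:
        if db.backend in ("bdb", "hdb") and db.entry_cache is not None:
            total += db.entry_cache
        else:
            skipped.append(db.suffix)
    return total, skipped


# Hypothetical heterogeneous glue setup.
glued = [
    Database("ou=people,dc=example,dc=com", "hdb", 1200),
    Database("ou=groups,dc=example,dc=com", "bdb", 30),
    Database("ou=remote,dc=example,dc=com", "ldap"),
]

total, skipped = aggregate_entry_cache(glued)
print(total, skipped)
```

The total alone does not tell which database holds most of the cached entries, which is one argument for presenting subordinates as separate entries.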

Comment 3 Quanah Gibson-Mount 2009-08-06 16:40:00 UTC

Interestingly, in my case, there's only one real database in play for the 
glue as it is, since it's rooted at "".  All other database definitions 
come before it (cn=config, cn=accesslog, cn=monitor).

I'm also curious why back-monitor reports stats for the other caches but 
specifically not for the entry cache.

--Quanah



Comment 4 ando@openldap.org 2009-08-06 17:25:16 UTC
> Interestingly, in my case, there's only one real database in play for the
> glue as it is, since it's rooted at "".  All other database definitions
> come before it (cn=config, cn=accesslog, cn=monitor).

but... are you actually gluing something?  In any case, this is not
specific to the case of empty suffix in the glue database.  I could easily
reproduce it with a "normal" glued setup, and I was about to start fixing
things when the two above questions came to my mind.

> I'm also curious why back-monitor develops stats for the other caches but
> specifically not for entry cache.

I haven't looked in detail, but it makes sense that some operation occurs
within the glue database which requires caching something, but not
entries.

- The entry cache monitor shows the value of bdb->bi_cache.c_cursize;

- the DN cache monitor shows the value of bdb->bi_cache.c_eiused, which
should be the number of entryinfo structures used;

- the IDL cache monitor shows the value of bdb->bi_idl_cache_size.

In my very simple tests, I only saw something populating the DN cache,
which means some internal operation required to allocate some entryinfo
structures that remain 'round.

In any case, it's only showing information related to its own database
structure; it is by no means collecting info from the glued databases.

p.
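
The mapping described above can be modelled roughly as follows. The field names (bi_cache.c_cursize, bi_cache.c_eiused, bi_idl_cache_size) are taken from the comment; the Python classes only mimic them and are not the real C structures, and all numbers are invented. The point of the sketch is that each instance reports only its own counters, so the superior's values say nothing about a glued subordinate.

```python
from dataclasses import dataclass, field


@dataclass
class Cache:
    c_cursize: int = 0  # entries currently in the entry cache
    c_eiused: int = 0   # entryinfo structures in use


@dataclass
class BdbInfo:
    bi_cache: Cache = field(default_factory=Cache)
    bi_idl_cache_size: int = 0


def monitor_values(bdb):
    """Map one database's own counters to the monitor attributes."""
    return {
        "olmBDBEntryCache": bdb.bi_cache.c_cursize,
        "olmBDBDNCache": bdb.bi_cache.c_eiused,
        "olmBDBIDLCache": bdb.bi_idl_cache_size,
    }


# Hypothetical glue superior: some entryinfo structures allocated,
# but no entries cached.
superior = BdbInfo(Cache(c_cursize=0, c_eiused=12), bi_idl_cache_size=3)

# Hypothetical glued subordinate holding the real data.
subordinate = BdbInfo(Cache(c_cursize=4000, c_eiused=4100), bi_idl_cache_size=800)

print(monitor_values(superior))
print(monitor_values(subordinate))
```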

Comment 5 Quanah Gibson-Mount 2009-08-06 17:33:03 UTC

Ok, this makes sense, as the numbers it was reporting definitely seemed 
smaller than I was expecting.  I bet they are for pieces in the rootDSE.

--Quanah


Comment 6 ando@openldap.org 2009-08-06 19:40:08 UTC
moved from Incoming to Software Enhancements
Comment 7 Quanah Gibson-Mount 2009-08-13 17:35:10 UTC
Any chance we can get this fixed for 2.4.18 for the real backends?  I think 
making it so you can see the counters per-real backend should definitely be 
in place.  How to handle the frontend is interesting and definitely should 
be addressed as well but isn't (to me at least) quite as crucial at this 
moment. ;)

--Quanah




Comment 8 ando@openldap.org 2009-08-13 17:39:50 UTC
> Any chance we can get this fixed for 2.4.18 for the real backends?  I
> think
> making it so you can see the counters per-real backend should definitely
> be
> in place.  How to handle the frontend is interesting and definitely should
> be addressed as well but isn't (to me at least) quite as crucial at this
> moment. ;)

A quick hack would be to remove the test for SLAP_GLUE_SUBORDINATE() in
monitor_subsys_database_init().  This would expose subordinate databases
and thus should allow back-bdb and back-hdb to hook in their cache stats.
This is probably the best approach.  I would complement it by adding an
olmIsSubordinate attribute with value TRUE, and/or an olmSuperiorSuffix
attribute containing the suffix of the superior.

p.
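
The effect of the proposed change can be simulated in a few lines. This is a sketch of the idea only, not the actual body of monitor_subsys_database_init(): the dictionaries, the expose_subordinates switch, and the entry numbering are simplifying assumptions. olmIsSubordinate and olmSuperiorSuffix are the attribute names proposed in the comment above.

```python
def monitor_database_entries(databases, expose_subordinates):
    """Build simplified monitor entries for a list of databases.

    With expose_subordinates=False, glue subordinates are skipped,
    which models the SLAP_GLUE_SUBORDINATE() test. With True, they
    are listed and marked with the two proposed attributes.
    """
    entries = []
    for index, db in enumerate(databases):
        is_subordinate = db["superior_suffix"] is not None
        if is_subordinate and not expose_subordinates:
            continue
        entry = {
            "dn": f"cn=Database {index},cn=Databases,cn=Monitor",
            "namingContexts": db["suffix"],
        }
        if is_subordinate:
            entry["olmIsSubordinate"] = "TRUE"
            entry["olmSuperiorSuffix"] = db["superior_suffix"]
        entries.append(entry)
    return entries


# Hypothetical glued setup: one superior, one subordinate.
databases = [
    {"suffix": "dc=example,dc=com", "superior_suffix": None},
    {"suffix": "ou=people,dc=example,dc=com",
     "superior_suffix": "dc=example,dc=com"},
]

hidden = monitor_database_entries(databases, expose_subordinates=False)
exposed = monitor_database_entries(databases, expose_subordinates=True)
print(len(hidden), len(exposed))
```

Once a subordinate has its own monitor entry, a backend has a place to attach its cache counters, which is what the quick hack relies on.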

Comment 9 ando@openldap.org 2009-08-13 18:56:18 UTC
> A quick hack consists in removing the test for SLAP_GLUE_SUBORDINATE() in
> monitor_subsys_database_init().  This will expose subordinate databases
> and thus should allow back-bdb and back-hdb to hook their cache stats
> stuff.  Probably this is the best approach.  I would complement it by
> adding a olmIsSubordinate attribute with TRUE value, and/or a
> olmSuperiorSuffix containing the suffix of the superior.

An even better approach would be to append subordinate databases below the
superior one...  but I'd leave this for 2.5 :)

p.
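
One way to picture "below the superior one" is in terms of the resulting monitor DNs. The layout below is a guess at what such nesting could look like, not something the thread specifies; monitor_dn and the database numbers are hypothetical.

```python
MONITOR_BASE = "cn=Databases,cn=Monitor"


def monitor_dn(db_rdn, superior_rdn=None):
    """Build a monitor DN, optionally nested under a superior database."""
    parts = [db_rdn]
    if superior_rdn is not None:
        parts.append(superior_rdn)
    parts.append(MONITOR_BASE)
    return ",".join(parts)


flat = monitor_dn("cn=Database 4")
nested = monitor_dn("cn=Database 4", superior_rdn="cn=Database 3")
print(flat)
print(nested)
```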

Comment 10 ando@openldap.org 2009-08-13 19:10:23 UTC
changed notes
moved from Software Enhancements to Development
Comment 11 Quanah Gibson-Mount 2009-08-13 19:52:13 UTC
Thanks for handling this. :)

--Quanah


Comment 12 ando@openldap.org 2009-08-13 23:50:32 UTC
changed state Open to Partial
Comment 13 Quanah Gibson-Mount 2009-08-14 20:57:08 UTC
changed notes
Comment 14 OpenLDAP project 2014-08-01 21:05:00 UTC
addressed in HEAD; needs work
addressed in RE24