Issue 2040 - back-bdb resource leak
Summary: back-bdb resource leak
Status: VERIFIED FIXED
Alias: None
Product: OpenLDAP
Classification: Unclassified
Component: slapd
Version: unspecified
Hardware: All All
Importance: --- normal
Target Milestone: ---
Assignee: OpenLDAP project
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2002-08-23 06:20 UTC by steven.wilton@team.eftel.com
Modified: 2014-08-01 21:06 UTC

See Also:


Attachments
idl.c (24.75 KB, text/x-c)
2002-08-27 04:07 UTC, Howard Chu

Description Howard Chu 2002-08-23 01:28:20 UTC
The leaks appear to be internal to Berkeley DB; there is nothing we can do
about it. Upgrade to Berkeley DB 4.1 when it becomes available. In the meantime, my tests
indicate that the number of dangling lockers remains unchanged on repeated
queries. E.g., search for "objectclass=foo" and then follow up with a search for
"objectclass=bar" - on my system (SuSE Linux, glibc 2.0.7) the number of lockers
and locks doesn't change.

BDB itself is keeping a number of handles on each index database file, as I
mentioned before. The first time you reference an indexed attribute, back-bdb
must open the database file for that index, and this will consume a couple of
lockers. But repeated references to an already opened index shouldn't increase
the number of lockers any further, nor does it appear to here.

You should remove the "pres" index for objectclass. It's implicitly assumed that
every entry has an objectclass, so that index is ignored.
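
As a rough illustration of the advice above, the following Python sketch scans slapd.conf-style index directives and reports any line that requests a "pres" index on objectClass. The function name find_redundant_pres and the simplified parsing (whitespace-separated fields, comma-separated attribute and index-type lists) are assumptions made for this example; they are not part of slapd.

```python
def find_redundant_pres(config_text):
    """Return the index lines that request a 'pres' index on objectClass.

    Assumes the simple form 'index <attr[,attr...]> <type[,type...]>'.
    """
    flagged = []
    for line in config_text.splitlines():
        fields = line.split()
        # Skip anything that is not a complete 'index' directive.
        if len(fields) < 3 or fields[0].lower() != "index":
            continue
        attrs = [a.lower() for a in fields[1].split(",")]
        types = [t.lower() for t in fields[2].split(",")]
        if "objectclass" in attrs and "pres" in types:
            flagged.append(line.strip())
    return flagged


# Index directives as quoted later in this report.
sample = """\
index   objectClass             pres,eq
index   cn,sn,uid               eq
index   uidNumber,gidNumber,memberUid   eq
"""
print(find_redundant_pres(sample))
```

Only the objectClass line is reported; changing its index types to "eq" alone makes the list come back empty.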



Comment 1 Howard Chu 2002-08-23 01:29:24 UTC
changed notes
changed state Open to Feedback
Comment 2 steven.wilton@team.eftel.com 2002-08-23 06:20:33 UTC
Full_Name: Steven Wilton
Version: 2.1.4
OS: Debian Linux 3.0 (woody) 2.4.18 kernel
URL: ftp://ftp.openldap.org/incoming/
Submission from: (NULL) (203.24.100.132)



I noticed that the LDAP server is running out of 'locker' entries in the
back-bdb code, and have come up with the following results:
- Every search on an indexed LDAP attribute results in 1x extra "Current
lock" being used
- Every search that includes (&(objectclass=posixAccount)(uidNumber=x)) with
objectclass indexed as pres,eq results in 5x lockers and 1x lock being used
- Every search that includes (&(objectclass=posixAccount)(uidNumber=x)) with
objectclass indexed as eq results in 2x lockers and 1x lock being used
- Every search that includes (&(uid=y)(uidNumber=x)) (where both attributes are
indexed 'eq') results in 2x locks being used
- Every search that includes no indexed fields results in no lost locks or
lockers.

We are using version 4.0.14 of the Berkeley DB library, glibc 2.2.5, gcc-3.0 under
Linux.  The following is the output from various searches and db_stat commands:

sv1:~# db4.0_stat -c -h /var/lib/ldap/

101 Last allocated locker ID.
9       Number of lock modes.
1000    Maximum number of locks possible.
1000    Maximum number of lockers possible.
1000    Maximum number of objects possible.
24      Current locks.
27      Maximum number of locks so far.
61      Current number of lockers.
62      Maximum number  lockers so far.
0       Current number lock objects.
6       Maximum number of lock objects so far.
306     Number of lock requests.
306     Number of lock releases.
0       Number of lock requests that would have waited.
0       Number of lock conflicts.
0       Number of deadlocks.
0       Number of transaction timeouts.
0       Number of lock timeouts.
352KB   Lock region size (360448 bytes).
0       The number of region locks granted after waiting.
945     The number of region locks granted without waiting.

sv1:~# ldapsearch -b o=EFTEL '(&(objectclass=posixAccount)(uidNumber=20678))'

# extended LDIF
#
# LDAPv3
# filter: (&(objectclass=posixAccount)(uidNumber=20678))
# requesting: ALL
#

# prolfe, People, q-net, net, au, EFTEL
dn: uid=prolfe,...,o=EFTEL
objectClass: posixAccount
...
uid: prolfe
...

# search result
search: 2
result: 0 Success

# numResponses: 2
# numEntries: 1

sv1:~# db4.0_stat -c -h /var/lib/ldap/

107 Last allocated locker ID.
9       Number of lock modes.
1000    Maximum number of locks possible.
1000    Maximum number of lockers possible.
1000    Maximum number of objects possible.
25      Current locks.              <----- +1
28      Maximum number of locks so far.
66      Current number of lockers.  <----- +5
67      Maximum number  lockers so far.
0       Current number lock objects.
6       Maximum number of lock objects so far.
318     Number of lock requests.
318     Number of lock releases.
0       Number of lock requests that would have waited.
0       Number of lock conflicts.
0       Number of deadlocks.
0       Number of transaction timeouts.
0       Number of lock timeouts.
352KB   Lock region size (360448 bytes).
0       The number of region locks granted after waiting.
979     The number of region locks granted without waiting.

sv1:~# ldapsearch -b o=EFTEL uid=swilton

# extended LDIF
#
# LDAPv3
# filter: uid=swilton
# requesting: ALL
#

# swilton, People, vision, net, au, EFTEL
dn: uid=swilton,...,o=EFTEL
objectClass: posixAccount
...
uid: swilton
...

# swilton, People, q-net, net, au, EFTEL
dn: uid=swilton,...,o=EFTEL
objectClass: posixAccount
...
uid: swilton
...

# swilton, People, eftel, net, au, EFTEL
dn: uid=swilton,...,o=EFTEL
objectClass: posixAccount
...
uid: swilton
...

# search result
search: 2
result: 0 Success

# numResponses: 4
# numEntries: 3

sv1:~# db4.0_stat -c -h /var/lib/ldap/

108 Last allocated locker ID.
9       Number of lock modes.
1000    Maximum number of locks possible.
1000    Maximum number of lockers possible.
1000    Maximum number of objects possible.
26      Current locks.              <---- +1
29      Maximum number of locks so far.
66      Current number of lockers.  <---- unchanged
67      Maximum number  lockers so far.
0       Current number lock objects.
6       Maximum number of lock objects so far.
327     Number of lock requests.
327     Number of lock releases.
0       Number of lock requests that would have waited.
0       Number of lock conflicts.
0       Number of deadlocks.
0       Number of transaction timeouts.
0       Number of lock timeouts.
352KB   Lock region size (360448 bytes).
0       The number of region locks granted after waiting.
1002    The number of region locks granted without waiting.


And on an unindexed field...

sv1:~# db4.0_stat -c -h /var/lib/ldap/
109 Last allocated locker ID.
9       Number of lock modes.
1000    Maximum number of locks possible.
1000    Maximum number of lockers possible.
1000    Maximum number of objects possible.
27      Current locks.
30      Maximum number of locks so far.
66      Current number of lockers.
67      Maximum number  lockers so far.
0       Current number lock objects.
6       Maximum number of lock objects so far.
334     Number of lock requests.
334     Number of lock releases.
0       Number of lock requests that would have waited.
0       Number of lock conflicts.
0       Number of deadlocks.
0       Number of transaction timeouts.
0       Number of lock timeouts.
352KB   Lock region size (360448 bytes).
0       The number of region locks granted after waiting.
1021    The number of region locks granted without waiting.

sv1:~# ldapsearch -b o=EFTEL mail=prolfe@q-net.net.au

# extended LDIF
#
# LDAPv3
# filter: mail=prolfe@q-net.net.au
# requesting: ALL
#

# prolfe, People, q-net, net, au, EFTEL
dn: uid=prolfe,...,dc=au,o=EFTEL
...
mail: prolfe@q-net.net.au
...

# search result
search: 2
result: 0 Success

# numResponses: 2
# numEntries: 1

sv1:~# db4.0_stat -c -h /var/lib/ldap/

110 Last allocated locker ID.
9       Number of lock modes.
1000    Maximum number of locks possible.
1000    Maximum number of lockers possible.
1000    Maximum number of objects possible.
27      Current locks.               <--- unchanged
30      Maximum number of locks so far.
66      Current number of lockers.   <--- unchanged
67      Maximum number  lockers so far.
0       Current number lock objects.
6       Maximum number of lock objects so far.
68181   Number of lock requests.
68181   Number of lock releases.
0       Number of lock requests that would have waited.
0       Number of lock conflicts.
0       Number of deadlocks.
0       Number of transaction timeouts.
0       Number of lock timeouts.
352KB   Lock region size (360448 bytes).
0       The number of region locks granted after waiting.
114101  The number of region locks granted without waiting.



Indexes on the above queries are as follows:

index   objectClass             pres,eq
index   cn,sn,uid               eq
index   uidNumber,gidNumber,memberUid   eq


During the above test queries, no other queries were sent to the LDAP server
from any other program.
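
The before/after comparison done by hand above could be scripted roughly as follows. This is a minimal Python sketch that assumes the db_stat -c output keeps the "<number> <description>." layout shown in this report; the helper names parse_lock_stats and lock_delta are made up for the example.

```python
import re


def parse_lock_stats(text):
    """Map each 'db_stat -c' description to its integer value.

    Trailing annotations such as '<----- +1' are ignored, and lines whose
    first field is not a plain integer (e.g. '352KB ...') are skipped.
    """
    stats = {}
    for line in text.splitlines():
        match = re.match(r"\s*(\d+)\s+(.+?)\.?\s*(?:<-+.*)?$", line)
        if match:
            # Collapse runs of spaces so keys are stable across snapshots.
            key = " ".join(match.group(2).split())
            stats[key] = int(match.group(1))
    return stats


def lock_delta(before, after,
               keys=("Current locks", "Current number of lockers")):
    """Return how much each selected counter grew between two snapshots."""
    return {key: after[key] - before[key] for key in keys}


# Excerpts from the first two snapshots in this report.
before = parse_lock_stats("""\
24      Current locks.
61      Current number of lockers.
352KB   Lock region size (360448 bytes).
""")
after = parse_lock_stats("""\
25      Current locks.              <----- +1
66      Current number of lockers.  <----- +5
352KB   Lock region size (360448 bytes).
""")
print(lock_delta(before, after))
```

For these excerpts the delta is one lock and five lockers, which matches the annotations added by hand in the second snapshot.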

Comment 3 Howard Chu 2002-08-27 04:07:03 UTC
-----Original Message-----
From: Steven Wilton [mailto:steven.wilton@team.eftel.com]

Howard,

I did find the source of the locker leak using the attached copy of idl.c,
which produces the following output.  You will notice that the leak occurs when
"rc = cursor->c_get( cursor, key, &data, flags | DB_NEXT_DUP );" is run more
than once in the loop.

(&(uid=swilton2)(objectclass=posixAccount)) (leak: lockers go from 19 to 24)

=> bdb_equality_candidates
=> key_read
=> bdb_key_read - lockers1 19
=> bdb_key_read - lockers2 19
=> bdb_key_read - lockers3 19
=> bdb_key_read - lockers3.1 19
=> bdb_key_read - lockers3.2 19
=> bdb_key_read - lockers4 19
=> bdb_key_read - lockers5 19
<= bdb_index_read 1 candidates
<= bdb_equality_candidates id=1, first=8860, last=8860
=> bdb_equality_candidates
=> key_read
=> bdb_key_read - lockers1 19
=> bdb_key_read - lockers2 19
=> bdb_key_read - lockers3 19
=> bdb_key_read - lockers3.1 19
=> bdb_key_read - lockers3.2 19
=> bdb_key_read - lockers3.1 19
=> bdb_key_read - lockers3.2 20
=> bdb_key_read - lockers3.1 20
=> bdb_key_read - lockers3.2 21
=> bdb_key_read - lockers3.1 21
=> bdb_key_read - lockers3.2 22
=> bdb_key_read - lockers3.1 22
=> bdb_key_read - lockers3.2 23
=> bdb_key_read - lockers3.1 23
=> bdb_key_read - lockers3.2 24
=> bdb_key_read - lockers4 24
=> bdb_key_read - lockers5 24
<= bdb_index_read 22467 candidates
<= bdb_equality_candidates id=1, first=8860, last=8860


(&(gidNumber=200)(uidNumber=14396)) (no leak: lockers stay at 24)

=> bdb_equality_candidates
=> key_read
=> bdb_key_read - lockers1 24
=> bdb_key_read - lockers2 24
=> bdb_key_read - lockers3 24
=> bdb_key_read - lockers3.1 24
=> bdb_key_read - lockers3.2 24
=> bdb_key_read - lockers4 24
=> bdb_key_read - lockers5 24
<= bdb_index_read 20 candidates
<= bdb_equality_candidates id=20, first=5524, last=14848
=> bdb_equality_candidates
=> key_read
=> bdb_key_read - lockers1 24
=> bdb_key_read - lockers2 24
=> bdb_key_read - lockers3 24
=> bdb_key_read - lockers3.1 24
=> bdb_key_read - lockers3.2 24
=> bdb_key_read - lockers4 24
=> bdb_key_read - lockers5 24
<= bdb_index_read 1 candidates
<= bdb_equality_candidates id=1, first=8860, last=8860

Steven
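
To pick out where the locker count rises in a trace like the one above, something along these lines could be used. It is a sketch only: it assumes the "bdb_key_read - lockersN count" line format printed by the instrumented idl.c quoted in this comment, and the function name locker_increases is hypothetical.

```python
import re


def locker_increases(trace):
    """Return (checkpoint, before, after) for each step where the count rose."""
    increases = []
    previous = None
    for line in trace.splitlines():
        match = re.search(r"bdb_key_read - (lockers[\d.]+) (\d+)", line)
        if not match:
            continue  # not a locker checkpoint line
        label, count = match.group(1), int(match.group(2))
        if previous is not None and count > previous:
            increases.append((label, previous, count))
        previous = count
    return increases


# Excerpt from the second key_read of the objectclass=posixAccount search.
excerpt = """\
=> bdb_key_read - lockers1 19
=> bdb_key_read - lockers2 19
=> bdb_key_read - lockers3 19
=> bdb_key_read - lockers3.1 19
=> bdb_key_read - lockers3.2 19
=> bdb_key_read - lockers3.1 19
=> bdb_key_read - lockers3.2 20
=> bdb_key_read - lockers3.1 20
=> bdb_key_read - lockers3.2 21
"""
for step in locker_increases(excerpt):
    print(step)
```

In this excerpt every increase is reported at the lockers3.2 checkpoint, and only from the second loop iteration onward, which is consistent with the observation that the leak appears when c_get is called more than once.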
Comment 4 Howard Chu 2002-08-27 05:08:23 UTC
I have reproduced this bug using both BDB 4.0.14 and the pre-release of BDB 4.1,
using a database with 10,000 "person" entries. I'll send an updated notice to
support@sleepycat.com. Looks like we need to commit the workaround into our
source tree.
Comment 5 Howard Chu 2002-08-27 06:27:01 UTC
changed notes
changed state Feedback to Test
moved from Incoming to Software Bugs
Comment 6 Howard Chu 2002-08-27 13:25:17 UTC
The SleepyCat support request ID is #6520.

I did some more testing to try to find the relevant size limits. I created an
objectclass slot with 65332 entries. (A bit shy of the 65535 limit, but close
enough for this test.) With a 1MB buffer we get all 65332 entries in one
shot. I then tried a 512K buffer, assuming it would take two reads. Instead,
I got 32729 in the first read, 32602 in the next read, and 1 in the 3rd read.
With the trailing get() to make sure we got everything, this leaked 3 lockers
per search attempt. I also tried a 768K buffer, which read 49005 entries in
the first attempt, the remainder in the second attempt, and leaked 1 locker
on the last get() that returns DB_NOTFOUND. Since the number of items
returned doesn't appear to be linear with the buffer size, I committed a fix
that uses 1.25MB for the buffer, which should be able to hold the 65535 items
plus any additional overhead.

  -- Howard Chu
  Chief Architect, Symas Corp.       Director, Highland Sun
  http://www.symas.com               http://highlandsun.com/hyc
  Symas: Premier OpenSource Development and Support
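
The sizing argument in the comment above can be checked with a little arithmetic. The Python sketch below takes the entry counts reported for the 512K and 768K reads, assumes each of those reads stopped because the buffer was full, and scales the worst observed bytes-per-item ratio up to the 65535-item limit. The names observed and worst_case_bytes are illustrative only.

```python
MAX_IDS = 65535                      # slot limit mentioned in the comment above
KB = 1024
MB = 1024 * KB

# (buffer size in bytes, entries returned by a read assumed to fill the buffer)
observed = [
    (512 * KB, 32729),
    (512 * KB, 32602),
    (768 * KB, 49005),
]


def worst_case_bytes(max_ids, observations):
    """Bytes needed for max_ids items at the worst observed bytes-per-item ratio."""
    # Negated floor division is ceiling division, so the estimate rounds up.
    return max(-(-max_ids * size // count) for size, count in observations)


needed = worst_case_bytes(MAX_IDS, observed)
print("estimated bytes needed:", needed)
print("fits in 1MB:", needed <= 1 * MB)
print("fits in 1.25MB:", needed <= 5 * MB // 4)
```

Under these assumptions the estimate lands slightly above 1MB and well below 1.25MB, which agrees with the choice of a 1.25MB buffer to leave room for overhead.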

Comment 7 Howard Chu 2002-08-27 18:21:12 UTC
I have tested the suggested patch and it fixes the resource leak. Here is the
diff against db-4.0.14; a corresponding change also works for the 4.1
pre-release.

--- db_cam.c	2002/08/27 17:49:34	1.1
+++ db_cam.c	2002/08/27 17:51:42
@@ -803,6 +803,9 @@
 		 * opd if we went to another key.
 		 */
 		if (opd != NULL) {
+			if (cp_n->opd != NULL &&
+			    (ret = opd->c_close(cp_n->opd)) != 0)
+			    	goto err;
 			cp_n->opd = opd;
 			opd = NULL;
 		}
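
The idea behind this workaround, closing whatever the slot already holds before overwriting it, can be shown with a toy model. The Python sketch below is not Berkeley DB code: ToyCursor and run_passes are hypothetical stand-ins, and the only behavior modeled is "an open cursor holds one locker until it is closed".

```python
class ToyCursor:
    """Hypothetical stand-in for a cursor that holds one locker while open."""

    def __init__(self, lockers):
        self.lockers = lockers      # shared one-element list used as a counter
        self.lockers[0] += 1        # opening a cursor consumes a locker
        self.closed = False

    def close(self):
        if not self.closed:
            self.lockers[0] -= 1    # closing returns the locker
            self.closed = True


def run_passes(passes, close_before_replace):
    """Simulate `passes` retrievals that each hand back a new cursor."""
    lockers = [0]
    slot = None                     # plays the role of cp_n->opd in the diff
    for _ in range(passes):
        new_opd = ToyCursor(lockers)
        if slot is not None and close_before_replace:
            slot.close()            # the step the workaround adds
        slot = new_opd              # the old cursor is unreachable after this
    if slot is not None:
        slot.close()                # normal cleanup only sees the last cursor
    return lockers[0]               # lockers still allocated afterwards


print(run_passes(6, close_before_replace=False))
print(run_passes(6, close_before_replace=True))
```

Without the extra close, every pass after the first strands one locker in this model; with it, nothing is left allocated. A single pass leaks nothing either way, which is why a buffer large enough to hold everything in one read hides the problem.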



-----Original Message-----
From: Keith Bostic [mailto:bostic@sleepycat.com]
Sent: Tuesday, August 27, 2002 8:24 AM
To: hyc@highlandsun.com
Cc: kurt@openldap.org; support@sleepycat.com
Subject: Re: BDB 4.0.14 locker leak [#6520]


Hi, my name is Keith Bostic and I'm with Sleepycat Software.
I'll own your Support Request for now.

> We have a DB_HASH database with sorted duplicates. When retrieving key/data
> using a cursor and DB_MULTIPLE, a locker ID is leaked if we have to loop thru
> the retrieval process (because our passed in buffer is too small to contain
> all the data items in one pass).

I agree with you that this is a bug in Berkeley DB.  I have a
workaround to give you, but I'm not confident it's the correct
fix.  I will let you know as soon as we know what change
we will be making to our source tree, and that change has been
reviewed and tested.

Thanks for finding this one!

Regards,
--keith

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Keith Bostic
Sleepycat Software Inc.		bostic@sleepycat.com
118 Tower Rd.			+1-781-259-3139
Lincoln, MA 01773		http://www.sleepycat.com



Index: db_cam.c
===================================================================
RCS file: /b/CVSROOT/db/db/db_cam.c,v
retrieving revision 11.113
diff -c -r11.113 db_cam.c
*** db_cam.c	2002/08/27 15:19:53	11.113
--- db_cam.c	2002/08/27 15:20:29
***************
*** 836,841 ****
--- 836,844 ----
  		 * key.
  		 */
  		if (opd != NULL) {
+ 			if (cp_n->opd != NULL &&
+ 			    (ret = opd->c_close(cp_n->opd)) != 0)
+ 				goto err;
  			cp_n->opd = opd;
  			opd = NULL;
  		}
***************
*** 866,872 ****
  		if ((t_ret = __db_c_cleanup(
  		    dbc_arg->internal->opd, opd, ret)) != 0 && ret == 0)
  			ret = t_ret;
-
  	}

  	if ((t_ret = __db_c_cleanup(dbc_arg, dbc_n, ret)) != 0 && ret == 0)
--- 869,874 ----

Comment 8 Howard Chu 2002-09-06 01:42:50 UTC
Here's the final fix from SleepyCat:

-----Original Message-----
From: Keith Bostic [mailto:bostic@sleepycat.com]
Sent: Tuesday, September 03, 2002 10:05 AM
To: hyc@highlandsun.com
Cc: kurt@openldap.org; support@sleepycat.com
Subject: Re: BDB 4.0.14 locker leak [#6520]


Hi, my name is Keith Bostic and I'm with Sleepycat Software.

I've appended the patch we eventually decided to use to resolve
this problem, which is different from the one that I originally
sent you.

Again, thank you for finding this one!

Regards,
--keith





*** db/db_cam.c.orig	2002/08/27 15:19:53	11.113
--- db/db_cam.c	2002/09/03 15:44:46	11.114
***************
*** 813,822 ****
  		 */
  		if (dbc_n == NULL) {
  			/*
! 			 * Non-"_KEY" DB_MULTIPLE doesn't move the cursor,
! 			 * so it's safe to just use dbc_arg.
  			 */
! 			if (!(multi & DB_MULTIPLE_KEY) ||
  			    F_ISSET(dbc_arg, DBC_TRANSIENT))
  				dbc_n = dbc_arg;
  			else {
--- 813,825 ----
  		 */
  		if (dbc_n == NULL) {
  			/*
! 			 * Non-"_KEY" DB_MULTIPLE doesn't move the main cursor,
! 			 * so it's safe to just use dbc_arg, unless dbc_arg
! 			 * has an open OPD cursor whose state might need to
! 			 * be preserved.
  			 */
! 			if ((!(multi & DB_MULTIPLE_KEY) && 
! 			    dbc_arg->internal->opd == NULL) ||
  			    F_ISSET(dbc_arg, DBC_TRANSIENT))
  				dbc_n = dbc_arg;
  			else {
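
To see exactly which case the final fix changes, the old and new conditions from the diff above can be written as two predicates and compared over every input combination. This is a Python sketch: the three booleans stand for (multi & DB_MULTIPLE_KEY), dbc_arg->internal->opd != NULL, and F_ISSET(dbc_arg, DBC_TRANSIENT), and the function names are invented for the example.

```python
from itertools import product


def reuses_cursor_old(multiple_key, has_open_opd, transient):
    """Condition before the fix: !(multi & DB_MULTIPLE_KEY) || DBC_TRANSIENT."""
    return (not multiple_key) or transient


def reuses_cursor_new(multiple_key, has_open_opd, transient):
    """Condition after the fix: reuse also requires that no OPD cursor is open."""
    return ((not multiple_key) and not has_open_opd) or transient


# Enumerate every combination and keep those where the two conditions disagree.
changed = [
    combo for combo in product([False, True], repeat=3)
    if reuses_cursor_old(*combo) != reuses_cursor_new(*combo)
]
print(changed)
```

The two conditions disagree in exactly one case: a non-"_KEY" DB_MULTIPLE get on a non-transient cursor that already has an open OPD cursor. Before the fix dbc_arg was reused directly in that case; after it, the else branch is taken instead.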
Comment 9 Howard Chu 2002-09-07 05:44:56 UTC
changed notes
changed state Test to Release
Comment 10 Kurt Zeilenga 2002-09-19 13:22:23 UTC
changed notes
changed state Release to Closed
Comment 11 Howard Chu 2006-06-11 08:51:19 UTC
moved from Software Bugs to Archive.Software Bugs
Comment 12 OpenLDAP project 2014-08-01 21:06:25 UTC
Mostly a Sleepycat issue
workaround in HEAD, re21