Re: Overlays: freeing resources on a hijacked Operation



David Hawes wrote:

 I have recently finished an overlay that I call 'addpartial' that
 intercepts an add operation, determines if the entry to be added
 exists, and if so, determines if anything has changed and modifies
 the appropriate attributes. If the entry does not exist, the add
 proceeds as expected. If the entry does exist yet is identical to
 the existing entry, the add will silently return LDAP_SUCCESS. The
 intent of this overlay is to do partial record replication from a
 master to its slaves, while upstream replication data (from our
 Registry, in this case) can be full records. Before this overlay, we
 were doing deletes and adds for each entry changed in our Registry
 (ugh!), so we were effectively slowing down our directories. The
 overlay can do a steady 400-500 records compared per second on
 entries with approximately 20-40 attributes (very rough numbers). I
 know other people faced with this problem effectively diff entries at
 another level--this overlay simply does that at the LDAP level. I
 provide this information in the hope that others will find this
 useful. I'd love to polish it up and contribute it, if so.

Sounds good. This is exactly the behavior that I want to add to the syncrepl consumer, which currently does a full Delete/Add on modifications.


 My apologies for that long-winded description. Now on to my
 question.

 After the overlay has finished doing what it needs to do, it sends
 LDAP_SUCCESS to the client and then returns. If I return
 LDAP_SUCCESS, my server runs out of memory at around 1 million
 entries (starts eating up swap and then segfaults). However, if I
 return SLAP_CB_CONTINUE, the server does not run out of memory
 (tested up to 20 million entries). I assume that something in the
 original add Operation needs to be freed, but I am not sure what.
 Could anyone point me in the right direction for this case? The
 overlay works fine when I return SLAP_CB_CONTINUE, but there are some
 misleading errors left in the logs (68: entry already exists) as it
 tries to complete the original operation.

 I appreciate any help or guidance for this. Also, I have to say I
 really love the concept of overlays.

Looking at the current HEAD code, I don't see anything obvious. Since you didn't mention what version you're working with, we have no point of reference from which to start guiding you. Without knowing what your code looks like, there's no way for us to tell what is being leaked, and guessing won't be very helpful.


I suggest you run your code under a malloc debugger and let it tell you exactly what is being leaked. valgrind on Linux is really good for this sort of thing. If you're using Solaris, you can use my FunctionCheck 1.5.3 package: http://highlandsun.com/hyc/fncchk153.tgz

FunctionCheck works on Linux also, of course. Which to use depends on the problem you're chasing; valgrind can work with your existing binaries but since it runs your code using a CPU emulator, it is hundreds of times slower than normal execution. FunctionCheck requires you to recompile your code specially with gcc, but it runs on the real CPU so it's only marginally slower than normal. When the problem you're chasing can be reproduced quickly and easily, valgrind is probably the easier approach. When the problem occurs rarely, and requires a lot of operations to reproduce, it may be faster overall to use FunctionCheck.
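
To make the kind of report you would be looking for concrete, here is a toy sketch of a handler that leaks on only one of its return paths. It is not OpenLDAP code and is not a guess at what your overlay does: handle_add, dup_entry, release_entry, the RC_* codes, and the live_allocs counter are all hypothetical names invented for illustration.

```c
/* leakdemo.c: toy model of a leak on one return path.
 * Everything here is hypothetical; none of it is OpenLDAP API. */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

enum { RC_SUCCESS = 0, RC_CONTINUE = 1 };   /* made-up return codes */

static long live_allocs = 0;                /* allocations not yet freed */

/* Copy a DN string onto the heap and count the allocation. */
static char *dup_entry(const char *dn)
{
    char *copy = malloc(strlen(dn) + 1);
    if (copy == NULL)
        return NULL;
    strcpy(copy, dn);
    live_allocs++;
    return copy;
}

/* Free a copy made by dup_entry and update the counter. */
static void release_entry(char *e)
{
    if (e != NULL) {
        free(e);
        live_allocs--;
    }
}

/* Hypothetical handler: the short-circuit path returns early and
 * never releases its working copy; the other path cleans up. */
static int handle_add(const char *dn, int short_circuit)
{
    char *e = dup_entry(dn);
    if (e == NULL)
        return -1;
    if (short_circuit)
        return RC_SUCCESS;      /* deliberate leak: e is never released */
    release_entry(e);
    return RC_CONTINUE;
}

#ifdef LEAKDEMO_MAIN
int main(void)
{
    int i;

    for (i = 0; i < 1000; i++)
        handle_add("cn=test,dc=example,dc=com", 0);
    assert(live_allocs == 0);   /* clean path frees everything */

    for (i = 0; i < 1000; i++)
        handle_add("cn=test,dc=example,dc=com", 1);
    printf("unfreed allocations: %ld\n", live_allocs);
    return 0;
}
#endif
```

Built with "gcc -g -DLEAKDEMO_MAIN leakdemo.c -o leakdemo" and run as "valgrind --leak-check=full ./leakdemo", the leak report should point at the malloc call in dup_entry along with the call chain that reached it. That call chain is the information needed to find the real leak in your own code.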

 -- Howard Chu
 Chief Architect, Symas Corp.       Director, Highland Sun
 http://www.symas.com               http://highlandsun.com/hyc
 Symas: Premier OpenSource Development and Support