Issue 7735 - Memory leaks on Multi-Master replications
Summary: Memory leaks on Multi-Master replications
Status: VERIFIED FIXED
Alias: None
Product: OpenLDAP
Classification: Unclassified
Component: slapd
Version: 2.4.37
Hardware: All All
Importance: --- normal
Target Milestone: ---
Assignee: OpenLDAP project
Reported: 2013-10-29 10:45 UTC by dmitrii.fonariuk@gmail.com
Modified: 2014-08-01 21:04 UTC

Attachments
valg.txt.zip (49.79 KB, application/zip)
2013-11-01 06:00 UTC, dmitrii.fonariuk@gmail.com
MMR.zip (137.59 KB, application/zip)
2013-10-31 08:33 UTC, dmitrii.fonariuk@gmail.com

Description dmitrii.fonariuk@gmail.com 2013-10-29 10:45:52 UTC
Full_Name: Dmitrii Fonariuk
Version: 2.4.37
OS: RHEL6.x86_64
URL: ftp://ftp.openldap.org/incoming/
Submission from: (NULL) (91.210.4.1)


Hi.
I have 2 servers with OpenLDAP 2.4.37 in Multi-Master replication.
If I send modify requests to only one server, there are no memory leaks. If I
send different modify requests to both servers, I see memory leaks on both
servers.

Here are the settings of one of the servers:
dn: olcDatabase={1}mdb,cn=config
objectClass: olcDatabaseConfig
objectClass: olcConfig
objectClass: top
objectClass: olcMdbConfig
olcDatabase: {1}mdb
olcDbDirectory: /opt/local/OpenLDAP/data/mdb3
olcDbEnvFlags: writemap
olcDbEnvFlags: mapasync
olcDbIndex: entryUUID eq
olcDbIndex: entryCSN eq
olcDbIndex: objectClass eq
olcDbMaxSize: 5368709120
olcLimits: {0}dn.exact="uid=replica1,uid=admin,dc=rrr" time.soft=unlimited time.hard=unlimited size.soft=unlimited size.hard=unlimited
olcMirrorMode: TRUE
olcRootDN: dc=rrr
olcRootPW: secret
olcSuffix: dc=rrr
olcSyncrepl: {0}rid=010 provider=ldap://srv-rrr03:3903 binddn="uid=replica1,uid=admin,dc=rrr" bindmethod=simple credentials=replica1 searchbase="dc=rrr" schemachecking=off type=refreshAndPersist retry="5 5 60 +"

dn: olcOverlay={0}syncprov,olcDatabase={1}mdb,cn=config
objectClass: olcSyncProvConfig
objectClass: olcOverlayConfig
objectClass: olcConfig
objectClass: top
olcOverlay: {0}syncprov
olcSpCheckpoint: 1000 5

Requests on one server:
dn: id=1000000000[10000-30000],ou=subscriber,o=Megatron,dc=rrr
changetype: modify
replace: state
state: [1-9]

no memory leaks

Requests on both servers:
1:

dn: id=1000000000[10000-20000],ou=subscriber,o=Megatron,dc=rrr
changetype: modify
replace: state
state: [1-9]

2:

dn: id=1000000000[20001-30000],ou=subscriber,o=Megatron,dc=rrr
changetype: modify
replace: state
state: [1-9]

memory leaks

* Square brackets denote ranges.

The DN ranges do not overlap between the two servers. state is a
numericString attribute.
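
For anyone trying to reproduce the load pattern above, here is a minimal Python sketch that expands the bracketed ranges into LDIF modify records. It assumes the bracket notation means the number is appended to the prefix 1000000000 and that state is picked at random from 1 to 9. The function name modify_ldif and its parameters are illustrative; they are not taken from the reporter's load script, which is not part of this report.

```python
import random


def modify_ldif(prefix, first, last,
                base="ou=subscriber,o=Megatron,dc=rrr", rng=None):
    """Yield one LDIF modify record per id in [first, last].

    Assumption: the DN is built by appending the number to `prefix`.
    """
    if rng is None:
        rng = random.Random()
    for n in range(first, last + 1):
        yield (
            f"dn: id={prefix}{n},{base}\n"
            "changetype: modify\n"
            "replace: state\n"
            f"state: {rng.randint(1, 9)}\n"
        )


if __name__ == "__main__":
    # Records are separated by a blank line, as LDIF requires.
    server1 = "\n".join(modify_ldif("1000000000", 10000, 10002))
    print(server1)
```

Feeding the records for 10000-20000 to one server and 20001-30000 to the other (for example through ldapmodify) would correspond to the two-server case described above.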
Comment 1 Quanah Gibson-Mount 2013-10-29 15:27:45 UTC
--On Tuesday, October 29, 2013 10:45 AM +0000 dmitrii.fonariuk@gmail.com 
wrote:

> Full_Name: Dmitrii Fonariuk
> Version: 2.4.37
> OS: RHEL6.x86_64
> URL: ftp://ftp.openldap.org/incoming/
> Submission from: (NULL) (91.210.4.1)
>> memory leaks
>
> * in square brackets ranges.
>
> Ranges for DN do not overlap for different servers. State is numericString
> attribute.

How are you determining there is a memory leak?  Are you using an 
alternative memory allocator like tcmalloc instead of glibc?

--Quanah


--

Quanah Gibson-Mount
Architect - Server
Zimbra, Inc.
--------------------
Zimbra ::  the leader in open source messaging and collaboration

Comment 2 dmitrii.fonariuk@gmail.com 2013-10-30 05:32:26 UTC
Hi.
Visually. After two days under a load of 3K modifies per second, memory
consumption increased from 200 MB to 16 GB. After that, the system killed the
process.
When modifies are sent to only one server, slapd uses 200 MB of memory and its
consumption does not increase.
No, I do not use an alternative memory allocator.

best regards,
Dmitrii
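
The growth described here can be tracked with numbers rather than by eye. A minimal sketch, assuming a Linux host where /proc/<pid>/status exposes a VmRSS line; the helper names and the sampling interval are illustrative only and were not part of this report.

```python
import re
import time


def parse_vmrss_kb(status_text):
    """Return the VmRSS value in kB from /proc/<pid>/status text, or None."""
    m = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    return int(m.group(1)) if m else None


def sample_rss(pid, interval_s=60, samples=10):
    """Read the resident set size of `pid` `samples` times (Linux only)."""
    readings = []
    for _ in range(samples):
        with open(f"/proc/{pid}/status") as f:
            readings.append(parse_vmrss_kb(f.read()))
        time.sleep(interval_s)
    return readings


def growth_kb_per_sample(readings):
    """Average change in kB between consecutive samples."""
    return (readings[-1] - readings[0]) / (len(readings) - 1)
```

A steadily positive growth_kb_per_sample under constant load, on both servers, would match the behaviour reported; a value near zero would match the single-server case.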


Comment 3 Quanah Gibson-Mount 2013-10-30 05:45:01 UTC

> On Oct 29, 2013, at 10:32 PM, Dmitrii Fonariuk <dmitrii.fonariuk@gmail.com> wrote:
> 
> hi.
> visually. two days of use 3K load modify per second memory consumption increased from 200 Mb to 16 Gb. After that, the system killed the process.
> when modify send only to one server use 200 Mb memory and its consumption does not increase.
> no I do not use an alternative memory allocator.
> 
> best regards,
> Dmitrii

I strongly advise you use tcmalloc and report back. 

--Quanah



Comment 4 dmitrii.fonariuk@gmail.com 2013-10-30 07:48:38 UTC
I used libtcmalloc_minimal.so.4.1.2:
- added export LD_PRELOAD=/opt/local/tcmalloc/lib/libtcmalloc_minimal.so to my
  run script
- ran slapd and checked pmap <pid> to make sure that tcmalloc was in use
- ran the load script: 3K mps (modifies per second) across two servers
  (1.5K mps per server)
After 30 minutes of load, memory consumption had increased from 366 MB to
660 MB and was still rising.

libtcmalloc did not help.
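
The pmap check in the steps above can also be done by reading /proc/<pid>/maps directly. A minimal sketch, assuming Linux; the function names are hypothetical, and the only thing it tests is whether a file whose name contains "tcmalloc" is mapped into the process.

```python
def mapped_files(maps_text):
    """Return the set of file paths mapped in /proc/<pid>/maps text."""
    paths = set()
    for line in maps_text.splitlines():
        # Fields: address, perms, offset, dev, inode, [pathname]
        fields = line.split(None, 5)
        if len(fields) == 6 and fields[5].startswith("/"):
            paths.add(fields[5].strip())
    return paths


def uses_tcmalloc(maps_text):
    """True if any mapped file name contains 'tcmalloc'."""
    return any("tcmalloc" in p.rsplit("/", 1)[-1]
               for p in mapped_files(maps_text))


def pid_uses_tcmalloc(pid):
    """Check a running process by pid (Linux only)."""
    with open(f"/proc/{pid}/maps") as f:
        return uses_tcmalloc(f.read())
```

This only confirms that the preload took effect; it says nothing about whether memory is leaking.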



Comment 5 Quanah Gibson-Mount 2013-10-30 16:25:28 UTC
--On Wednesday, October 30, 2013 11:48 AM +0400 Dmitrii Fonariuk 
<dmitrii.fonariuk@gmail.com> wrote:

>
> use libtcmalloc_minimal.so.4.1.2
> add export LD_PRELOAD=/opt/local/tcmalloc/lib/libtcmalloc_minimal.so to
> my run script
> run slapd and view pmap pid to make sure that tcmalloc are using
> run load script - 3K mps on two servers (1.5K mps per server)
> load for 30 minutes memory consumption increased from 366 Mb to 660 Mb
> and rising.

Ok.  You will need to provide your full configuration for each server, 
minus passwords, as well as sample data for this to be pursued.

You may also want to run slapd under oprofile (without tcmalloc) to show 
the existence of any leaks.

--Quanah


Comment 6 dmitrii.fonariuk@gmail.com 2013-10-31 08:33:49 UTC
Hi.

I am sending the full configuration, data examples, oprofile reports, pmap
output and a valgrind report (which may be useful).
If you need more info, let me know.

--Dmitrii



Comment 7 Quanah Gibson-Mount 2013-11-01 01:00:53 UTC
--On Thursday, October 31, 2013 12:33 PM +0400 Dmitrii Fonariuk 
<dmitrii.fonariuk@gmail.com> wrote:

>
>
> Hi.
>
>
> I am sending full configuration, data examples, oprofile reports, pmap
> and valgrind report (may be useful).
> if you need some more info let me know.

Hi Dmitrii,

Your slapd configs have a serious problem.  You have syncrepl configured 
on cn=config, but the server1 and server2 config.ldifs have different mdb 
configs with the same cn=config syncrepl config, which means that when 
they're actually running, one is going to overwrite the other's mdb 
olcDatabase entry.

Also, your slapd is stripped.  Please make sure to compile with debugging 
symbols (-g), no optimization (-O0), and when you do make install, do make 
install STRIP="" so the resulting binaries are not stripped.

After removing syncrepl from cn=config, if you can still reproduce a memory 
leak, then please provide the valgrind trace with the unstripped slapd with 
debugging symbols.
Please provide the script(s) you are using as well.

Thanks!

--Quanah


Comment 8 dmitrii.fonariuk@gmail.com 2013-11-01 06:00:00 UTC
Hi Quanah.

I have syncrepl configured on cn=config only to replicate cn=schema
(searchbase="cn=schema,cn=config"), not all of cn=config.
OK, I have removed syncrepl from cn=config, but nothing has changed.
I reinstalled slapd and the binaries are now not stripped. I am sending you a
new valgrind report.

--Dmitrii
Comment 9 Quanah Gibson-Mount 2013-11-01 16:42:15 UTC
--On Friday, November 01, 2013 10:00 AM +0400 Dmitrii Fonariuk 
<dmitrii.fonariuk@gmail.com> wrote:

>
>
> Hi Quanah.
>
>
> I have syncrepl configured on cn=config only for replication cn=schema
> (searchbase="cn=schema,cn=config") not for all cn=config.
> ok. i has remove syncrepl from cn=config, but nothing has changed.
> I reinstall slapd and now binaries are not stripped. Send to you new
> valgrind report.

Thanks.  However, you still failed to provide the script you are running, 
which is needed to see if we can reproduce the issue locally.

The valgrind output shows a leak, but the source code disagrees, in that we 
never alloc that variable without freeing its old contents first and 
there's only one instance of the variable. At worst it could be a one-time 
leak, but valgrind is claiming multiple occurrences.

So we really need to be able to try and reproduce your issue.

Regards,
Quanah


Comment 10 Quanah Gibson-Mount 2013-11-01 18:14:19 UTC
--On Friday, November 01, 2013 4:43 PM +0000 quanah@zimbra.com wrote:

> So we really need to be able to try and reproduce your issue.

Hi Dmitrii,

After further examination, we believe the root cause of the issue has been 
identified.  Please rebuild your source with this patch:

<http://www.openldap.org/devel/gitweb.cgi?p=openldap.git;a=patch;h=0645878d5dc4c08b7fd8d5bf39a166d74650f15b>

and report back if it resolves the memory leak.

Thanks!

--Quanah


--

Quanah Gibson-Mount
Architect - Server
Zimbra, Inc.
--------------------
Zimbra ::  the leader in open source messaging and collaboration

Comment 11 dmitrii.fonariuk@gmail.com 2013-11-01 18:39:55 UTC
hi Quanah,

That's good news! Unfortunately I can't test the changes now. On Tuesday I
will rebuild the source with the patch, run the tests and report back to you.
Thank you for your efforts.

-- Dmitrii
Comment 12 Quanah Gibson-Mount 2013-11-03 01:07:20 UTC
--On Friday, November 01, 2013 10:39 PM +0400 Dmitrii Fonariuk 
<dmitrii.fonariuk@gmail.com> wrote:

>
>
> hi Quanah,
>
> it's a good news! unfortunately I can't test the changes now. Tuesday I
> rebuild the source with patch, do the tests and report to you.
>
> thank you for your efforts.

You will also need to grab 
<http://www.openldap.org/devel/gitweb.cgi?p=openldap.git;a=patch;h=ed092229634baf8c5a9f554ea664a2468d6f5a6a>

--Quanah



Comment 13 Howard Chu 2013-11-05 07:01:48 UTC
changed notes
changed state Open to Test
moved from Incoming to Software Bugs
Comment 14 dmitrii.fonariuk@gmail.com 2013-11-05 07:12:10 UTC
Hi Quanah,


I applied the two patches, compiled the binary and ran the modify request
test. The memory is no longer leaking. Great! The patches helped.
Excellent work!


Thanks!


--Dmitrii
Comment 15 Quanah Gibson-Mount 2013-11-05 11:19:07 UTC
changed notes
changed state Test to Release
Comment 16 Quanah Gibson-Mount 2013-11-18 10:10:47 UTC
changed notes
changed state Release to Closed
Comment 17 OpenLDAP project 2014-08-01 21:04:50 UTC
fixed in master
fixed in RE25
fixed in RE24