[Date Prev][Date Next]
[Chronological]
[Thread]
[Top]
Re: (ITS#9146) incomplete initialization of sessionlog structure may cause session log to grow indefinitely
- To: openldap-its@OpenLDAP.org
- Subject: Re: (ITS#9146) incomplete initialization of sessionlog structure may cause session log to grow indefinitely
- From: hyc@symas.com
- Date: Wed, 08 Jan 2020 22:23:14 +0000
- Auto-submitted: auto-generated (OpenLDAP-ITS)
maxime.besson@worteks.com wrote:
> Full_Name: Maxime Besson
> Version: 2.4.48
> OS: Amazon Linux
> URL: ftp://ftp.openldap.org/incoming/
> Submission from: (NULL) (92.184.104.114)
Thanks for the report, fixed now in git master.
>
>
> During a cloud provider migration, I have migrated an OpenLDAP instance using
> MDB + Syncrepl (in mirror mode, with session log) from Ubuntu to Amazon Linux.
>
> When starting a server with no data, syncrepl kicks in and starts replicating
> the other server's database (12G, 7M entries).
>
> In Ubuntu this phase takes about 25 minutes (with dbnosync). In Amazon Linux,
> with the exact same cn=config and daemon options, it took 10 days on first try,
> at 100% CPU consumption the whole time.
>
> I observed that the start of the import is pretty fast (data.mdb grows by
> 10M/s) but import speed slows down rapidly to a crawl.
>
> After some investigation, I discovered that disabling the sessionlog brings the
> import times to normal.
>
> Digging further, I found that most of the CPU time is spent in
> syncprov_add_slog, and particularly this line:
>
>
> /* Keep the list in csn order. */
> ...
> for ( sep = &sl->sl_head; *sep; sep = &(*sep)->se_next ) {
>>>>>>>> if ( ber_bvcmp( &se->se_csn, &(*sep)->se_csn ) < 0 ) {
> se->se_next = *sep;
> *sep = se;
> break;
> }
> }
>
> The sessionlog appears to be growing endlessly:
>
> (gdb) print sl->sl_num
> $1 = 681896
> (gdb) print sl->sl_size
> $3 = 100
>
> Digging around the source, it seems that the session log is supposed to be
> cleaned up at the end of syncprov_add_slog, if it's not playing:
>
> if (!sl->sl_playing) {
> while ( sl->sl_num > sl->sl_size ) {
> ...
>
> However:
>
> (gdb) print sl->sl_playing
> $4 = 543388517
>
> This looked like an uninitialized value to me, and indeed, sp_cf_gen
> doesn't seem to initialize this field:
>
> if ( !sl ) {
> sl = ch_malloc( sizeof( sessionlog ));
> sl->sl_mincsn = NULL;
> sl->sl_sids = NULL;
> sl->sl_num = 0;
> sl->sl_numcsns = 0;
> sl->sl_head = sl->sl_tail = NULL;
> ldap_pvt_thread_mutex_init( &sl->sl_mutex );
> si->si_logs = sl;
> }
>
>
> I tried the following patch:
>
> --- openldap-2.4.48.orig/servers/slapd/overlays/syncprov.c 2019-07-23
> 16:46:22.000000000 +0200
> +++ openldap-2.4.48/servers/slapd/overlays/syncprov.c 2020-01-08
> 11:33:16.770110282 +0100
> @@ -3082,6 +3082,7 @@
> sl = ch_malloc( sizeof( sessionlog ));
> sl->sl_mincsn = NULL;
> sl->sl_sids = NULL;
> + sl->sl_playing = 0;
> sl->sl_num = 0;
> sl->sl_numcsns = 0;
> sl->sl_head = sl->sl_tail = NULL;
>
>
> And it got rid of the issue. My guess is that most of the time, sl_playing
> happens to be 0, but on Amazon Linux, for some reason (patched glibc?), the
> sessionlog lands in the wrong chunk of uninitilized memory
>
> Additional info:
>
> # rpm -q glibc
> glibc-2.26-32.amzn2.0.2.x86_64
>
> # uname -a
> Linux 4.14.152-127.182.amzn2.x86_64 #1 SMP Thu Nov 14 17:32:43 UTC 2019 x86_64
> x86_64 x86_64 GNU/Linux
>
> # /usr/local/openldap/libexec/slapd -V
> @(#) $OpenLDAP: slapd 2.4.48 (Jan 8 2020 11:39:38) $
>
> (LTB build)
>
>
>
>
>
--
-- Howard Chu
CTO, Symas Corp. http://www.symas.com
Director, Highland Sun http://highlandsun.com/hyc/
Chief Architect, OpenLDAP http://www.openldap.org/project/