Issue 8566 - OpenLDAP client API SASL auth memory leak
Summary: OpenLDAP client API SASL auth memory leak
Status: VERIFIED FIXED
Alias: None
Product: OpenLDAP
Classification: Unclassified
Component: slapd (show other issues)
Version: 2.4.44
Hardware: All All
: --- normal
Target Milestone: ---
Assignee: OpenLDAP project
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2017-01-11 21:05 UTC by william.b.clay@acm.org
Modified: 2017-01-13 09:55 UTC (History)
0 users

See Also:


Attachments

Note You need to log in before you can comment on or make changes to this issue.
Description william.b.clay@acm.org 2017-01-11 21:05:07 UTC
Full_Name: Bill Clay
Version: 2.4.44
OS: Debian/GNU Linux 7.8 (Wheezy)
URL: 
Submission from: (NULL) (79.12.44.250)


Valgrind runs of a testbed script driving an OpenLDAP API client module I am
developing appear to show a consistent, reproducible memory leak ("lost memory")
when using the SASL default authentication mech (DIGEST-MD5 for my system) over
any underlying transport: ldapi://, ldap://, ldaps://.  The first two transports
show identical symptoms with or without startTLS prior to authentication.

Additional test and symptom details:

* Using the same testbed and client module and either SASL mech EXTERNAL or LDAP
simple bind, valgrind indicates no lost memory.

* Symptoms are constant with or without proxy authz (i.e., a SASL interactive
callback SASL_CB_USER response).

* I have not configured or tested other SASL mechs in this environment.

* The FIRST SASL authentication of a process does NOT show a memory leak.

* Each SASL authentication of a process AFTER the first shows a one additional
realloc leak of the same size (500-600 bytes depending on bind details).

* The iterative test is: [ldap_initialize()], [ldap_start_tls_s()],
ldap_sasl_interactive_bind_s(), [ldap_search_ext_s()], [ldap_whoami_s()],
[ldap_unbind_ext_s()], where [] indicates calls whose omission yields no change
in symptoms (except initialize is always called for the first iteration of a
sequence and after an unbind; unbind is always called after the last
iteration).

Environment:

Debian 7 Wheezy

OpenLDAP v. 2.4.44 original (not Debian) source custom build: 
./configure --sysconfdir=/etc --localstedirir=/ \
 --disable-backends --enable-mdb --enable-monitor \
 --enable-crypt --enable-cleartext \
 --with-cyrus-sasl --enable-spasswd --enable-syslog --enable-local \
 --disable-overlays --enable-memberof --enable-refint --enable-unique \
 --disable-modules --with-tls --with-threads --with-gnu-ld

Sample valgrind output (the call stack is always the same, except for exact
addresses):

bill@fuji:/usr/local/src/liquid_feedback_frontend-v3.2.1/lib/mldap$
LD_PRELOAD=/usr/local/src/liquid_feedback_frontend-v3.2.1/lib/mldap/mldap.so
valgrind --leak-check=full /usr/local/src/altit-sso/lf-ldap/mldap_full_test.lua
==4149== Memcheck, a memory error detector
==4149== Copyright (C) 2002-2011, and GNU GPL'd, by Julian Seward et al.
==4149== Using Valgrind-3.7.0 and LibVEX; rerun with -h for copyright info
==4149== Command: /usr/local/src/altit-sso/lf-ldap/mldap_full_test.lua
==4149== 
create cctx regkey 556aeaf3c864af2e_mldap_connection, cctx=0699c028, stack 0
1	1	SASL bind uid %2p%p/w
enter mldap_bind(), regkey=556aeaf3c864af2e_mldap_connection, cctx=0x699c028
mldap_bind call ldap_initialize("ldap://fuji.pvt.suresys.com")
mldap_bind after ldap_bind(0x699cac0), cctx=0x699c028
	DIGEST-MD5,CRAM-MD5,NTLM
	dn:uid=jdoe,ou=persone,dc=altraitalia,dc=test

< snip - 8 iterations removed >

10	1	SASL bind uid + p/w
enter mldap_bind(), regkey=556aeaf3c864af2e_mldap_connection, cctx=0x699c028
mldap_bind after ldap_bind(0x699cac0), cctx=0x699c028
	DIGEST-MD5,CRAM-MD5,LMLM
	dn:uid=jdoe,ou=persone,dc=altraitalia,dc=test

unbind cctx->ldp=0x699cac0
exit unbind
normal termination, 10 iterations
enter mldap_gc(); cctx->ldp=0
exit mldap_gc()
==4149== 
==4149== HEAP SUMMARY:
==4149==     in use at exit: 45,614 bytes in 334 blocks
==4149==   total heap usage: 3,543 allocs, 3,209 frees, 169,733,045 bytes
allocated
==4149== 
==4149== 4,896 bytes in 9 blocks are definitely lost in loss record 122 of 124
==4149==    at 0x4C28CCE: realloc (vg_replace_malloc.c:632)
==4149==    by 0x7712426: _plug_buf_alloc (in
/usr/lib/x86_64-linux-gnu/sasl2/libdigestmd5.so.2.0.25)
==4149==    by 0x770C232: add_to_challenge (in
/usr/lib/x86_64-linux-gnu/sasl2/libdigestmd5.so.2.0.25)
==4149==    by 0x770E689: make_client_response (in
/r%r/lib/x86_64-linux-gnu/sasl2/libdigestmd5.so.2.0.25)
==4149==    by 0x770EC97: digestmd5_client_mech_step (in
/usr/lib/x86_64-linux-gnu/sasl2/libdigestmd5.so.2.0.25)
==4149==    by 0x5CD03AD: sasl_client_step (in
/usr/lib/x86_64-linux-gnu/libsasl2.so.2.0.25)
==4149==    by 0x5CD08DA: sasl_client_start (in
/usr/lib/x86_64-linux-gnu/libsasl2.so.2.0.25)
==4149==    by 0x405386E: ldap_int_sasl_bind (cyrus.c:510)
==4149==    by 0x4056E5F: ldap_sasl_interactive_bind (sasl.c:487)
==4149==    by 0x405702B: ldap_sasl_interactive_bind_s (sasl.c:521)
==4149==    by 0x4027663: mldap_bind (mldap.c:647)
==4149==    by 0x408C31: luaD_precall (in /usr/bin/lua5.2)
==4149== 
==4149== LEAK SUMMARY:
==4149==    definitely lost: 4,896 bytes in 9 blocks
==4149==    indirectly lost: 0 bytes in 0 blocks
==4149==      possibly lost: 0 bytes in 0 blocks
==4149==    still reachable: 40,718 bytes in 325 blocks
==4149==         suppressed: 0 bytes in 0 blocks
==4149== Reachable blocks (those to which a pointer was found) are not shown.D%D
==4149== To see them, rerun with: --leak-check=full --show-reachable=yes
==4149== 
==4149== For counts of detected and suppressed errors, rerun with: -v
==4149== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 31 from 7)
bill@fuji:/usr/local/src/liquid_feedback_frontend-v3.2.1/lib/mldap$ 
Comment 1 william.b.clay@acm.org 2017-01-12 07:38:58 UTC
The 4th and 5th bullet points under "symptom details" of my original 
ticket text are not supported by the evidence.  Corrected replacements:

* Valgrind does NOT show lost memory after a ONE SASL authentication by 
a process.

* Valgrind shows N-1 lost memory blocks after N multiple SASL 
authentications by a process. Given the same credentials and mech, each 
such block is the same size, ca. 500-600 bytes.

On reflection, it is probably the LAST authentication that frees the 
memory block in question, although my testing seems to indicate it is 
NOT unbind() that does so; perhaps some sort of process termination 
callback in OpenLDAP, SASL, or MD5 library code?

Sorry about the initial inaccuracy.

Comment 2 Howard Chu 2017-01-12 13:19:01 UTC
william.b.clay@acm.org wrote:
> Full_Name: Bill Clay
> Version: 2.4.44
> OS: Debian/GNU Linux 7.8 (Wheezy)
> URL:
> Submission from: (NULL) (79.12.44.250)

> ==4149== 4,896 bytes in 9 blocks are definitely lost in loss record 122 of 124
> ==4149==    at 0x4C28CCE: realloc (vg_replace_malloc.c:632)
> ==4149==    by 0x7712426: _plug_buf_alloc (in
> /usr/lib/x86_64-linux-gnu/sasl2/libdigestmd5.so.2.0.25)
> ==4149==    by 0x770C232: add_to_challenge (in
> /usr/lib/x86_64-linux-gnu/sasl2/libdigestmd5.so.2.0.25)
> ==4149==    by 0x770E689: make_client_response (in
> /r%r/lib/x86_64-linux-gnu/sasl2/libdigestmd5.so.2.0.25)
> ==4149==    by 0x770EC97: digestmd5_client_mech_step (in
> /usr/lib/x86_64-linux-gnu/sasl2/libdigestmd5.so.2.0.25)

This is very clearly a leak in the digestmd5.so module. It seems you should be 
reporting this to Cyrus SASL, not us.

> ==4149==    by 0x5CD03AD: sasl_client_step (in
> /usr/lib/x86_64-linux-gnu/libsasl2.so.2.0.25)
> ==4149==    by 0x5CD08DA: sasl_client_start (in
> /usr/lib/x86_64-linux-gnu/libsasl2.so.2.0.25)
> ==4149==    by 0x405386E: ldap_int_sasl_bind (cyrus.c:510)
> ==4149==    by 0x4056E5F: ldap_sasl_interactive_bind (sasl.c:487)
> ==4149==    by 0x405702B: ldap_sasl_interactive_bind_s (sasl.c:521)
> ==4149==    by 0x4027663: mldap_bind (mldap.c:647)
> ==4149==    by 0x408C31: luaD_precall (in /usr/bin/lua5.2)
> ==4149==
> ==4149== LEAK SUMMARY:
> ==4149==    definitely lost: 4,896 bytes in 9 blocks
> ==4149==    indirectly lost: 0 bytes in 0 blocks
> ==4149==      possibly lost: 0 bytes in 0 blocks
> ==4149==    still reachable: 40,718 bytes in 325 blocks
> ==4149==         suppressed: 0 bytes in 0 blocks
> ==4149== Reachable blocks (those to which a pointer was found) are not shown.D%D
> ==4149== To see them, rerun with: --leak-check=full --show-reachable=yes
> ==4149==
> ==4149== For counts of detected and suppressed errors, rerun with: -v
> ==4149== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 31 from 7)
> bill@fuji:/usr/local/src/liquid_feedback_frontend-v3.2.1/lib/mldap$
>
>


-- 
   -- Howard Chu
   CTO, Symas Corp.           http://www.symas.com
   Director, Highland Sun     http://highlandsun.com/hyc/
   Chief Architect, OpenLDAP  http://www.openldap.org/project/

Comment 3 OpenLDAP project 2017-01-12 16:04:05 UTC
Not an openldap bug
Comment 4 Quanah Gibson-Mount 2017-01-12 16:04:05 UTC
changed notes
changed state Open to Closed
Comment 5 william.b.clay@acm.org 2017-01-12 23:11:31 UTC
Greetings, SASL developers.  I recognize the version of SASL2 I'm using 
is long in tooth, but looking at the code, I believe a memory leak I've 
encountered is still present in 2.1.26 (latest source I've seen).

The problem from an OpenLDAP client viewpoint is described in detail at:

http://www.OpenLDAP.org/its/index.cgi?findid=8566

digestmd5.c sasl_client_start()/sasl_client_step(), when called for a 
new SASL DIGEST-MD5 authentication each time after the first such case, 
appear to abandon and re-allocate from scratch (without freeing) a 
[con]text->out_buf allocated and expanded during the previous 
authentication cycle by _plug_buf_alloc() on behalf of add_to_challenge().

In my case, each DIGEST-MD5 authentication after the first leaks 500-600 
bytes, regardless of whether sasl_dispose() is called between successive 
authentications.

I suspect, but have not proven, that this is because 
"text->out_buf=NULL" appears twice in digestmd5.c, in both 
make_client_response() and digestmd5_server_mech_step1().  If both 
instances were executed for one authentication cycle, it could produce 
the memory leak in question.

The latter instance (in digestmd5_server_mech_step1()) might need to 
free any block addressed by the pointer before nullifying it.  Sorry I 
can't provide a patch or stronger evidence, but the logic here is a bit 
complex for a casual onlooker to tackle.

Thanks for your efforts,
Bill Clay


Comment 6 william.b.clay@acm.org 2017-01-13 09:40:12 UTC
Per the following, reportedly fixed in cyrus-sasl versions post-2.1.25.

https://github.com/cyrusimap/cyrus-sasl/issues/370

Comment 7 Howard Chu 2017-01-13 09:55:11 UTC
william.b.clay@acm.org wrote:
> Per the following, reportedly fixed in cyrus-sasl versions post-2.1.25.
>
> https://github.com/cyrusimap/cyrus-sasl/issues/370
>
>
>
>
Great, closing this ITS.

-- 
   -- Howard Chu
   CTO, Symas Corp.           http://www.symas.com
   Director, Highland Sun     http://highlandsun.com/hyc/
   Chief Architect, OpenLDAP  http://www.openldap.org/project/