Issue 9287 - ldap_create cause 10s delay in some scenario
Summary: ldap_create cause 10s delay in some scenario
Status: VERIFIED FIXED
Alias: None
Product: OpenLDAP
Classification: Unclassified
Component: libraries
Version: 2.4.44
Hardware: All All
Importance: --- normal
Target Milestone: 2.4.51
Assignee: OpenLDAP project
Reported: 2020-07-09 03:01 UTC by happy mu
Modified: 2020-08-19 16:24 UTC
Description happy mu 2020-07-09 03:01:38 UTC
Hi LDAP guys,

We have hit an issue in our product while integrating IPv6.

Scenario: IPv4 internal + IPv6 external.
After an LDAP search finishes, we need to build the LDAP message with the ldap_create function. Sometimes the ldap_create call causes a 10 s delay.

For example: the LDAP search has 4 ocs, and only the first oc's ldap_create call incurs the 10 s delay; the following ones complete within 0.0001 s.
We suspect this may be because a global mutex is locked by another thread.

Could you help with this?

Logs:

There is a 10 s delay between the last two lines below, and only ldap_create is executed between them.

2020/07/08-16:32:11.942114-[INFO] 13 (DBL) [DBL_LlbSearchResponse:76]LLB replied successfully with [4] entries
2020/07/08-16:32:11.942181- [INFO] 13 (DBL) [DBL_LlbSearchResponse:125]Entry Buffer[0] : l_entryBerBufferPtr="0x64820136xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
2020/07/08-16:32:21.957488 [INFO] 13 (DBL) [DBL_LlbSearchResponse:138]Successfully created LDAP dummy handle
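
The last two timestamps above are about 10.015 s apart. To confirm which call inside that window is slow, one option is to time the suspect call directly. The following is a minimal sketch only: `time_call` and `suspect_call` are hypothetical names, and `suspect_call` merely sleeps for 50 ms as a stand-in for the real ldap_create() call.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <time.h>

/* Run fn(arg) and return the elapsed wall-clock time in seconds.
 * CLOCK_MONOTONIC is used so that system clock adjustments do not
 * distort the measurement. */
static double time_call(void (*fn)(void *), void *arg)
{
    struct timespec start, end;

    clock_gettime(CLOCK_MONOTONIC, &start);
    fn(arg);
    clock_gettime(CLOCK_MONOTONIC, &end);

    return (double)(end.tv_sec - start.tv_sec)
         + (double)(end.tv_nsec - start.tv_nsec) / 1e9;
}

/* Hypothetical stand-in for the suspect call (ldap_create() in this
 * report). It only sleeps for 50 ms. */
static void suspect_call(void *arg)
{
    struct timespec nap = { 0, 50 * 1000 * 1000 };

    (void)arg;
    nanosleep(&nap, NULL);
}
```

Calling `time_call(suspect_call, NULL)` and logging the result for each entry would show whether only the first call is slow, as described above.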
Comment 2 Quanah Gibson-Mount 2020-07-14 15:43:38 UTC
This sounds like a DNS issue rather than a mutex issue.

If it is a mutex issue, you should be able to provide a stack trace demonstrating that case.
Comment 3 happy mu 2020-07-15 03:12:27 UTC
Hi Quanah,

Yes, there is an issue in the LDAP DNS lookup.

Currently we use *result=gethostbyname_r in the function ldap_pvt_gethostbyname_a.
But gethostbyname_r does not support IPv6 and is deprecated; getaddrinfo() should be used instead.

Will OpenLDAP fix this issue? If yes, in which version will the fix be available?

/*********************************************************************/

ref : http://refspecs.linux-foundation.org/LSB_4.0.0/LSB-Core-generic/LSB-Core-generic/baselib-gethostbyname-r-3.html

pstack:
#0 0x00007f5e92426c3d in poll () from /lib64/libc.so.6
#1 0x00007f5e8ffa4f62 in __res_context_send () from /lib64/libresolv.so.2
#2 0x00007f5e8ffa2394 in __res_context_query () from /lib64/libresolv.so.2
#3 0x00007f5e8ffa35ca in __res_context_search () from /lib64/libresolv.so.2
#4 0x00007f5e80a13208 in gethostbyname3_context () from /lib64/libnss_dns.so.2
#5 0x00007f5e80a13cdb in _nss_dns_gethostbyname_r () from /lib64/libnss_dns.so.2
#6 0x00007f5e9244ce6f in gethostbyname_r@@GLIBC_2.2.5 () from /lib64/libc.so.6
#7 0x00007f5e90e2667e in ldap_pvt_gethostbyname_a (name=name@entry=0x7fff152052c0 "mix-udmee-1-67784ddf44-5krt4", resbuf=resbuf@entry=0x7fff152052a0, buf=buf@entry=0x7fff15205290, result=result@entry=0x7fff15205298, herrno_ptr=herrno_ptr@entry=0x7fff1520528c) at util-int.c:453
#8 0x00007f5e90e268cd in ldap_pvt_get_fqdn (name=0x7fff152052c0 "mix-udmee-1-67784ddf44-5krt4", name@entry=0x0) at util-int.c:851
#9 0x00007f5e90e24afb in ldap_int_initialize (gopts=gopts@entry=0x7f5e91049120 <ldap_int_global_options>, dbglvl=dbglvl@entry=0x7fff15205490) at init.c:682
#10 0x00007f5e90e25841 in ldap_set_option (ld=0x0, option=20481, invalue=0x7fff15205490) at options.c:463
Comment 4 happy mu 2020-07-15 03:17:58 UTC
(In reply to Quanah Gibson-Mount from comment #2)
> This sounds like a DNS issue rather than a mutex issue.
> 
> If it is a mutex issue, you should be able to provide a stack trace
> demonstrating that case.


Root cause: the gethostbyname_r() function is deprecated; applications should use getaddrinfo() instead.

We tested this, and it works when getaddrinfo is used.
Comment 6 Quanah Gibson-Mount 2020-07-15 15:22:13 UTC
(In reply to happy mu from comment #5)
> hi Quanah 
> 
> yes, there is an issue in ldap DNS search. 
> 
> currently we use  *result=gethostbyname_r  in function
> ldap_pvt_gethostbyname_a.
> but gethostbyname_r is not support 'ipv6' and deprecated  , should use
> getaddrinfo() instead. 

OpenLDAP already uses getaddrinfo if it is available.

The only function that uses gethostbyname_r is the private internal function ldap_pvt_gethostbyname_a in libldap/util-int.c

That function is only called in one location, in libldap/os-ip.c function ldap_connect_to_host, and that is if and only if getaddrinfo was not found.

I.e., there does not appear to be anything here for OpenLDAP to do; it would appear your library is compiled incorrectly.

Regards,
Quanah
Comment 7 Ryan Tandy 2020-07-15 16:58:49 UTC
(In reply to Quanah Gibson-Mount from comment #6)
> The only function that uses gethostbyname_r is the private internal function
> ldap_pvt_gethostbyname_a in libldap/util-int.c
> 
> That function is only called in one location, in libldap/os-ip.c function
> ldap_connect_to_host, and that is if and only if getaddrinfo was not found.

The stack trace that was posted shows where it is used:

> #6 0x00007f5e9244ce6f in gethostbyname_r@@GLIBC_2.2.5 () from /lib64/libc.so.6
> #7 0x00007f5e90e2667e in ldap_pvt_gethostbyname_a (name=name@entry=0x7fff152052c0 "mix-udmee-1-67784ddf44-5krt4", resbuf=resbuf@entry=0x7fff152052a0, buf=buf@entry=0x7fff15205290, result=result@entry=0x7fff15205298, herrno_ptr=herrno_ptr@entry=0x7fff1520528c) at util-int.c:453
> #8 0x00007f5e90e268cd in ldap_pvt_get_fqdn (name=0x7fff152052c0 "mix-udmee-1-67784ddf44-5krt4", name@entry=0x0) at util-int.c:851

That is valid and true in RE24 (and I think still in master too), getaddrinfo is not used in getting our own FQDN, even if it's available.

This issue might actually be a better test case for answering my concern from 8376 since nss_ldap is not involved this time.

Where exactly is the 10 sec delay coming from, though? Even if gethostbyname doesn't support ipv6, I'd expect things to fail quickly rather than delay...
Comment 8 Quanah Gibson-Mount 2020-07-15 17:17:45 UTC
Ugh, I missed that call, thanks.

Definitely a problem then, marking for 2.4.51.
Comment 9 jessica.1.chen 2020-07-16 03:48:52 UTC
Per our configuration, NSS first checks /etc/hosts and then performs a DNS query when gethostbyname_r is invoked from ldap_set_option or ldap_create.

In our IPv4 test bed, gethostbyname_r returned successfully with the info found in /etc/hosts. 
====================================================
NSS/HOST configuration in IPv4 test bed
====================================================
bash-4.2$ hostname
udm5g-udmee-1-788bbbb649-f667m
bash-4.2$
bash-4.2$ cat /etc/hosts |grep udm5g-udmee-1-788bbbb649-f667m
192.168.1.11    udm5g-udmee-1-788bbbb649-f667m
bash-4.2$
bash-4.2$ cat /etc/nsswitch.conf |grep host
hosts:      files dns myhostname
bash-4.2$

However, in our IPv6 test bed, gethostbyname_r somehow did not recognize the entry in /etc/hosts, so it fell through to a DNS query, which timed out after about 10 seconds. That duration is the expected behavior; it is controlled by the following definitions in /usr/include/resolv.h.

# define RES_TIMEOUT            5       /* min. seconds between retries */
# define RES_DFLRETRY           2       /* Default #/tries. */
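
As a rough check of the 10-second figure, the two constants can be multiplied directly. This is only a sketch: it assumes a single unresponsive nameserver and that /etc/resolv.conf does not override the defaults through `options timeout:` or `options attempts:`. The function name `default_resolver_wait_seconds` is made up for illustration.

```c
#include <sys/types.h>
#include <netinet/in.h>
#include <arpa/nameser.h>
#include <resolv.h>

/* Estimated worst-case wait, in seconds, for one query against a single
 * unresponsive nameserver, using only the compiled-in defaults from
 * <resolv.h>. Runtime overrides in resolv.conf are not considered. */
static int default_resolver_wait_seconds(void)
{
    return RES_TIMEOUT * RES_DFLRETRY;
}
```

With the values quoted above (5 seconds per try, 2 tries) this gives 10 seconds, which matches the delay seen in the logs.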

With exactly the same configuration, getaddrinfo() returns successfully with the info from /etc/hosts.

====================================================
NSS/HOST configuration in IPv6 test bed
====================================================
bash-4.2$ hostname
mix-udmee-1-6f869bc959-px7gx
bash-4.2$
bash-4.2$ cat /etc/hosts |grep mix-udmee
fd00::7bd1:e96f:ccda:8aa0       mix-udmee-1-6f869bc959-px7gx
bash-4.2$
bash-4.2$ cat nsswitch.conf |grep hosts
hosts:      files dns myhostname
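
The difference between the two test beds can be reproduced outside of libldap with a small program that runs both lookups on the same name and times them. The sketch below is an illustration, not code from OpenLDAP: the helper names are hypothetical, and the six-argument gethostbyname_r() is the glibc form. One plausible explanation, not confirmed in this thread, is that gethostbyname_r() looks up only IPv4 addresses by default, so an /etc/hosts line that carries only an IPv6 address does not match and the lookup moves on to DNS.

```c
#define _GNU_SOURCE
#include <netdb.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <time.h>

/* Current monotonic time in seconds. */
static double now_seconds(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}

/* Legacy path: returns 0 if gethostbyname_r() finds an entry, else -1. */
static int lookup_legacy(const char *name)
{
    struct hostent he, *result = NULL;
    char buf[8192];
    int herr = 0;
    int rc;

    rc = gethostbyname_r(name, &he, buf, sizeof(buf), &result, &herr);
    return (rc == 0 && result != NULL) ? 0 : -1;
}

/* getaddrinfo() path: returns 0 if an IPv4 or IPv6 entry is found, else -1. */
static int lookup_modern(const char *name)
{
    struct addrinfo hints, *res = NULL;
    int rc;

    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_UNSPEC;      /* accept both address families */
    hints.ai_socktype = SOCK_STREAM;  /* avoid one result per socket type */

    rc = getaddrinfo(name, NULL, &hints, &res);
    if (rc == 0)
        freeaddrinfo(res);
    return rc == 0 ? 0 : -1;
}

/* Print the outcome and duration of both lookups for the given name. */
static void compare_lookups(const char *name)
{
    double t0 = now_seconds();
    int legacy = lookup_legacy(name);
    double t1 = now_seconds();
    int modern = lookup_modern(name);
    double t2 = now_seconds();

    printf("gethostbyname_r: %s in %.3f s\n",
           legacy == 0 ? "ok" : "failed", t1 - t0);
    printf("getaddrinfo:     %s in %.3f s\n",
           modern == 0 ? "ok" : "failed", t2 - t1);
}
```

Running `compare_lookups()` with the pod's own hostname in each test bed should show whether only the gethostbyname_r() path falls through to DNS.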
Comment 10 Quanah Gibson-Mount 2020-07-16 21:33:21 UTC
master:

  • 24b45f57 
by Howard Chu at 2020-07-16T21:08:36+01:00 
ITS#9287 use getaddrinfo for ldap_pvt_get_fqdn

If getaddrinfo is available, should use it here
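
For readers who want to see what such a lookup can look like, here is a standalone sketch of resolving a host's canonical name with getaddrinfo() and AI_CANONNAME. It is not the committed patch: `fqdn_sketch` is a hypothetical name, it uses plain strdup() rather than OpenLDAP's allocation wrappers, and the fallback behavior is an assumption made for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <netdb.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Return a malloc'd canonical name for `name`, or for the local host
 * when name is NULL. If resolution fails, a copy of the input name is
 * returned instead. The caller frees the result. Returns NULL only if
 * the allocation fails. */
static char *fqdn_sketch(const char *name)
{
    char hostbuf[256];
    struct addrinfo hints, *res = NULL;
    char *fqdn;
    int rc;

    if (name == NULL) {
        if (gethostname(hostbuf, sizeof(hostbuf)) == 0) {
            hostbuf[sizeof(hostbuf) - 1] = '\0';  /* ensure termination */
            name = hostbuf;
        } else {
            name = "localhost";
        }
    }

    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_UNSPEC;    /* accept IPv4 and IPv6 entries */
    hints.ai_flags = AI_CANONNAME;  /* request the canonical name */

    rc = getaddrinfo(name, NULL, &hints, &res);
    if (rc == 0 && res->ai_canonname != NULL)
        fqdn = strdup(res->ai_canonname);
    else
        fqdn = strdup(name);

    if (rc == 0)
        freeaddrinfo(res);
    return fqdn;
}
```

Because the hints use AF_UNSPEC, a host whose /etc/hosts line carries only an IPv6 address can still be resolved from the files source without falling through to DNS.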
Comment 11 Quanah Gibson-Mount 2020-07-16 21:35:03 UTC
RE24:

commit c91cafcf10b9e4591ea666beda571b9a3d4d55bc
Author: Howard Chu <hyc@openldap.org>
Date:   Thu Jul 16 21:08:36 2020 +0100

    ITS#9287 use getaddrinfo for ldap_pvt_get_fqdn

    If getaddrinfo is available, should use it here
Comment 12 happy mu 2020-07-20 10:38:56 UTC
So, will this fix (using getaddrinfo instead of gethostbyname_r) be included in 2.4.51?

(In reply to Quanah Gibson-Mount from comment #11)
> RE24:
> 
> commit c91cafcf10b9e4591ea666beda571b9a3d4d55bc
> Author: Howard Chu <hyc@openldap.org>
> Date:   Thu Jul 16 21:08:36 2020 +0100
> 
>     ITS#9287 use getaddrinfo for ldap_pvt_get_fqdn
> 
>     If getaddrinfo is available, should use it here
Comment 13 Quanah Gibson-Mount 2020-07-20 15:01:35 UTC
(In reply to happy mu from comment #12)
> so, will you fix this issue to use getaddrinfo instead of gethostbyname_r in
> 2.4.51  ? 

Correct, that's why the target release on the bug is 2.4.51 and why it's been added to RE24, and why it's in the 2.4.51 Changelog.