
Re: make test hangs

Hi Howard:

2.1.8 is the version we started with and tried out here. Once I found the hang, I tried the newest version available at the time, 2.1.12, and it had the same problem. Since I had not received a reply from the group saying that anyone was looking into the problem or working on a fix, I decided to debug it myself and figure out a fix instead of waiting for an answer. It was not until yesterday that Mike, who had the same problem, told me that it goes away with the new release, 2.1.14.


Cindy Wang

Howard Chu wrote:

This bug has been fixed in CVS already, although not yet released.
Why are you building such an old release as 2.1.8?

-- Howard Chu
Chief Architect, Symas Corp. Director, Highland Sun
http://www.symas.com http://highlandsun.com/hyc
Symas: Premier OpenSource Development and Support

-----Original Message-----
From: owner-openldap-software@OpenLDAP.org
[mailto:owner-openldap-software@OpenLDAP.org]On Behalf Of Cindy Wang


About a week ago, I reported that "make test" hangs for openldap.2.1.8 on a Tru64 UX5.1 machine, which is a 64-bit machine. By running a debugger against the ldapsearch code, I found where the code hangs. The following is the call stack at the point where it hangs:

0 __read(0x3ff801becd0, 0xffffffffffffffe6, 0x334, 0x3ffc0087f58, 0x120070348) [0x3ff800cd848]
1 sb_stream_read(sbiod = 0x14002a120, buf = 0x140033000, len = 16384) ["sockbuf.c":490, 0x12006fe98]
2 sb_rdahead_read(sbiod = 0x14002a150, buf = 0x14002b037, len = 3) ["sockbuf.c":651, 0x12007039c]
3 sb_debug_read(sbiod = 0x14002a270, buf = 0x14002b037, len = 17) ["sockbuf.c":816, 0x120070ad4]
4 ber_int_sb_read(sb = 0x14002a0f0, buf = 0x14002b037, len = 17) ["sockbuf.c":405, 0x12006fc68]
5 ber_get_next(sb = 0x14002a0f0, len = 0x11fff9c18, ber = 0x14002b020) ["io.c":482, 0x12006bdcc]
6 try_read1msg(ld = 0x140028c00, msgid = -1, all = 1, sb = 0x14002a0f0, lc = 0x140031200, result = 0x11fff9e20) ["result.c":438, 0x1200547fc]
7 wait4msg(ld = 0x140028c00, msgid = -1, all = 1, timeout = (nil), result = 0x11fff9e20) ["result.c":304, 0x120054418]
8 ldap_result(ld = 0x140028c00, msgid = -1, all = 1, timeout = (nil), result = 0x11fff9e20) ["result.c":113, 0x120053f24]
9 dosearch(ld = 0x140028c00, base = 0x14002a000 = "o=University of Michigan, c=US", scope = 2, filtpatt = (nil), value = 0x1400005c8 = "(objectclass=*)", attrs = 0x11fffc078, attrsonly = 0, sctrls = (nil), cctrls = (nil), timeout = (nil), sizelimit = -1) ["ldapsearch.c":1154, 0x12003fd3c]
10 main(argc = 13, argv = 0x11fffc018) ["ldapsearch.c":1063, 0x12003f97c]

I also turned on the debug flag (-d -1). Looking at the resulting output, I realized that the problem is with the last response PDU to the search request (the last 14 bytes), which indicates that all the entries have been returned and the server has completed the search request. It is this last message that the client does not process properly; the other 19 entries are processed correctly. What happens is that the client tries to read 17 bytes for the header (the first read) while only 14 bytes are available. That is where the read blocks.

I tested a possible fix: changing BER_TAG_T and BER_LEN_T from long to int. With that change the client reads 9 bytes for the first (header) read on 64-bit machines as well as on 32-bit machines, so it no longer waits for 17 bytes when only 14 bytes are returned in the last packet. The hanging problem is gone now. I also applied the fix for the segmentation fault that occurred on the 64-bit machine, which Mike told me is available in the openldap.2.1.14 release. All 16 test cases now run O.K. Thanks.
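To make the arithmetic above concrete, here is a minimal sketch in C. It assumes that the header read covers one leading byte plus one length field and one tag field; this is an assumption chosen because it matches the 17-byte and 9-byte figures reported above, not something taken from the library source. The helper names header_read_size and bytes_still_missing are hypothetical and are not part of liblber.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not a liblber function): size of the client's
 * first (header) read, ASSUMING it spans one leading byte plus a
 * length field and a tag field of the given sizes.
 */
size_t header_read_size(size_t len_field_size, size_t tag_field_size)
{
    return 1 + len_field_size + tag_field_size;
}

/*
 * Hypothetical helper: number of bytes the client still waits for
 * when it wants `wanted` bytes but the peer has sent only `available`.
 * A result greater than zero means a blocking read will stall.
 */
size_t bytes_still_missing(size_t wanted, size_t available)
{
    return wanted > available ? wanted - available : 0;
}
```

With 8-byte fields (long on this 64-bit machine) header_read_size(8, 8) gives 17, and bytes_still_missing(17, 14) gives 3, which agrees with the len = 3 request seen in the sb_rdahead_read frame of the stack trace. With 4-byte fields (int) the header read is 9 bytes, which the final 14-byte message satisfies.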


Cindy Wang

Cindy Wang wrote:


I am finally able to build on Tru64 Unix V5.1. But when I run the test of the build (make test), the first time nothing was successful: it hangs at test000. The second time, test000 succeeded, but test001 hangs. The following is the output I got for the second test run and later ones:

cd tests; make test
ln: ./data and ./data are identical.
*** Exit 1 (ignored)
ln: ./schema and ../servers/slapd/schema are identical.
*** Exit 1 (ignored)
ucdata/liblunicode: File exists
*** Exit 1 (ignored)
Initiating LDAP tests for BDB...

Executing all LDAP tests...
Test Directory: .
Backend: bdb
Starting test000-rootdse ...

running defines.sh
Datadir is ./data
Cleaning up in ./test-db...
Starting slapd on TCP/IP port 9009...
../servers/slapd/slapd -s0 -f ./test-db/slapd.conf -h ldap://localhost:9009/ -d
Using ldapsearch to retrieve the root DSE...
objectClass: top
objectClass: OpenLDAProotDSE
structuralObjectClass: OpenLDAProotDSE

namingContexts: o=OpenLDAP Project,l=Internet
supportedControl: 2.16.840.1.113730.3.4.2
supportedControl: 1.2.826.0.1.334810.2.3
supportedLDAPVersion: 3
supportedSASLMechanisms: GSSAPI
supportedSASLMechanisms: OTP
supportedSASLMechanisms: DIGEST-MD5
supportedSASLMechanisms: CRAM-MD5
subschemaSubentry: cn=Subschema

Test succeeded
./scripts/test000-rootdse completed OK.
waiting 10 seconds for things to exit

Starting test001-slapadd ...

running defines.sh
Datadir is ./data
Cleaning up in ./test-db...
Running slapadd to build slapd database...
Starting slapd on TCP/IP port 9009...
Using ldapsearch to retrieve all the entries...

Does anyone have any insight into what might be going on? I actually tried building with and without the thread option, and the result is the same. I also tried different build options (i.e. with the -pthread option and linking with the -lpthread library), but it doesn't help. I am running out of ideas, so if anyone has any insight into this, please help.

Cindy Wang