
Re: commit: ldap/tests/scripts defines.sh test035-meta test036-meta-concurrency



Michael Ströder wrote:

Pierangelo Masarati wrote:


test036 should now be ready to be enabled.  I'd appreciate if anybody
can try it out and report;
[..]
cd tests
DB_CONFIG=../servers/slapd/DB_CONFIG SLAPD_DEBUG=256 TEST_META=yes ./run test036

What's mandatory is

TEST_META=yes ./run test036



It sometimes succeeds (as stated in the result message) but fails every now and then:

------------------------- begin -------------------------
./scripts/test036-meta-concurrency: line 174: 23927 Segmentation fault $SLAPD -f $CONF3 -h $URI3 -d $LVL $TIMING >$LOG3 2>&1
Using ldapsearch to retrieve all the entries...
./scripts/test036-meta-concurrency: line 188: kill: (23927) - No such process
------------------------- end -------------------------


This needs investigating; there is probably an error in how the test script records the PID of the process.
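
A minimal sketch of the kind of PID bookkeeping in question, assuming a POSIX shell. The helper names start_bg and stop_bg are hypothetical and are not taken from the test scripts; `$!` holds the PID of the most recent background job, and `kill -0` only probes whether that process still exists.

```shell
# Hypothetical helpers; not part of tests/scripts.

# Start a command in the background and record its PID in BG_PID.
start_bg() {
    "$@" &
    BG_PID=$!
}

# Stop the recorded process only if it is still alive.
# Returns 0 if it was stopped, 1 if it had already exited (e.g. crashed).
stop_bg() {
    if kill -0 "$BG_PID" 2>/dev/null; then
        kill "$BG_PID"
        # Reap the process; ignore its non-zero "killed" status.
        wait "$BG_PID" 2>/dev/null || true
        return 0
    fi
    echo "process $BG_PID already gone" >&2
    return 1
}
```

With a guard like this, a process that has already died shows up as an explicit "already gone" message instead of a bare "kill: No such process".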

Various glibc errors:

------------------------- begin -------------------------
*** glibc detected *** double free or corruption (fasttop): 0x0835a8c8 ***

*** glibc detected *** free(): invalid pointer: 0x082de000 ***
------------------------- end -------------------------


This is something I'd like to be able to trace; can you generate a core file and run with MALLOC_CHECK_=2 so that the test aborts immediately? I've been running it many times and haven't hit any of these errors.
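
One way to set this up, as a sketch only: the wrapper name run_checked is hypothetical. `ulimit -c unlimited` lifts the core file size limit for the subshell, and MALLOC_CHECK_=2 asks glibc to abort on the first heap error it detects.

```shell
# Hypothetical wrapper: run a command with core dumps enabled and
# glibc heap checking set to abort on the first detected error.
run_checked() {
    (
        # Raising the limit can fail if the hard limit is lower; carry on anyway.
        ulimit -c unlimited 2>/dev/null || true
        MALLOC_CHECK_=2
        export MALLOC_CHECK_
        "$@"
    )
}

# Usage, from the tests directory:
#   run_checked env TEST_META=yes ./run test036
```

The subshell keeps both the raised limit and the environment variable from leaking into the calling shell.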

Even if the test succeeds, some messages look strange to me:

------------------------- begin -------------------------
ldap_search: No such object (32)


This is (sort of) OK; in some cases, you may get that error if the entry cannot be fetched. I'm trying to turn it into LDAP_BUSY or similar.

ldap_search: No such object (32)
ldap_read: Server is busy (51)
PID=24716 - Read done (51).
PID=24735 - Modify done (0).
ldap_search: No such object (32)
ldap_search: No such object (32)
[...many of these messages...]
------------------------- end -------------------------

I have SuSE Linux 9.3:
- gcc version 3.3.5 20050117 (prerelease) (SUSE Linux)
- glibc-2.3.4-23
- kernel-default-2.6.11.4-20a
- db-4.3.27-3 (Berkeley-DB 4.3.27)


In general, all the LDAP_BUSY handling and asynchronous calls in the test suite were added to track problems with back-meta internals hanging under heavy load (ITS#3464). When back-meta uses an internal database as a target, it consumes one extra thread per connection, and using synchronous calls would lead to a deadlock. So there's an internal fail-safe mechanism that gives up after a certain number of retries and returns LDAP_BUSY to the client. This is behavior you won't see with, e.g., test008.

I want to improve that fix by making it configurable. For instance, if an occasional slow response is acceptable, back-meta should retry "forever", as long as there are no local targets and, as such, no deadlock is possible. In other cases the timing of the response may be essential; there, an immediate LDAP_BUSY would be the best solution.
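
The two policies can be illustrated with a small shell sketch. This is not back-meta's code (that lives in C inside slapd); the function name retry_or_busy and the convention that a limit of 0 means "retry forever" are assumptions made for the example. The value 51 is the LDAP_BUSY result code shown in the output above.

```shell
# LDAP result code for "busy", as reported by ldap_read above.
LDAP_BUSY=51

# Hypothetical: run "$@" up to $1 times; a limit of 0 means retry forever.
# Returns 0 on the first success, LDAP_BUSY once the retries are exhausted.
retry_or_busy() {
    max=$1
    shift
    n=0
    while :; do
        "$@" && return 0
        n=$((n + 1))
        if [ "$max" -gt 0 ] && [ "$n" -ge "$max" ]; then
            return "$LDAP_BUSY"
        fi
    done
}
```

A limit of 1 gives the "immediate LDAP_BUSY" behavior, while a limit of 0 is only safe when the operation cannot deadlock, which is the no-local-targets case described above.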

Usually, under heavy load (test036 plus a few instances of "ls -R /" and "ping -f"), I see slapd-read (the slapd-tester client) return a few LDAP_BUSY results; I've never seen a failure of the write clients, and I've never seen the slapd-search clients fail with noSuchObject.

If you can provide further feedback, I'd be happy to fix these issues as well.

Thanks, p.

--
Pierangelo Masarati
mailto:pierangelo.masarati@sys-net.it



   SysNet - via Dossi,8 27100 Pavia Tel: +390382573859 Fax: +390382476497