
(more) concurrency issues in libldap_r?



Recently, I've been heavily stressing back-ldap and back-meta for a
project we're working on.  As a result, both have been strengthened
quite a bit, but now it appears that I'm facing some odd and very rare
concurrency issues in libldap.  I've done a little work in that area as
well, and the issues are now much less frequent; what seems to be left
is a few locking problems when something goes wrong, e.g. a referral
cannot be chased, or a server goes down.  They typically show up with
test039 run for hours (-l 100000) on heavily loaded SMP architectures with
no logging (-d 0 -s 0).  To try and work it out, I've replaced most of the
calls to

        ldap_pvt_thread_mutex_lock( &some_mutex );

with

        while ( ldap_pvt_thread_mutex_trylock( &some_mutex ) )
                ldap_pvt_thread_yield();
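
For readers without the libldap_r wrappers at hand, the same replacement can be sketched with plain POSIX threads.  This is only an illustration under assumptions: pthread_mutex_trylock() and sched_yield() stand in for ldap_pvt_thread_mutex_trylock() and ldap_pvt_thread_yield(), and lock_with_yield(), worker(), run_trylock_demo() and the demo_* variables are hypothetical names that do not appear in libldap.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>

#define NTHREADS 4
#define NITERS   100000

/* Hypothetical stand-ins for "some_mutex" and the data it protects. */
static pthread_mutex_t demo_mutex = PTHREAD_MUTEX_INITIALIZER;
static long demo_counter = 0;

/* Spin on trylock, giving up the CPU between attempts.
 * pthread_mutex_trylock() returns 0 once the lock is acquired and a
 * nonzero error code (EBUSY) while another thread holds it. */
static void lock_with_yield(pthread_mutex_t *m)
{
    while (pthread_mutex_trylock(m) != 0)
        sched_yield();
}

/* Each worker increments the shared counter under the mutex. */
static void *worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < NITERS; i++) {
        lock_with_yield(&demo_mutex);
        demo_counter++;
        int rc = pthread_mutex_unlock(&demo_mutex);
        assert(rc == 0);
    }
    return NULL;
}

/* Start NTHREADS workers, wait for them, and return the final count. */
long run_trylock_demo(void)
{
    pthread_t tid[NTHREADS];

    demo_counter = 0;
    for (int i = 0; i < NTHREADS; i++) {
        int rc = pthread_create(&tid[i], NULL, worker, NULL);
        assert(rc == 0);
    }
    for (int i = 0; i < NTHREADS; i++)
        pthread_join(tid[i], NULL);
    return demo_counter;
}
```

The loop gives the same mutual exclusion as a blocking lock; what changes is how a thread waits (polling and yielding instead of sleeping on the mutex), which alters the timing and may hide a race rather than remove it.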

Replacing the blocking lock with the trylock loop seems to cure the
problem (so far).  The other issues mostly consisted of
ldap_int_flush_request() trying to flush an invalid Sockbuf; this seems
to have disappeared as well, although I haven't done anything explicit
to fix it (none of my attempts had any effect), except for locking
ld->ld_req_mutex before calling ldap_free_connection() inside
try_read1msg(); all other instances of that call already occurred under
the same lock.
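
The locking rule mentioned above (every path that frees a connection does so under the same mutex) can be sketched as follows.  This is not the libldap code: struct demo_conn, req_mutex, free_connection_locked(), reader() and run_free_demo() are hypothetical stand-ins for an LDAPConn, ld->ld_req_mutex, ldap_free_connection() and its callers, and the reference count is an assumption made only to keep the example small.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for a shared connection object. */
struct demo_conn { int refcnt; };

/* Hypothetical stand-in for ld->ld_req_mutex. */
static pthread_mutex_t req_mutex = PTHREAD_MUTEX_INITIALIZER;
static struct demo_conn *shared_conn = NULL;
static int free_count = 0;

/* Drops one reference and frees the connection on the last one.
 * Must be called with req_mutex held, by every caller. */
static void free_connection_locked(void)
{
    if (shared_conn != NULL && --shared_conn->refcnt == 0) {
        free(shared_conn);
        shared_conn = NULL;
        free_count++;
    }
}

/* Each thread takes the mutex before touching the connection. */
static void *reader(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&req_mutex);
    free_connection_locked();
    pthread_mutex_unlock(&req_mutex);
    return NULL;
}

/* Returns how many times the connection was actually freed. */
int run_free_demo(int nthreads)
{
    pthread_t tid[16];

    assert(nthreads > 0 && nthreads <= 16);
    shared_conn = malloc(sizeof *shared_conn);
    assert(shared_conn != NULL);
    shared_conn->refcnt = nthreads;
    free_count = 0;
    for (int i = 0; i < nthreads; i++) {
        int rc = pthread_create(&tid[i], NULL, reader, NULL);
        assert(rc == 0);
    }
    for (int i = 0; i < nthreads; i++)
        pthread_join(tid[i], NULL);
    return free_count;
}
```

If even one caller skipped the lock, two threads could decrement the count or free the object at the same time, leaving the others with a dangling pointer; that is the kind of failure an invalid Sockbuf would be consistent with.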

Would these fixes be acceptable?  Did I overshoot the problem?  I guess
only selected calls to mutex_lock() actually needed to be turned into a
trylock...

p.

-- 
Pierangelo Masarati
mailto:pierangelo.masarati@sys-net.it


    SysNet - via Dossi,8 27100 Pavia Tel: +390382573859 Fax: +390382476497