[Date Prev][Date Next]
Re: (ITS#6276) paused pool can deadlock if writers are waiting
- To: openldap-its@OpenLDAP.org
- Subject: Re: (ITS#6276) paused pool can deadlock if writers are waiting
- From: firstname.lastname@example.org
- Date: Wed, 26 Aug 2009 08:45:30 GMT
- Auto-submitted: auto-generated (OpenLDAP-ITS)
Hallvard B Furuseth wrote:
> This is a quick abstract answer, I haven't dived into the code to see
> what my suggestions would mean in practice. Anyway:
> It sounds to me like this problem cannot be fully prevented when network
> operations share the same pool queue as everything else, and we have
> operations that can wait for other operations. All pool threads can be
> occupied with LDAP-level operations, so that network-level operations
> which the LDAP-level operations depend on can get blocked. If slapd has
> a design which even tries to guarantee forward progress, I'm not aware
> of it.
> So LDAP-level operations ought to leave at least one thread free for for
> network-level operations. Not necessary a designated thread: A thread
> moving from network-level to LDAP-level operation like
> connection_read_thread() does could first check that at least one other
> thread remains available for network-level operation.
> Beyond that, my immediate reaction was that pauses are implemented at
> the wrong level and/or need to be split up in different types of pauses.
> There is no good reason cn=config's need to have slapd config variables
> for itself to affect network operations - nor, I hope, affect pool-level
> operations like pool_purgekey().
I suppose implementing things this way can appear to be too blunt. But the
amount of locks required to do it at a finer level is, IMO, unmanageable. The
fact that you can change global slapd config parameters (such as the number of
threads, size of sockbuf buffers, etc.) makes it inherently safe for
*anything* else to be active while config changes are being made. And sifting
thru each variable to decide how sensitive they are, and when they are unsafe,
will inevitably lead to requiring locks on every single piece of configuration
data. (Want to look up an attribute type, or objectclass? Or a database
suffix? etc. etc. etc... There are countless things we do arbitrarily that
simply won't work if we allow arbitrary threads to continue to run while their
configurations are changed from under them.)
> E.g. cn=config could use a slapd_config_pause() call to ask for lone
> access to the config variables instead of thread_pool_pause() call.
> Slapd operations that must respect such pauses should thus call a
> slapd_config_pausecheck() macro/function, not thread_pool_pausecheck()
> or depend on the pool to respect slapd-level pauses.
> connection_read_thread() can then go ahead and use the connection
> independently of those pauses. If it reaches connection_operation(),
> it'll need to check for slapd pause.
Yes, I've thought about this approach too. That makes the pause mechanisms
Another possibility is to just try to read 1 byte in the listener thread, to
detect the hangup there when we have no other means to discover it. We would
have to make sure to be able to unget this byte back to the bottom of the
sockbuf stack if there's valid data. This will affect the throughput of the
listener thread, but it may not be too terrible a hit.
-- Howard Chu
CTO, Symas Corp. http://www.symas.com
Director, Highland Sun http://highlandsun.com/hyc/
Chief Architect, OpenLDAP http://www.openldap.org/project/