Issue 3313 - Slurpd not robust against errors
Summary: Slurpd not robust against errors
Status: VERIFIED FIXED
Alias: None
Product: OpenLDAP
Classification: Unclassified
Component: slapd
Version: unspecified
Hardware: All All
Importance: --- normal
Target Milestone: ---
Assignee: OpenLDAP project
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2004-09-01 11:56 UTC by steffen@kdab.net
Modified: 2009-02-17 05:26 UTC

See Also:


Description steffen@kdab.net 2004-09-01 11:56:30 UTC
Full_Name: Steffen Hansen
Version: 2.2.14
OS: OpenPKG-2.1 on Linux
URL: ftp://ftp.openldap.org/incoming/
Submission from: (NULL) (80.63.196.222)


I have a problem with slurpd exiting in case of unexpected results:

My setup consists of one master LDAP server and one slave server, plus one
additional replica per server (kolabd, a custom app listening for LDAP
changes).

As far as I can see from the code, slurpd starts a thread for each replica, and
when all those threads exit, slurpd exits. If slurpd receives an unexpected
response from a replica, or the connection is torn down while a transaction is in
progress, the replica thread fails and replication for that node is dead until
slurpd is restarted. Once this has happened for all replicas, slurpd exits:

ldap_write: want=36, written=36
  0000:  30 22 02 01 03 42 00 a0  1b 30 19 04 17 32 2e 31   0"...B...0...2.1
  0010:  36 2e 38 34 30 2e 31 2e  31 31 33 37 33 30 2e 33   6.840.1.113730.3
  0020:  2e 34 2e 32                                        .4.2
ldap_free_connection: actually freed
end replication thread for 127.0.0.1:9999
slurpd: terminated.
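
For reference, the 36 bytes in the ldap_write dump above can be decoded by hand.
The following is a minimal sketch, not slurpd or libldap code: the helper names
(read_tlv, decode_ldap_message) are made up for illustration, and the reader only
handles short-form BER lengths, which is all this packet uses.

```python
# Bytes copied verbatim from the ldap_write hex dump in this report.
PACKET = bytes.fromhex(
    "30 22 02 01 03 42 00 a0 1b 30 19 04 17 32 2e 31 "
    "36 2e 38 34 30 2e 31 2e 31 31 33 37 33 30 2e 33 "
    "2e 34 2e 32"
)


def read_tlv(buf, pos):
    """Read one BER tag-length-value at pos; return (tag, value, next_pos).

    Hypothetical helper: supports single-byte tags and short-form lengths only.
    """
    tag = buf[pos]
    length = buf[pos + 1]
    if length >= 0x80:
        raise ValueError("long-form BER length not supported in this sketch")
    start = pos + 2
    return tag, buf[start:start + length], start + length


def decode_ldap_message(packet):
    """Return (message_id, operation_tag, control_oids) for a simple LDAPMessage."""
    tag, body, _ = read_tlv(packet, 0)
    if tag != 0x30:
        raise ValueError("expected an outer SEQUENCE")
    _, msgid_bytes, pos = read_tlv(body, 0)      # INTEGER messageID
    op_tag, _, pos = read_tlv(body, pos)         # protocolOp
    oids = []
    if pos < len(body):                          # optional [0] Controls
        _, controls, pos = read_tlv(body, pos)
        cpos = 0
        while cpos < len(controls):
            _, control, cpos = read_tlv(controls, cpos)
            _, oid, _ = read_tlv(control, 0)     # controlType as OCTET STRING
            oids.append(oid.decode("ascii"))
    return int.from_bytes(msgid_bytes, "big"), op_tag, oids


msgid, op_tag, oids = decode_ldap_message(PACKET)
print(msgid, hex(op_tag), oids)
```

Tag 0x42 is [APPLICATION 2], which the LDAP protocol assigns to UnbindRequest,
and 2.16.840.1.113730.3.4.2 is the ManageDsaIT control OID. So the last thing
written before the connection was freed appears to be an unbind with message ID 3.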


Is there any way to get around this? I'd like slurpd to just try to reconnect if
something went wrong. Please contact me if you need additional information.
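
To make the request concrete, this is roughly the behaviour I have in mind for a
replica thread. It is only a sketch in Python, not a patch against slurpd:
run_replica, connect and apply_changes are hypothetical names, and the retry
limit and backoff values are arbitrary assumptions.

```python
import time


def run_replica(connect, apply_changes, max_attempts=5, base_delay=1.0,
                max_delay=60.0, sleep=time.sleep):
    """Retry one replica's work instead of ending its thread on the first error.

    connect() is assumed to return a connection object and apply_changes(conn)
    to push pending changes; both are placeholders, not slurpd functions.
    Returns the number of attempts that were needed.
    """
    delay = base_delay
    for attempt in range(1, max_attempts + 1):
        try:
            conn = connect()
            apply_changes(conn)
            return attempt
        except OSError:
            # Covers torn-down connections (ConnectionError is an OSError).
            if attempt == max_attempts:
                raise
            sleep(delay)
            delay = min(delay * 2, max_delay)  # exponential backoff, capped
```

With something like this, a replica that drops the connection mid-transaction
would be retried after a short pause rather than staying dead until slurpd is
restarted.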

Comment 1 steffen@kdab.net 2004-09-01 22:06:56 UTC
Ok, it turns out that slurpd does not kill itself -- it is signalled to 
death and restarted by an init script. But still, it seems that under 
some circumstances it is possible to "hang" slurpd so it stops 
replicating.

I admit this is difficult for you to reproduce, but I managed to get 
into a state where the slurpd pid file was removed (by slurpd) while I 
still had a slurpd process running. This process did not respond to 
SIGINT and required a SIGKILL to terminate.

Comment 2 Kurt Zeilenga 2004-09-02 04:45:15 UTC
changed state Open to Closed
Comment 3 Howard Chu 2009-02-17 05:26:56 UTC
moved from Incoming to Archive.Incoming