Re: slapcat hangs v2.2.24

Thank you, that explanation helps. There are probably a lot of stale locks in there by now, because I have aborted slapcat a number of times.

Let me ask this: would doing a db_verify also clear this up? I ask because last week, when I was having this problem after a test slapadd load, I ran db_verify, and after that slapcat seemed to work every time.

Howard Chu wrote:

Curt Blank wrote:

Thanks, I'll give that a shot when I can schedule an outage, hopefully overnight tonight.

Further info: This is a new LDAP server just brought online with v2.2.24; as such it was loaded from scratch using ldapadd without any errors. Immediately after starting slapd this slapcat problem was there, so I'm just curious what db_recover is expected to do. I would think the db would be pristine from just being loaded.

Generally it is safe to use slapcat and slapd simultaneously when using back-bdb and back-hdb, but if slapd is busy doing database updates, it will have outstanding locks that may impede slapcat's progress. (And vice versa, slapcat only obtains read locks but it may temporarily prevent slapd from obtaining a write lock that it needs.)

If you see a hang like this when slapd is not running, that generally means there are stale locks left over in the database environment from an abnormal termination of some previous run. Also note that in the current version, interrupting slapcat before it completes will also leave stale locks in the database. That is what db_recover will clear up.
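One way to check for leftover locks before running recovery is to look at the lock-region statistics of the database environment. The following is only a sketch: the directory /var/lib/ldap is an assumed location (use whatever the "directory" line in your slapd.conf names), show_lock_stats is a hypothetical helper name, and some distributions install the Berkeley DB utility under a versioned name rather than plain db_stat.

```shell
#!/bin/sh
# Assumption: the back-bdb environment lives here; take the real path
# from the "directory" directive in slapd.conf.
DB_DIR="${DB_DIR:-/var/lib/ldap}"

# Hypothetical helper: print Berkeley DB lock-region statistics.
show_lock_stats() {
    # Locks held while slapd is running are normal, so only warn.
    if pgrep -x slapd >/dev/null 2>&1; then
        echo "note: slapd is running, so some locks are expected" >&2
    fi
    # -c selects the locking subsystem, -h names the environment directory.
    db_stat -c -h "$DB_DIR"
}
```

Calling show_lock_stats with slapd stopped and no slapcat running should show an idle lock region; lockers or locks still listed at that point would be consistent with the stale locks described above.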

In the next release (2.3.xx) we have added automatic detection of abnormal shutdowns, which will make all recovery fully automated, and this kind of situation should never arise any more.

Howard Chu wrote:

If slapd is running, stop it first. Then do a db_recover on the database directory, then try it again.
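The steps above can be sketched as a small shell function. This is an illustration under assumptions, not a tested procedure for any particular system: /var/lib/ldap is an assumed database directory (check the "directory" line in slapd.conf), backup.ldif is a placeholder output name, recover_and_dump is a hypothetical helper, and db_recover may be installed under a versioned name on some distributions. The function does not stop slapd itself, because how that is done varies by installation; it only refuses to continue while slapd is running.

```shell
#!/bin/sh
# Assumptions: adjust all three values for your installation.
DB_DIR="${DB_DIR:-/var/lib/ldap}"                     # back-bdb environment directory
SLAPD_CONF="${SLAPD_CONF:-/etc/openldap/slapd.conf}"  # config file used in this thread
OUT="${OUT:-backup.ldif}"                             # placeholder LDIF output file

# Hypothetical helper: run recovery, then retry the dump.
recover_and_dump() {
    # db_recover should only be run when nothing else has the environment open.
    if pgrep -x slapd >/dev/null 2>&1; then
        echo "slapd is still running; stop it first" >&2
        return 1
    fi
    # Run recovery on the environment; this clears stale locks.
    db_recover -v -h "$DB_DIR" || return 1
    # Try the dump again.
    slapcat -f "$SLAPD_CONF" -l "$OUT"
}
```

After stopping slapd by whatever means your system provides, run recover_and_dump and restart slapd once slapcat has returned.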

Curt Blank wrote:

OK, this problem is back. It is a real problem and I need help!

Whether I do a:

slapcat -f /etc/openldap/slapd.conf -l 20050423-0155-new.ldif

or a:

slapcat -f /etc/openldap/slapd.conf

it hangs, never exits, and never returns to the command-line prompt.

Curt Blank wrote:

Never mind. Now I can't make it fail. I did nothing to fix it. It was there for 3 days and now it's not.

I suspect something else I was running was screwing up my environment, shell, or process, because I was seeing other weird behavior; I just put two and two together. If and when I see that other weird behavior again, I will try a slapcat at that point and see. I cannot force that other behavior at the moment either; my fix there was to ^D and su again to make it go away. I couldn't even read man pages when it acted up.

I do not like unexplained anomalies, so I will pursue this.

Curt Blank wrote:

This appears to be some sort of buffer flush issue: even though I see it get to the end, the last 4 entries that the debug info shows it processed are not in the output file.

I really could use some help here; this is a show stopper.

Curt Blank wrote:

I'm trying to do a slapcat and it dumps all the entries but never exits or returns to the command-line prompt. I put it in debug mode (-d 1) and it takes under a minute to do the dump and then just sits there. I'm pretty sure it dumped all the entries because I see the entry that is usually the last one dumped. I'm not doing anything fancy, just:

slapcat -d 1 -f /etc/openldap/slapd.conf -l 20050419-1326.ldif

and the last thing it outputs after what I know is the last user entry is:

entry_decode: "cn=ldapsync,o=uwm.edu"
<= entry_decode(cn=ldapsync,o=uwm.edu)
slapcat shutdown: initiated
====> bdb_cache_release_all
slapcat shutdown: freeing system resources.

then sits there. Any ideas what is happening? I let it sit for 45 minutes one time and it never came back. I've tried it with slapd running and with it not running, same result.