
Re: Continued instability on Solaris 7



At 09:27 AM 2002-08-05, Alan Sparks wrote:
>Didn't receive any response to earlier postings, so one last try to see if
>anyone else is having problems (or success!) on Solaris 7 with OpenLDAP
>2.1.3.

Haven't a clue about Solaris (SunOS 5.x), but I'll make a few
general comments.

>I've upgraded to ENG_RELEASE_2_1 as of about a week ago,

We've made a number of updates to re21 since then...

>and made sure I
>build with BDB 4.0.14 statically linked.  Built with GCC 2.95.2.

You might want to verify that BDB is working properly.  Sleepycat
includes a test suite as part of the BerkeleyDB distribution.
With GCC, use "-O -g" (i.e., no -O2 or greater).

>Replication stops on random occasions, either after a couple of days or a
>couple of hours.  The master (very lightly loaded) shows entries written
>to the replog, and the slurpd.status log shows entries also picked up.
>
>Unfortunately, nothing updates in the slave (same software build). 

What do the slave's logs say is happening?
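
To know which stretch of the slave's log to read, it can help to
turn the timestamps in slurpd.status into readable dates.  The
following is only a rough sketch: it assumes each status line has
the layout host:port:unix_time:sequence, and the function name and
the sample line are made up for illustration.  Check the layout
against your own file before relying on it.

```python
from datetime import datetime, timezone


def last_replication_times(status_lines):
    """Map 'host:port' to the UTC time of the last entry recorded for it."""
    result = {}
    for line in status_lines:
        line = line.strip()
        if not line:
            continue
        # Assumed layout (not verified against your build):
        #   host:port:unix_time:sequence
        host, port, stamp, _seq = line.rsplit(":", 3)
        result[f"{host}:{port}"] = datetime.fromtimestamp(
            int(stamp), tz=timezone.utc
        )
    return result


if __name__ == "__main__":
    # Hypothetical sample line; replace with the contents of slurpd.status.
    sample = ["slave.example.com:389:1028539620:0"]
    for replica, when in last_replication_times(sample).items():
        print(replica, when.isoformat())
```

With the time of the last entry in hand, look at what the slave
logged at and just after that moment.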

>Sending a SIGTERM to the slave slapd causes it to stop responding and
>suddenly begin consuming 80% of CPU, until sent a SIGKILL.

It's likely trying hard to complete an update.  The SIGKILL,
of course, causes it to drop the update on the floor.

>Restarting it afterwards gets back to normal replication (for a
>while).  I have yet to find something in the log indicating the issue.

>I've rebuilt and reindexed the databases (LDBM backend) to no avail.  Is
>the BDB backend considered better for these purposes?  I'm using LDBM
>since it's older (and I hoped more mature) than the bdb backend...

We're still shaking out BDB.  It works well enough for general
use, but it's not "stable" yet.

>Does anyone have any insight into why this might happen?  The instability
>is really causing havoc on my network, since I can't depend on whether
>changes get written.  Thanks in advance for any advice.