
RE: Master replication server reliability



Tony,
	Thanks for the help.  We're using HP ProLiant DL380s with 2x 2.4 GHz P4 Xeon CPUs, 2 GB RAM, and 2x 36 GB 10K SCSI disks in RAID 1 on an HP hardware RAID card (thus one logical drive that the OS breaks up into n logical mount points).  I'd think this would be sufficient for what we do.

I'm in the process of building db 4.1.52 with the patches and I'm rebuilding 2.1.25 to run against that for comparison.  I'm also building OpenLDAP 2.2.15, as suggested by you and Quanah, in the hope of testing that today.

Please let me know if anything about my HW setup concerns anyone.  I realize that one logical drive (2-disk RAID 1) doesn't allow for putting the DB logs on another spindle, but we do so few writes that I can't see this being problematic, and our caching for both OpenLDAP and DB permits a 100% cache hit rate according to db_stat, so our disk dependence should be minimal.  My DB_CONFIG contains only the following:

#Set cachesize to 32MB
set_cachesize 0 33554432 0
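
For anyone checking the arithmetic, here is a minimal Python sketch of how the three set_cachesize arguments relate to a total cache size.  The helper name cachesize_directive is hypothetical (it is not part of Berkeley DB); the sketch only assumes the "gbytes bytes ncache" argument order used in the line above.

```python
# Hypothetical helper, for illustration only: splits a total cache size
# into the whole-gigabyte and remaining-byte fields that a DB_CONFIG
# set_cachesize line expects, and formats the directive.
GIB = 1 << 30  # bytes in one gigabyte (2**30)


def cachesize_directive(total_bytes, ncache=0):
    """Return a 'set_cachesize <gbytes> <bytes> <ncache>' line."""
    if total_bytes < 0 or ncache < 0:
        raise ValueError("sizes must be non-negative")
    gbytes, remainder = divmod(total_bytes, GIB)
    return f"set_cachesize {gbytes} {remainder} {ncache}"


if __name__ == "__main__":
    # 32 MB expressed in bytes, as in the DB_CONFIG above
    print(cachesize_directive(32 * 1024 * 1024))
```

A size of one gigabyte or more moves into the first field, so 1.5 GB would come out as "set_cachesize 1 536870912 0".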

Thanks again!

Jamey

-----Original Message-----
From: owner-openldap-software@OpenLDAP.org
[mailto:owner-openldap-software@OpenLDAP.org]On Behalf Of Tony Earnshaw
Sent: Thursday, August 12, 2004 10:07 AM
To: Openldap list
Subject: Re: Master replication server reliability


Thu, 12.08.2004 at 02:43, James Courtney wrote:

> I realized that it would be better to have two distinct emails with
> descriptive subjects for my two issues...
> 
> We moved to a replicated system yesterday (1 master and 2 slaves) and
> our mail server components authenticate against the two slaves and our
> updates are done against the master which receives no other traffic. 
> The slave servers have been up since the change but the master server
> has been going down from time to time.  I realized that my CGI was
> very good about unbinding but wasn't calling disconnect on the LDAP
> object when done so I've added this logic.  Could the failure to
> disconnect rather than just unbind cause such problems?  The CGI would
> also exit after running each time so the client endpoint to the LDAP
> server would go away with each run of the CGI.  I'd think that not
> disconnecting explicitly before exiting in the code wouldn't be ideal
> but wouldn't cause the slapd process great difficulty.  Would this
> sort of thing be likely to cause severe slapd problems?
> 
> Also, sometimes slurpd goes down and when it does I end up with a 2 GB
> (exactly) slurpd.replog file containing a great deal of redundant
> update data.  Has anyone else seen this?  Slurpd seems to consume 20%+
> of the CPU on a dual 2.4 GHz P4 system with 2 GB RAM while things
> are running normally.  This seems excessive.  Is this symptomatic of
> anything?
> 
> Again, my system is Red Hat Enterprise Linux 3 ES with OpenLDAP 2.1.25
> (back-bdb), BDB 4.2.52, and OpenSSL 0.9.7d.

Ok, this would be your original posting. You haven't mentioned your
hardware make or configuration. Update your OL version to 2.2.15 -  the
rest should be ok *if* your back-end DB is configured correctly
(DB_CONFIG). RHEL3 has no problem with any of this and can bear
extremely heavy loads, but *only* on good hardware configurations
(well-designed server hardware, *plenty* of RAM, high disk I/O
capability, effective cooling etc) and correctly-configured db backends.
And there is a *world* of difference between the OL version you're using
and OL 2.2.15.

--Tonni

-- 
My resume - CV - says that I speak a few languages fluently. I
have academic qualifications in all of them. It doesn't mention
"catese" or "dogese". I speak fluent catese and reasonable dogese,
but I've never taken any exams in them, never needed to.

mail: tonye@billy.demon.nl
http://www.billy.demon.nl