
Re: Weird problem -- seeking insight



On Fri, 16 Jan 2004 at 2:13pm, Frank Swasey wrote:

> I have just upgraded this one R/O to openldap 2.1.25 and SleepyCat
> 4.2.52 patch 1 this afternoon and will see if it still fails tomorrow
> morning.

Over the weekend we didn't see a whole lot of trouble on the server
running 2.1.25 and BDB 4.2.52.  However, another of the R/O's started
acting up so we upgraded that one as well.

This morning, both of the R/O's that are used heavily for mail delivery
went down the tubes.

I've been searching source code and slapd logs all day and have
determined that left to their own devices they do eventually get through
the DEADLOCKS and move on.  However, the mail cluster is a smoldering
pile of rubble by the time that happens.  I have also determined that
the vast majority of the DEADLOCK messages occur while slapd is updating
the memberUid index for additions and deletions to our most massive
group (it currently has 39,605 memberUid attribute values in that one
entry).

Do I remember correctly that when you add/delete values on an attribute,
the entire index entry for that attribute is deleted and recreated?
If so, with an entry that has 40,000 values, this is obviously going to
take some time -- and since this is also one of the attributes that the
mail servers are doing searches against frequently -- I can easily see
how a deadlock would happen repeatedly until it finally had a chance to
completely wipe this out and recreate it.

I begin to see why Quanah does a complete nuke and load process every
night (like I used to :().

So, the questions...
- Am I correct about the complete deletion and recreation of the index
entries for that entry?
- I think it's fairly obvious that a single group with 40000 members is
nasty to deal with -- anyone have a suggestion of how to efficiently
add/delete members in this group?
- Anyone know whether nss_ldap will get upset if I just leave that group
in there but never touch its members (similar to having an /etc/group
file that doesn't list everyone in the group)?
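
To make the second question concrete, here is a minimal sketch (in
Python) of the kind of incremental update in question: it compares the
current and desired member lists and emits one LDIF modify record that
adds and deletes only the memberUid values that changed, rather than
replacing the whole attribute. The function name and the group DN are
hypothetical, and it assumes the uids are plain ASCII with nothing that
needs base64 encoding in LDIF. Whether slapd still rewrites the whole
index for such a change is exactly the open question above.

```python
def memberuid_delta_ldif(group_dn, current, desired):
    """Build one LDIF modify record containing only changed memberUid values.

    Hypothetical helper for illustration. Assumes uids are simple ASCII
    strings that are safe to write into LDIF without base64 encoding.
    """
    current, desired = set(current), set(desired)
    to_delete = sorted(current - desired)
    to_add = sorted(desired - current)
    if not to_delete and not to_add:
        return ""  # nothing changed, so no record is needed

    lines = [f"dn: {group_dn}", "changetype: modify"]
    for op, uids in (("delete", to_delete), ("add", to_add)):
        if uids:
            lines.append(f"{op}: memberUid")
            lines.extend(f"memberUid: {uid}" for uid in uids)
            lines.append("-")  # each modification in the record ends with "-"
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # The DN below is a made-up example, not the real group.
    print(memberuid_delta_ldif(
        "cn=everyone,ou=Group,dc=example,dc=edu",
        ["alice", "bob"],
        ["bob", "carol"],
    ), end="")
```

The output can be fed to ldapmodify as a single operation, so all of the
adds and deletes for the group land in one write instead of many.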

Thanks,

-- 
Frank Swasey                    | http://www.uvm.edu/~fcs
Systems Programmer              | Always remember: You are UNIQUE,
University of Vermont           |    just like everyone else.
                    === God Bless Us All ===