1.2.11 + Linux 2.4.4 grief

	We installed OpenLDAP 1.x on our new Linux server with the
intention of using it as a common authentication source for our Linux
clients as well as a few Win2K boxes (via Samba TNG).  We're using
1.2.11 instead of 2.x because there are nice instructions available
for doing the Samba integration with 1.x that don't seem to exist for
2.x.  Anyway, this $%#@ing machine isn't stable; LDAP and/or NFS are
freezing up on us regularly.  The symptom is one of two things: either
a "you do not exist, go away" message from pam_ldap because an LDAP
query to the server failed, or file access on NFS-mounted home
directories just dies.  The server is still running and you can log in
to it via ssh as root (but not as any other user, obviously).  If I run
strace on slapd (the LDAP server) when it's "hung" I get this, which
seems a little weird:

# strace -d -p `cat /var/run/slapd.pid`
 [wait(0x137f) = 18741]
pid 18741 stopped, [SIGSTOP]

	I certainly haven't sent the process a STOP signal, so what
else could be doing this?  I would expect it to be waiting on a
network socket instead.  (Incidentally, running strace on nfsd causes
the shell to hang completely.)  There isn't anything else useful in
the logs.  I have switched off nscd, which I gather you shouldn't run
on an LDAP (or NIS) server.  Doing a kill -9 on slapd and restarting
LDAP fixes this (temporarily).
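
	For reference, here is a minimal sketch of how the state of
slapd could be checked without attaching a tracer at all.  Attaching
with ptrace itself delivers a SIGSTOP to the target, so the stop shown
above may be a side effect of strace rather than the cause of the
hang; this is worth ruling out.  The helper name proc_state is made up
for illustration, and the script assumes the usual Linux /proc layout
and the pid file path used above:

```shell
#!/bin/sh
# Report the kernel's view of a process without attaching a tracer.
# proc_state is a hypothetical helper name, not part of any package.
proc_state() {
    # The "State:" line of /proc/PID/status looks like "S (sleeping)"
    # or "T (stopped)"; print only the one-letter code.
    awk '/^State:/ { print $2 }' "/proc/$1/status"
}

pidfile=/var/run/slapd.pid    # same pid file as in the strace command
if [ -r "$pidfile" ]; then
    pid=$(cat "$pidfile")
    echo "slapd ($pid) state: $(proc_state "$pid")"
    # WCHAN names the kernel function the process is sleeping in, if any.
    ps -o pid,stat,wchan -p "$pid"
fi
```

	A state of T before strace is ever run would mean the process
really was stopped by something else; S or D together with the WCHAN
column would instead hint at what it is blocked on.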

	Does anyone have any clue what's going on here, or just any
ideas on how we should proceed?  I'm beginning to hate this machine a
*lot* (setting up LDAP was hugely painful and now it doesn't work).

	Configuration is: 900MHz Athlon, 256MB, Promise IDE RAID,
4x60GB storage (+ 20GB "boot" drive), dual Intel 10/100 Ethernet,
Linux 2.4.4 (+ recent reiserfs nfsd patches to cure filesystem
corruption problems).  The four 60GB drives are striped using software
RAID5 into one volume, with LVM on top of that and reiserfs inside the
logical volumes.  Base distribution is SuSE 7.1.  I have my doubts
about reiserfs in this scheme: is anyone running a similar setup
successfully?


James Macnicol