
ch_malloc: same error, different OS



slapd crashes every night, leaving the 'ch_malloc of 8388608 bytes failed' 
error in syslog.

My server is running on an OpenBSD 3.3 machine with 256 MB of RAM.  I have 
about 100 clients, all Mac OS X machines, that reboot at about the same 
time every night, creating a lot of LDAP chatter, which seems to kill 
slapd.

This sounds an awful lot like the error other people were seeing on Sun 
boxes, and it looks like they solved it primarily by patching their OS, so 
those solutions won't work for me.  

The weird thing is, I don't seem to be running out of memory.  I ran
vmstat, reporting once every second, over the three minutes surrounding
the time slapd always dies, and the system usage barely rippled.  I
thought maybe resource limits were in the way, so I stuffed in a little
code to dump the results of getrlimit(), and the memory limits were the
same number as the amount of memory on my system, so effectively no
limits.  I also had it dump the results from getrusage() when ch_malloc
failed, and none of those numbers seemed particularly high.  I don't
really know what all the output means, so I could be wrong.  Maybe slapd
really is using up all that memory and my techniques for watching the
memory usage are flawed.
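
For anyone who wants to repeat the resource-limit check outside slapd, here
is a minimal standalone sketch in Python.  It is not the C code that was
added to slapd; the names LIMIT_NAMES, format_limit and dump_limits are
made up for illustration.  It prints the soft and hard value of each limit
listed in the syslog output, skipping any the platform does not define.

```python
import resource

# Limits named in the post; looked up by name so that a platform which
# lacks one of them (for example RLIMIT_RSS) is skipped instead of raising.
LIMIT_NAMES = ["CPU", "FSIZE", "DATA", "STACK", "CORE",
               "RSS", "MEMLOCK", "NPROC", "NOFILE"]


def format_limit(value):
    """Render one limit value, spelling out RLIM_INFINITY."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


def dump_limits():
    """Return one 'NAME: soft = X, hard = Y' line per available limit."""
    lines = []
    for name in LIMIT_NAMES:
        which = getattr(resource, "RLIMIT_" + name, None)
        if which is None:
            continue  # this limit is not defined on this platform
        soft, hard = resource.getrlimit(which)
        lines.append("%s: soft = %s, hard = %s"
                     % (name, format_limit(soft), format_limit(hard)))
    return lines


if __name__ == "__main__":
    for line in dump_limits():
        print(line)
```

Run it as the same user, and from the same startup environment, as slapd;
limits are inherited from the parent process, so a login shell may show
different values than a daemon started at boot.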

Here are some details.

OpenBSD 3.3 
cpu0: Intel Pentium III (Coppermine) ("GenuineIntel" 686-class) 927 MHz
real mem  = 267960320 (261680K)
avail mem = 242638848 (236952K)

openldap-2.1.22		(compiled myself)
bdb-4.1.25		(compiled myself)
cyrus-sasl-2.1.14	(compiled myself)
OpenSSL 0.9.7-beta3 30 Jul 2002 (part of OpenBSD)

Here's the gunk from my syslog.  The first chunk is just there for
context.  Then we see ch_malloc's error message.  The next bunch of stuff
is a report of the resource limits on the process according to
getrlimit().  The last entry is the results from getrusage().

Sep 10 22:05:00 celeste slapd[27433]: => bdb_dn2id_matched( "cn=users,dc=seattlecentral,dc=edu" ) 
Sep 10 22:05:00 celeste slapd[27433]: ====> bdb_cache_find_entry_dn2id("cn=users,dc=seattlecentral,dc=edu"): 4 (1 tries) 
Sep 10 22:05:00 celeste slapd[27433]: ====> bdb_cache_find_entry_id( 4 ) 
"cn=users,dc=seattlecentral,dc=edu" (found) (1 tries) 
Sep 10 22:05:00 celeste slapd[27433]: search_candidates: 
base="cn=users,dc=seattlecentral,dc=edu" (0x00000004) scope=2 
Sep 10 22:05:00 celeste slapd[27433]: ch_malloc of 8388608 bytes failed 
Sep 10 22:05:00 celeste slapd[27433]: CPU Limits:hard = -1,                     
soft = 2147483647
Sep 10 22:05:00 celeste slapd[27433]: FSIZE Limits:hard = -1,                   
soft = 2147483647
Sep 10 22:05:00 celeste slapd[27433]: DATA Limits:hard = 268435456,             
        soft = 0
Sep 10 22:05:00 celeste slapd[27433]: STACK Limits:hard = 33554432,             
        soft = 0
Sep 10 22:05:00 celeste slapd[27433]: CORE Limits:hard = -1,                    
soft = 2147483647
Sep 10 22:05:00 celeste slapd[27433]: RSS Limits:hard = 242585600,              
        soft = 0
Sep 10 22:05:00 celeste slapd[27433]: MEMLOCK Limits:hard = 242585600,          
                soft = 0
Sep 10 22:05:00 celeste slapd[27433]: NPROC Limits:hard = 128,                  
        soft = 0
Sep 10 22:05:00 celeste slapd[27433]: NOFILE Limits:hard = 1024,                
        soft = 0 
Sep 10 22:05:00 celeste slapd[27433]: maxrss = 0, ixrss =
0, idrss = 0, isrss = 0, minflt = 864, majflt = 205nswap = 0, inblock =
171, oublock = 30msgsnd = 5752 0, msgrcv = 0, nsingnals = 254nvcsw = 2715,
nivcsw = 23872
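
The getrusage() counters in that last entry can be collected the same way
from a standalone script.  This is only a sketch in Python, not the code
that produced the log above; FIELDS, format_rusage and dump_rusage are
hypothetical names.  It joins every counter with an explicit separator so
adjacent fields cannot run together.

```python
import resource

# Counters in the same order as the syslog entry above; these are the
# attribute names of the object returned by resource.getrusage().
FIELDS = ["ru_maxrss", "ru_ixrss", "ru_idrss", "ru_isrss",
          "ru_minflt", "ru_majflt", "ru_nswap", "ru_inblock",
          "ru_oublock", "ru_msgsnd", "ru_msgrcv", "ru_nsignals",
          "ru_nvcsw", "ru_nivcsw"]


def format_rusage(usage):
    """Return 'name = value' pairs separated by ', ', without the ru_ prefix."""
    return ", ".join("%s = %d" % (field[3:], getattr(usage, field))
                     for field in FIELDS)


def dump_rusage():
    """Format the resource usage of the calling process."""
    return format_rusage(resource.getrusage(resource.RUSAGE_SELF))


if __name__ == "__main__":
    print(dump_rusage())
```

The units of maxrss differ between operating systems, so check the local
getrusage(2) man page before comparing it against a memory limit.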

Thanks
-Dylan Martin, Unix Admin, Seattle Central Community College.