
Re: OpenLDAP syncrepl message: AttributeDescription contains inappropriate characters




Enabling LogLevel=sync gave me enough information to find out what fails:

Dec 15 10:44:27 forward01.unix slapd[9470]: [ID 349886 local4.debug] syncrepl_entry: rid 405 uid=desk,o=xxxx.xxx,ou=mail

Dec 15 10:44:27 forward01.unix slapd[9470]: [ID 190661 local4.debug] <= entry_decode: slap_str2undef_ad( 20091027142315Z#000001#00#000000): AttributeDescription contains inappropriate characters


I used ldapsearch to dump that node, then ldapdelete and ldapadd -f ldif to put it back, in the hope that this would fix it. The entry on ldapmaster looked correct, though.

The client spent considerably longer syncing, but eventually still died on the same entry (which was also the very last entry in the sync).

It turns out that the ldapmaster is correct, but all the clients that syncrepl from the master are stuck on that entry:

# /usr/local/bin/ldapsearch -h forward01.unix  -b uid=desk,o=xxxx.xxx,ou=mail

# search result
search: 2
result: 80 Internal (implementation specific) error
text: internal error


If I run slapindex on one of the clients I get:


<= entry_decode: slap_str2undef_ad( 20091027142315Z#000001#00#000000): AttributeDescription contains inappropriate characters
bdb_tool_entry_reindex: could not locate id=289608


Similarly slapcat:

# slapcat -s uid=desk,o=xxxx.xxx,ou=mail

<= entry_decode: slap_str2undef_ad( 20091027142315Z#000001#00#000000): AttributeDescription contains inappropriate characters
# no data for entry id=00046b48
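
The slapindex and slapcat outputs above print the entry ID in different notations, so it is worth checking that they point at the same record. The sketch below assumes slapcat's `id=00046b48` is hexadecimal and slapindex's `id=289608` is decimal; the variable names are made up for illustration.

```python
# Entry IDs copied from the two error messages above.
slapindex_id = 289608                # printed by bdb_tool_entry_reindex (assumed decimal)
slapcat_id = int("00046b48", 16)     # printed by slapcat (assumed hexadecimal)

# If both tools refer to the same database entry, the values match.
same_entry = slapcat_id == slapindex_id
print(slapcat_id, same_entry)        # 289608 True
```

Under that assumption both tools are failing on one and the same entry, which matches the single DN seen in the sync log.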


So, for some reason, every client ended up with this bad record. Luckily, they are all slaves, so I can delete the entire database on each one and let it catch up. It is a shame I can't just delete the bad record on its own.

Lund


Jorgen Lundman wrote:

openldap-2.3.41
db-4.2.52.NC-PLUS_5_PATCHES
SunOS ldapmaster01.unix 5.10 Generic_127128-11


Our LDAP setup has been running rather well for the last 2 years or so,
but increasingly we have had to restart slapd more and more frequently.
Occasional core-dumps and "very large" process footprints as well.

I suspect that it is perhaps stuck on replication; this message shows up
on the clients fairly frequently:

Dec 14 11:30:49 forward01.unix slapd[10494]: [ID 190661 local4.debug] <=
entry_decode: slap_str2undef_ad( 20091027142315Z#000001#00#000000):
AttributeDescription contains inappropriate characters

Dec 14 11:30:49 forward01.unix slapd[10494]: [ID 818565 local4.debug]
null_callback: error code 0x50

Dec 14 11:30:49 forward01.unix slapd[10494]: [ID 776556 local4.debug]
syncrepl_entry: rid 405 be_add failed (80)

Dec 14 11:30:49 forward01.unix slapd[10494]: [ID 747041 local4.debug]
do_syncrepl: rid 405 retrying (9 retries left)


Which makes me think that it has been stuck since 20091027. Is there a
way for me to find out which entry 20091027142315Z#000001#00#000000
refers to, so I can just delete it? (or fix it).
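
The string in the log looks like a change sequence number (CSN) rather than an attribute name. As a rough sketch, assuming the usual `timestamp#count#sid#mod` layout with hexadecimal counters, it can be split into fields like this. The function name and dictionary keys are hypothetical, chosen only for this example.

```python
from datetime import datetime, timezone

def parse_csn(csn):
    """Split a CSN string into its four '#'-separated fields.

    Assumed layout: <UTC timestamp>#<change count>#<server id>#<mod number>,
    with the last three fields written in hexadecimal.
    """
    stamp, count, sid, mod = csn.strip().split("#")
    when = datetime.strptime(stamp, "%Y%m%d%H%M%SZ").replace(tzinfo=timezone.utc)
    return {
        "time": when,
        "count": int(count, 16),
        "sid": int(sid, 16),
        "mod": int(mod, 16),
    }

csn = parse_csn("20091027142315Z#000001#00#000000")
print(csn["time"].isoformat())  # 2009-10-27T14:23:15+00:00
```

This only decodes when the change was stamped; it does not say which entry carries that CSN. Since entryCSN is indexed for equality in the configuration below, an equality search on entryCSN against the master may be one way to find the entry.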



The pertinent lines in ldapmaster are:

lastmod on
checkpoint 128 15
directory /usr/local/var/openldap-data

index objectClass eq
index uid eq
index uidNumber eq
index mail eq
index mailAlternateAddress pres,eq
index deliveryMode eq
index accountStatus eq
index gecos eq
index radiusGroupName eq
index o pres,eq
index entryCSN,entryUUID eq
index gidNumber eq

overlay syncprov

syncprov-checkpoint 100 10
syncprov-sessionlog 100




Each slave has an identical configuration, except for the RID, which is based on
the last octet of the IP address:

lastmod on
checkpoint 128 15
directory /usr/local/var/openldap-data

index objectClass eq
index uid eq
index uidNumber eq
index mail eq
index mailAlternateAddress pres,eq
index deliveryMode eq
index accountStatus eq
index gecos eq
index radiusGroupName eq
index o pres,eq
index entryCSN,entryUUID eq
index gidNumber eq

syncrepl rid=405
provider=ldap://172.20.12.113
type=refreshAndPersist
interval=00:00:00:30
searchbase="ou=mail,dc=gmo,dc=jp"
filter="(objectClass=*)"
attrs="*"
scope=sub
schemachecking=off
updatedn="cn=admin,dc=gmo,dc=jp"
bindmethod=simple
binddn="cn=admin,dc=gmo,dc=jp"
credentials="*censored*"
retry="60 10 300 +"

updateref ldap://172.20.12.113
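
For reference, the `retry="60 10 300 +"` setting above reads, as far as I understand slapd.conf(5), as pairs of "interval in seconds, number of retries", with `+` meaning "retry indefinitely". The sketch below expands such a spec into the first few wait times; `expand_retry` and its `limit` parameter are hypothetical helpers for illustration, not part of OpenLDAP.

```python
def expand_retry(spec, limit=15):
    """Expand a syncrepl-style retry spec into a list of wait times (seconds).

    The spec is read as "<interval> <count>" pairs; a count of "+" is treated
    as "repeat this interval forever", so the output is capped at `limit`.
    """
    tokens = spec.split()
    waits = []
    for interval, count in zip(tokens[0::2], tokens[1::2]):
        if count == "+":
            # Fill the remaining slots with the open-ended interval.
            waits.extend([int(interval)] * max(0, limit - len(waits)))
            break
        waits.extend([int(interval)] * int(count))
    return waits[:limit]

schedule = expand_retry("60 10 300 +", limit=12)
print(schedule)
```

With this reading, the consumer retries every 60 seconds ten times and then every 300 seconds without giving up, which is consistent with the "9 retries left" line in the log above.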



In general, do these settings seem reasonable? Everything has appeared to be
correct, except for the 20091027 entry. What is the recommended
procedure? Reading the documentation makes me think I should also index
"ContextCSN"?

A recent ldapmaster core was triggered here:
#0 0xce7948da in t_delete () from /lib/libc.so.1
#1 0xce79417f in _malloc_unlocked () from /lib/libc.so.1
#2 0xce794058 in malloc () from /lib/libc.so.1
#3 0x08133175 in ber_memalloc_x ()
#4 0x08133487 in ber_dupbv_x ()
#5 0x0813352b in ber_dupbv ()
#6 0x0807c22e in attr_dup ()

... which I believe is replication-related, based on a post made earlier on
the list. It is most likely also fixed in later versions.


Thanks for any reply,

Jorgen Lundman



--
Jorgen Lundman       | <lundman@lundman.net>
Unix Administrator   | +81 (0)3 -5456-2687 ext 1017 (work)
Shibuya-ku, Tokyo    | +81 (0)90-5578-8500          (cell)
Japan                | +81 (0)3 -3375-1767          (home)