[Date Prev][Date Next]
[Chronological]
[Thread]
[Top]
large multivalued attributes
- To: OpenLDAP Devel <openldap-devel@OpenLDAP.org>
- Subject: large multivalued attributes
- From: Howard Chu <hyc@symas.com>
- Date: Wed, 16 Aug 2006 12:09:27 -0700
- User-agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9a1) Gecko/20060716 SeaMonkey/1.5a
This issue has been around for a long time, and I think we're going to
need to deal with it soon, as we see sites creating groups with
thousands of members.
There's a couple different aspects that need to be addressed.
1) finding an individual value in a large multivalued attribute is
inefficient(all linear searches).
2) updating a single indexed value basically requires regenerating the
index for the entire attribute.
The only obvious solution to 1) seems to be to always maintain
multivalued attributes in sorted order, so that we can use binary search
to locate values. This would make filter comparisons faster, as well as
address things like ITS#4600, and locating values to modify in modify
operations.
The most efficient fix for 2) requires us to use an index mechanism that
avoids collisions, so that we can safely remove index keys for
individual values without worrying about losing indexing for an
unrelated value. I don't see that actually happening, though. The next
alternative would be to use an index mechanism that allows duplicate
keys, so that we can accumulate them when collisions occur. Then
removing a single instance for one value doesn't affect the instances
for other values.
We could accomplish the latter goal by changing the index structure in
the DB. Currently we use a hash key and the data is a list of entryIDs
for entries that match the hash. We could change the list to be entryID+
counter, so we can keep track of these collisions.
Unfortunately keeping a counter like this would make slapindex a lot
harder, we'd somehow have to know not to increment the counter when
we're just re-indexing something that was already indexed. (Or we could
just empty out/truncate the entire index before processing anything.)
--
-- Howard Chu
Chief Architect, Symas Corp. http://www.symas.com
Director, Highland Sun http://highlandsun.com/hyc
OpenLDAP Core Team http://www.openldap.org/project/