[Date Prev][Date Next]
Re: IDL bit vectors
- To: OpenLDAP Devel <firstname.lastname@example.org>
- Subject: Re: IDL bit vectors
- From: Howard Chu <email@example.com>
- Date: Sun, 16 Nov 2008 10:09:46 -0800
- In-reply-to: <48FBB578.firstname.lastname@example.org>
- References: <48FB0920.email@example.com> <firstname.lastname@example.org> <48FBB578.email@example.com>
- User-agent: Mozilla/5.0 (X11; U; Linux x86_64; rv:1.9.1b2pre) Gecko/20081115 SeaMonkey/2.0a1pre Firefox/3.0.3
Howard Chu wrote:
Realistically, at about 2Kbytes per entry, a 32 bit machine could directly
address at most ~2^20 entries (~1M) so an IDL that maxes out at 1.86M before
losing precision may be perfectly acceptable.
For 64 bit machines it seems 50 bits (1 petabyte) is the largest physical
address space currently available. That puts a practical upper bound, assuming
4Kbytes per entry, at around 2^37 entries (~128 billion) directly addressable.
It would take 2^34 bytes to handle that directly, 16GBytes, which again is
still impractical. So again, we need some kind of range representation in
addition to the actual bit vectors.
I think using <base,vector> with fixed-length vectors will be the way forward.
When an IDL reaches the maximum length, and a new <base,vector> needs to be
added, the entire IDL will be scaled by 2. On most 32 bit installs this will
probably never happen; on 64 bit installs it will happen infrequently.
It may be possible/desirable to simply scale one piece of the IDL at a time.
Anything that can coalesce two vectors into one will be sufficient to allow
the next insertion to proceed. The scale factor can be stored in the unused
bits of the <base>...
On 10/19/08, Howard Chu<firstname.lastname@example.org> wrote:
Seems like I keep poking at this idea again every couple years.
If we use a<base,vector> we can cover a much wider range without losing any
precision. For a 32 bit machine, 32 bits per word would cover 5 bits worth
entry IDs. So 27 bits would comprise the base, and the next 32 bits would
represent up to 32 entry IDs.
That allows us to represent up to 256 entryIDs in only 288 bits (instead of
the 16384 bits we currently need). That would allow us to track about 1.86M
entryIDs in the same space we currently use to track only 64K entryIDs.
It seems we would still need some kind of range representation to handle
with more than 1.86M entries.
-- Howard Chu
CTO, Symas Corp. http://www.symas.com
Director, Highland Sun http://highlandsun.com/hyc/
Chief Architect, OpenLDAP http://www.openldap.org/project/