[Date Prev][Date Next] [Chronological] [Thread] [Top]

Re: IDL bit vectors

To: OpenLDAP Devel <openldap-devel@openldap.org>
Subject: Re: IDL bit vectors
From: Howard Chu <hyc@symas.com>
Date: Sun, 16 Nov 2008 10:09:46 -0800
In-reply-to: <48FBB578.2050306@symas.com>
References: <48FB0920.6080205@symas.com> <13ca621c0810191434o785461a6kcc22dbeac002ac24@mail.gmail.com> <48FBB578.2050306@symas.com>
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; rv:1.9.1b2pre) Gecko/20081115 SeaMonkey/2.0a1pre Firefox/3.0.3

Howard Chu wrote:

Realistically, at about 2Kbytes per entry, a 32 bit machine could directly
address at most ~2^20 entries (~1M) so an IDL that maxes out at 1.86M before
losing precision may be perfectly acceptable.

For 64 bit machines it seems 50 bits (1 petabyte) is the largest physical
address space currently available. That puts a practical upper bound, assuming
4Kbytes per entry, at around 2^37 entries (~128 billion) directly addressable.
It would take 2^34 bytes to handle that directly, 16GBytes, which again is
still impractical. So again, we need some kind of range representation in
addition to the actual bit vectors.

I think using <base,vector> with fixed-length vectors will be the way forward. When an IDL reaches the maximum length, and a new <base,vector> needs to be added, the entire IDL will be scaled by 2. On most 32 bit installs this will probably never happen; on 64 bit installs it will happen infrequently.

It may be possible/desirable to simply scale one piece of the IDL at a time. Anything that can coalesce two vectors into one will be sufficient to allow the next insertion to proceed. The scale factor can be stored in the unused bits of the <base>...

Gavin.

On 10/19/08, Howard Chu<hyc@symas.com> wrote:

Seems like I keep poking at this idea again every couple years.

If we use a<base,vector>   we can cover a much wider range without losing any
precision. For a 32 bit machine, 32 bits per word would cover 5 bits worth
of
entry IDs. So 27 bits would comprise the base, and the next 32 bits would
represent up to 32 entry IDs.

That allows us to represent up to 256 entryIDs in only 288 bits (instead of
the 16384 bits we currently need). That would allow us to track about 1.86M
entryIDs in the same space we currently use to track only 64K entryIDs.

It seems we would still need some kind of range representation to handle
IDLs
with more than 1.86M entries.

--
  -- Howard Chu
  CTO, Symas Corp.           http://www.symas.com
  Director, Highland Sun     http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/

Prev by Date: Re: "widely supported scripting language" for make test (was ITS#5809)
Next by Date: Re: commit: ldap/servers/slapd/overlays dynlist.c
Index(es):
- Chronological
- Thread