
Re: ordered indexing for integers

To: Hallvard B Furuseth <h.b.furuseth@usit.uio.no>
Subject: Re: ordered indexing for integers
From: Howard Chu <hyc@symas.com>
Date: Tue, 20 Nov 2007 15:54:13 -0800
Cc: OpenLDAP-devel@openldap.org
In-reply-to: <hbf.20071120vzux@bombur.uio.no>
References: <4742F5EE.5070100@symas.com> <C4750BA02ECE065B5884CDA0@[192.168.0.194]> <hbf.20071120u7oh@bombur.uio.no> <47431157.1020803@symas.com> <hbf.20071120vzux@bombur.uio.no>
User-agent: Mozilla/5.0 (X11; U; Linux i686; rv:1.9b2pre) Gecko/2007111122 SeaMonkey/2.0a1pre

Hallvard B Furuseth wrote:

Howard Chu writes:

I guess you're really describing something like floating-point
representation, except our exponents are always positive.


A mixture of floating point and DER, I suppose.  I've hardly looked
inside a floating-point number for decades.  Fixed length like you say
and "exponent" = "length it would have if represented exactly", as with
floating-point.  But with a length-of-length component and counted in
octets, not bits, as in DER.  And I don't see much point in doing the
base-10 multiplication or division needed for true floating point.
Unless floating-point hardware is sufficiently magical nowadays that
it's an advantage to make use of it.  I don't have a clue myself.

I don't know if anyone uses huge enough integers in LDAP that it's a
gain to avoid base-10 conversion, but I don't know that they don't
either.  Probably a bad idea now that I think of it, in case one later
moves to an LDAP/X.500 server with ASN.1 encoding, which will then have
to convert between binary and decimal anyway.


Right, better to stick with binary.
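
For illustration, here is a minimal Python sketch of the kind of fixed-size binary key discussed above: a length field followed by the leading octets of the value. The function name `ordered_key`, the 2-byte length field, and the default prefix of 4 octets are assumptions made for this sketch, not OpenLDAP's actual index format, and it only covers non-negative integers.

```python
def ordered_key(value: int, prefix_len: int = 4) -> bytes:
    """Build a fixed-size key: 2-byte big-endian length, then leading octets.

    Hypothetical helper for illustration only; non-negative integers only.
    """
    if value < 0:
        raise ValueError("this sketch covers non-negative integers only")
    # Minimal big-endian encoding; zero is encoded as a single 0x00 octet.
    nbytes = max(1, (value.bit_length() + 7) // 8)
    raw = value.to_bytes(nbytes, "big")
    # Keep the leading octets and zero-pad short values on the right,
    # so every key has the same size (2 + prefix_len octets).
    prefix = raw[:prefix_len].ljust(prefix_len, b"\x00")
    # The 2-byte length field limits values to 65535 octets in this sketch.
    return nbytes.to_bytes(2, "big") + prefix


values = [0, 1, 255, 256, 65535, 2**32, 2**40, 2**100]
keys = [ordered_key(v) for v in values]
# Plain byte-wise comparison of the keys follows numeric order.
assert keys == sorted(keys)
```

Because a minimal encoding has no leading zero octets, a longer value is always larger, and values of equal length compare by their leading octets. Values that share both length and prefix get the same key, so the ordering is preserved but not strict.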

Another point with huge integers is that one might store a group of them
with small differences, and want an equality index to tell them apart.
For that an index of <length, first digits, last digits> could be
useful.  Then the inequality index and inequality filter for a huge
value would need to generate different index keys though.  (The filter
would need to replace the "last digits" part of huge values with
0x00... or 0xff...)  I don't know if OpenLDAP's indexing supports that.

Schemes like that will always manage to go wrong. No matter what lengths you choose, you'll always have situations where the keys are in the wrong order. E.g., 0x99xx0088: <x,99,88> and 0x99xx0277: <x,99,77>. The first value is the smaller one, yet its key sorts after the second.
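
The mis-ordering can be checked with a small Python sketch. The helper name `split_key` and the choice of one leading and one trailing octet are assumptions for illustration, and the "xx" in the example is filled in with 0x00 here.

```python
def split_key(value: int, first: int = 1, last: int = 1) -> tuple:
    """Hypothetical <length, first octets, last octets> key for a non-negative int."""
    nbytes = max(1, (value.bit_length() + 7) // 8)
    raw = value.to_bytes(nbytes, "big")
    return (nbytes, raw[:first], raw[-last:])


a = 0x99000088  # "xx" taken as 0x00 for this example
b = 0x99000277

assert a < b                          # numeric order
assert split_key(a) > split_key(b)    # key order is reversed
```

Both values have the same length and the same leading octet, so the comparison falls through to the trailing octet, where 0x88 sorts after 0x77 even though the first value is smaller.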

In any case, for similar reasons it might be an idea to keep supporting
the old hash(decimal value) index format.

I tend to wonder if huge integers need to be accounted for here at all. I think it may be sufficient to just use <length, first N bytes> and probably N=4 is good enough.

-- 
  -- Howard Chu
  Chief Architect, Symas Corp.  http://www.symas.com
  Director, Highland Sun        http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP     http://www.openldap.org/project/

