
Re: ordered indexing for integers

Hallvard B Furuseth wrote:
Howard Chu writes:
I guess you're really describing something like floating-point
representation, except our exponents are always positive.

A mixture of floating point and DER, I suppose. I've hardly looked inside a floating-point number for decades. Fixed length, as you say, and "exponent" = "length it would have if represented exactly", as with floating point. But with a length-of-length component, and counted in octets rather than bits, as in DER. And I don't see much point in doing the base-10 multiplication or division needed for true floating point, unless floating-point hardware is sufficiently magical nowadays that it's an advantage to make use of it. I don't have a clue myself.

I don't know if anyone uses huge enough integers in LDAP that it's a
gain to avoid base-10 conversion, but I don't know that they don't either.
Probably a bad idea now that I think of it, in case one later moves to
an LDAP/X.500 server with ASN.1 encoding, which will then have to
convert between binary and decimal anyway.

Right, better to stick with binary.

Another point with huge integers is that one might store a group of them
with small differences, and want an equality index to tell them apart.
For that an index of <length, first digits, last digits> could be
useful.  Then the inequality index and inequality filter for a huge
value would need to generate different index keys though.  (The filter
would need to replace the "last digits" part of huge values with
0x00... or 0xff...)  I don't know if OpenLDAP's indexing supports that.

Schemes like that will always manage to go wrong. No matter what lengths you choose, you'll always have situations where the keys are in the wrong order. E.g., 0x99xx0088 gives <x,99,88> and 0x99xx0277 gives <x,99,77>, so the larger value ends up with the smaller key.
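
To make the ordering problem concrete, here is a minimal Python sketch of the <length, first digits, last digits> key applied to the two example values. The function name lossy_key, the choice of one leading and one trailing octet, and the 0xAA filler standing in for "xx" are assumptions made for illustration only; none of this is OpenLDAP code.

```python
def lossy_key(value: int, head: int = 1, tail: int = 1) -> tuple:
    """Build a <length, first octets, last octets> key for a non-negative integer.

    Hypothetical helper: head/tail widths are illustrative assumptions.
    """
    # Minimal big-endian encoding; zero is encoded as a single 0x00 octet.
    raw = value.to_bytes(max(1, (value.bit_length() + 7) // 8), "big")
    return (len(raw), raw[:head], raw[-tail:])


# The two values from the example, with 0xAA standing in for "xx".
a = 0x99AA0088
b = 0x99AA0277

key_a = lossy_key(a)
key_b = lossy_key(b)

values_ordered = a < b        # the integers themselves are in ascending order
keys_ordered = key_a < key_b  # but the keys compare the other way round
```

Here values_ordered is true while keys_ordered is false: both keys share the length and the leading octet, so the comparison falls through to the trailing octets, which say nothing about magnitude.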

In any case, for similar reasons it might be an idea to keep supporting
the old hash(decimal value) index format.

I tend to wonder if huge integers need to be accounted for here at all. I think it may be sufficient to just use <length, first N bytes> and probably N=4 is good enough.
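
As a rough illustration of the <length, first N bytes> idea with N=4, here is a Python sketch. Several things are assumptions rather than anything settled in this thread: the name index_key, the two-octet length field, the zero padding of short values, and the restriction to non-negative integers (handling the sign is left out entirely).

```python
N = 4  # number of leading octets kept in the key (the N=4 suggested above)


def index_key(value: int, n: int = N) -> bytes:
    """Return a fixed-size key: 2-octet length followed by the first n octets.

    Hypothetical sketch; only non-negative integers are handled.
    """
    if value < 0:
        raise ValueError("negative integers are outside this sketch")
    # Minimal big-endian encoding; zero is encoded as a single 0x00 octet.
    raw = value.to_bytes(max(1, (value.bit_length() + 7) // 8), "big")
    # Values shorter than n octets are zero-padded; the length field
    # already separates them from longer values.
    prefix = raw[:n].ljust(n, b"\x00")
    # Assumed 2-octet length field; raises OverflowError past 65535 octets.
    return len(raw).to_bytes(2, "big") + prefix


samples = [0, 1, 255, 256, 0x99AA0088, 0x99AA0277, 2**64 + 1, 2**64 + 2]
keys = [index_key(v) for v in samples]

# Comparing keys as byte strings never contradicts the integer order.
order_preserved = all(k1 <= k2 for k1, k2 in zip(keys, keys[1:]))
```

Because the minimal encoding has no leading zero octets, a longer value is always larger, and values of equal length compare by their leading octets, so the key order never contradicts the integer order. The cost is that values sharing both length and first N octets (such as 2**64 + 1 and 2**64 + 2 here) map to the same key, so an index built this way can only narrow the candidates, and the actual values still have to be compared.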
-- Howard Chu
Chief Architect, Symas Corp. http://www.symas.com
Director, Highland Sun http://highlandsun.com/hyc/
Chief Architect, OpenLDAP http://www.openldap.org/project/