
Fwd: LMDB and text encoding



On Mon, Feb 2, 2015 at 3:37 AM, Howard Chu <hyc@symas.com> wrote:
> Hallvard Breien Furuseth wrote:
>>
>> On 02/02/15 00:40, Howard Chu wrote:
>>>
>>> It looks OK to me. If no one raises any concerns, I'll commit it in a
>>> few hours.
>>
>>
>> Some sudden last thoughts:
>>
>> mdb_dump.c also has a check, memchr(key.mv_data, '\0', key.mv_size),
>> to exclude non-databases, which is no longer valid.
>
>
> Good point. As Timur's patch comment notes, we probably need an API call "is
> valid DB" now.
>
>> Database names with \0 in them can no longer be spelled as strings;
>> everything which gets DB names from the database must use binary blobs.
>> Including mdb_load and mdb_dump; I notice mdb_load uses
>> strdup() for the "database=" name.  Come to think of it, I have no
>> idea if the dump format supports DB names with \0 in them.
>
>
> No, it doesn't. It's the BDB format, and BDB only accepted C strings.

(Just noticed that I hit "reply" instead of "reply all". Sorry. Now
reposting to the mailing list.)

I think this is an acceptable limitation of mdb_dump and mdb_load. It
is not the only thing they don't support: they also don't work with
user-defined comparison functions. That said, I can think of ways to
solve it.

For example, we could add a command line option that would make
mdb_dump output DB names as strings of hexadecimal digits, and
mdb_load interpret them as such.
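
To make the idea concrete, here is a minimal sketch of the encoding in
C. The helper names name_to_hex() and hex_to_name() are hypothetical;
they are not part of LMDB, mdb_dump or mdb_load, and no such command
line option exists yet. The point is only that a name is carried as a
(pointer, length) pair, so an embedded \0 survives the round trip.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not an LMDB API): encode len raw bytes as a
 * NUL-terminated string of lowercase hex digits, two per byte.
 * Returns a malloc'd string, or NULL if allocation fails. */
char *name_to_hex(const void *data, size_t len)
{
    static const char digits[] = "0123456789abcdef";
    const unsigned char *in = data;
    char *out = malloc(len * 2 + 1);
    size_t i;

    if (out == NULL)
        return NULL;
    for (i = 0; i < len; i++) {
        out[2 * i]     = digits[in[i] >> 4];
        out[2 * i + 1] = digits[in[i] & 0x0f];
    }
    out[len * 2] = '\0';
    return out;
}

/* Value of one hex digit, or -1 if c is not a hex digit. */
static int hex_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Hypothetical helper (not an LMDB API): decode a hex string back to
 * raw bytes. The byte count is stored in *len_out because the result
 * may contain \0 and so cannot be measured with strlen().
 * Returns a malloc'd buffer, or NULL on malformed input or
 * allocation failure. */
unsigned char *hex_to_name(const char *hex, size_t *len_out)
{
    size_t hexlen = strlen(hex);
    unsigned char *out;
    size_t i;

    if (hexlen % 2 != 0)
        return NULL;
    out = malloc(hexlen / 2 + 1);   /* +1 avoids malloc(0) */
    if (out == NULL)
        return NULL;
    for (i = 0; i < hexlen / 2; i++) {
        int hi = hex_value(hex[2 * i]);
        int lo = hex_value(hex[2 * i + 1]);
        if (hi < 0 || lo < 0) {
            free(out);
            return NULL;
        }
        out[i] = (unsigned char)((hi << 4) | lo);
    }
    *len_out = hexlen / 2;
    return out;
}
```

With this encoding, the three-byte name 'a', '\0', 'b' would be dumped
as 610062, and decoding that string gives back the same three bytes
together with their length. By contrast, strdup() on the raw name
would stop at the first \0 and keep only "a".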