Re: LMDB and text encoding
- To: Timur Kristóf <email@example.com>, firstname.lastname@example.org
- Subject: Re: LMDB and text encoding
- From: Howard Chu <email@example.com>
- Date: Mon, 02 Feb 2015 02:58:57 +0000
- In-reply-to: <CAFF-SiUrJKGvG_z5vKgn13KX6oSbWQmLDj0VqGXMsuzJT5JBEg@mail.gmail.com>
- References: <CAFF-SiUrJKGvG_z5vKgn13KX6oSbWQmLDj0VqGXMsuzJT5JBEg@mail.gmail.com>
- User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:37.0) Gecko/20100101 Firefox/37.0 SeaMonkey/2.34a1
Timur Kristóf wrote:
> I've been talking to Howard about this and he suggested posting it to
> this mailing list. There are two things that I recently noticed about
> how LMDB works with various encodings and I think it's worth to
>
> 2. Path names
> Functions like mdb_env_open, mdb_env_get_path, mdb_env_copy and the
> like accept a char* for path names. This is fine on most Unixes, where
> a char* path is conventionally a UTF-8 string, but unfortunately, on
> Windows these functions call the ANSI variants of the Windows API
> functions, making it impossible to use Unicode path names with them.
> I think we should switch to the widechar APIs instead, but that would
> also mean changing the LMDB API to accept a wchar_t* parameter on
> Windows instead of char*.
> What do you guys think about all this?
I just had a look at how BDB handled this. As you can see, they used a
TO_TSTRING macro to convert incoming pathnames from UTF-8 to UTF-16
(and a FROM_TSTRING macro for the reverse, as well).
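
To make the conversion step concrete, here is a minimal, portable sketch of
a UTF-8 to UTF-16 conversion of the kind such a macro would perform. It is
not BDB's TO_TSTRING and not LMDB code; the helper name utf8_to_utf16 is
hypothetical. On Windows one would more likely call MultiByteToWideChar
with CP_UTF8 and hand the result to the wide-character (W) API variants.
The sketch rejects truncated sequences and surrogate code points, but does
not check for overlong encodings.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper (an assumption for illustration, not an LMDB or BDB
 * function): convert a NUL-terminated UTF-8 string into a newly allocated,
 * NUL-terminated UTF-16 string. Returns NULL on malformed input or on
 * allocation failure. The caller frees the result with free(). */
uint16_t *utf8_to_utf16(const char *src, size_t *out_len)
{
    const unsigned char *s = (const unsigned char *)src;
    size_t n = 0, cap = 0;
    uint16_t *dst;

    while (s[cap])
        cap++;                      /* byte length of the input */

    /* N bytes of UTF-8 never produce more than N UTF-16 code units. */
    dst = malloc((cap + 1) * sizeof *dst);
    if (!dst)
        return NULL;

    while (*s) {
        uint32_t cp;
        int extra;                  /* number of continuation bytes expected */

        if (*s < 0x80)                { cp = *s;        extra = 0; }
        else if ((*s & 0xE0) == 0xC0) { cp = *s & 0x1F; extra = 1; }
        else if ((*s & 0xF0) == 0xE0) { cp = *s & 0x0F; extra = 2; }
        else if ((*s & 0xF8) == 0xF0) { cp = *s & 0x07; extra = 3; }
        else goto fail;             /* stray continuation or invalid lead byte */
        s++;

        for (; extra > 0; extra--, s++) {
            /* A NUL here also fails this test, so truncated input is caught. */
            if ((*s & 0xC0) != 0x80)
                goto fail;
            cp = (cp << 6) | (*s & 0x3F);
        }

        if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            goto fail;              /* out of range or a lone surrogate */

        if (cp < 0x10000) {
            dst[n++] = (uint16_t)cp;
        } else {
            /* Encode as a surrogate pair. */
            cp -= 0x10000;
            dst[n++] = (uint16_t)(0xD800 | (cp >> 10));
            dst[n++] = (uint16_t)(0xDC00 | (cp & 0x3FF));
        }
    }

    dst[n] = 0;
    if (out_len)
        *out_len = n;
    return dst;

fail:
    free(dst);
    return NULL;
}
```

With a helper along these lines, the public API could keep taking char*
(interpreted as UTF-8) on every platform, and only the Windows code paths
would convert before calling the wide-character functions.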
-- Howard Chu
CTO, Symas Corp. http://www.symas.com
Director, Highland Sun http://highlandsun.com/hyc/
Chief Architect, OpenLDAP http://www.openldap.org/project/