
RE: String conversions UTF8 <-> ISO-8859-1



Patrick Dreyer, SY-UCP writes:

>> I think we actually agree that we should provide some "help" 
>> to application developers who need to do "conversions".  I 
>> think we just disagree over the choice of the API mechanism 
>> to use to provide that "help".
> 
> In my opinion the provided "help" has to be template based to be type
> safe:

Templates?  This is C, not C++.  I think the best we could do is to have
an API which handles user data as void* pointers.  Or, if that doesn't
fit the model of the API we land on, it can take char* pointers as
before, but it must be careful never to access the user data directly,
only via functions provided by the application.  That way the
application can cast pointers to whatever data structures it uses to
char* and pass them to the API.
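
A minimal sketch of that idea, assuming a callback table supplied by the
application.  None of these names (user_ops, lib_encode_value,
latin1_str) exist in the OpenLDAP API; they are hypothetical and only
show the library treating user data as opaque.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback table supplied by the application. */
typedef struct {
    /* Write the UTF-8 form of *user into out (at most outlen bytes,
     * no terminator).  Returns the byte count, or -1 on overflow. */
    int (*to_utf8)(const void *user, char *out, size_t outlen);
} user_ops;

/* Hypothetical library entry point: it never looks inside *user,
 * it only hands the pointer back to the application's function. */
int lib_encode_value(const user_ops *ops, const void *user,
                     char *out, size_t outlen)
{
    assert(ops != NULL && ops->to_utf8 != NULL);
    return ops->to_utf8(user, out, outlen);
}

/* Application side: its own string type, here length-counted
 * ISO-8859-1 bytes. */
typedef struct {
    const unsigned char *bytes;
    size_t len;
} latin1_str;

/* ISO-8859-1 to UTF-8: bytes below 0x80 are copied, all others
 * become a two-byte sequence. */
int latin1_to_utf8(const void *user, char *out, size_t outlen)
{
    const latin1_str *s = user;
    size_t n = 0;

    for (size_t i = 0; i < s->len; i++) {
        unsigned char c = s->bytes[i];
        if (c < 0x80) {
            if (n + 1 > outlen) return -1;
            out[n++] = (char)c;
        } else {
            if (n + 2 > outlen) return -1;
            out[n++] = (char)(0xC0 | (c >> 6));
            out[n++] = (char)(0x80 | (c & 0x3F));
        }
    }
    return (int)n;
}
```

The application would fill a user_ops with latin1_to_utf8 and pass a
pointer to its own latin1_str as the user data.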

> Taking the case of UTF-8 to ASCII-BSTR, we not only have to do the
> conversion, we also have a different type to return. The OpenLDAP library
> API now works with const char* or char*, which makes sense, but BSTRs
> are wchar_t* in the end. Additionally, memory for a BSTR has to be
> (de)allocated with special OS functions.

OK... that means the API must provide functions to free() user data if
it works by replacing user data with malloced strings.  And it must be
careful never to retain a structure (e.g. an array of strings) where the
first part is converted data and the last part (maybe after an error
occurred) is user data.
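
As a rough sketch of the "all or nothing" rule, assuming the application
supplies both a convert function and a matching release function: the
names convert_all, free_all, ascii_only_dup and ascii_release are
hypothetical, and the converter is a stand-in that merely rejects
non-ASCII input.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

typedef char *(*convert_fn)(const char *in);   /* NULL on error */
typedef void (*release_fn)(char *converted);

/* Convert vals[0..n-1] into a fresh NULL-terminated array.  On any
 * failure, release what was converted so far and return NULL, so the
 * caller never sees a half-converted array.  vals is left untouched. */
char **convert_all(const char *const *vals, size_t n,
                   convert_fn conv, release_fn release)
{
    assert(conv != NULL && release != NULL);
    char **out = calloc(n + 1, sizeof *out);
    if (out == NULL) return NULL;

    for (size_t i = 0; i < n; i++) {
        out[i] = conv(vals[i]);
        if (out[i] == NULL) {
            while (i > 0) release(out[--i]);   /* roll back */
            free(out);
            return NULL;
        }
    }
    return out;
}

/* Free a result of convert_all with the application's own release
 * function (a BSTR, for instance, would need an OS-specific one). */
void free_all(char **out, release_fn release)
{
    if (out == NULL) return;
    for (size_t i = 0; out[i] != NULL; i++) release(out[i]);
    free(out);
}

/* Demo application functions; the counter tracks live conversions. */
int live_conversions = 0;

char *ascii_only_dup(const char *in)
{
    size_t len = strlen(in);
    for (size_t i = 0; i < len; i++)
        if ((unsigned char)in[i] >= 0x80) return NULL;
    char *copy = malloc(len + 1);
    if (copy == NULL) return NULL;
    memcpy(copy, in, len + 1);
    live_conversions++;
    return copy;
}

void ascii_release(char *p)
{
    free(p);
    live_conversions--;
}
```

Building the converted array separately and only then handing it over is
one way to guarantee the mixed state never becomes visible.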

> Thus, the best way to solve the conversion problem is the one you
> mentioned with having a second API doing all the conversion stuff and
> simply calling the LDAP API.

Unfortunately, I don't think such an API will be much help at all,
compared to just demanding that the application converts all the data
'by hand'.

>> If we go with callbacks, un/repacking of BER is exactly what 
>> we'll be doing.  If we just provide helpers, the application 
>> can do conversion where they normally do value extraction and 
>> hence avoid repacking.
>
> My opinion too. Again, with callbacks we are not able to support
> conversions like UTF-8 <-> ASCII-BSTR or UTF-8 <-> Unicode-ASCII,
> because, for example, the type returned by ldap_get_values() is char**.
> Thus, callbacks do not solve the problem at all; they just cause more
> string copies.

Just because the return type is char** doesn't mean that it has to point
to char* data.  It could point to something else, and the application could
cast the char* pointer to its own type.  I agree that's ugly, but I still
think the advantage of callbacks outweighs the disadvantage, since they
require so few modifications to the application in most cases.
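
To illustrate the cast, here is a sketch with a hypothetical decode
callback.  lib_get_values is only a stand-in for a routine with a char**
return type, not the real ldap_get_values(), and widen_ascii is a toy
converter that is only correct for ASCII input.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical callback: receives UTF-8 from the library and returns
 * an application object disguised as char*. */
typedef char *(*decode_cb)(const char *utf8);

/* Stand-in for a library routine returning values as char**.  Each
 * element comes from the callback and is never dereferenced here.
 * Handling of a failed callback is omitted for brevity. */
char **lib_get_values(const char *const *wire, size_t n, decode_cb cb)
{
    assert(cb != NULL);
    char **vals = calloc(n + 1, sizeof *vals);   /* NULL-terminated */
    if (vals == NULL) return NULL;
    for (size_t i = 0; i < n; i++)
        vals[i] = cb(wire[i]);
    return vals;
}

/* Application callback: widen each byte to wchar_t.  A real one would
 * decode UTF-8 properly; this toy version assumes ASCII input. */
char *widen_ascii(const char *utf8)
{
    size_t len = strlen(utf8);
    wchar_t *w = malloc((len + 1) * sizeof *w);
    if (w == NULL) return NULL;
    for (size_t i = 0; i <= len; i++)            /* includes terminator */
        w[i] = (wchar_t)(unsigned char)utf8[i];
    return (char *)w;     /* malloc'ed, so suitably aligned for wchar_t */
}
```

The application then reads a value with (const wchar_t *)vals[i] and
frees each element with whatever matches its own allocator.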

-- 
Hallvard