Re: Chinese character access



Giovanni Baruzzi wrote:

> UTF-8 uses 1 Byte for all ASCII Characters, 2 Bytes sequences for accented
> characters and 3 (4?)

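No; not 4 for these characters.  3 bytes are sufficient to represent any
character in the range U+0000 - U+FFFF, which covers the commonly used
Chinese characters; longer sequences are needed only for code points
above U+FFFF.  UTF-8 is defined by RFC 2279 (since obsoleted by RFC 3629).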
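To illustrate the byte counts discussed here, the following is a minimal
sketch of the UTF-8 bit layout for code points in U+0000 - U+FFFF.  The
function name utf8_encode_bmp is hypothetical, chosen only for this
example; it is not taken from the RFC or from any library, and it does
not reject the surrogate range U+D800 - U+DFFF.

```python
def utf8_encode_bmp(cp: int) -> bytes:
    """Hand-encode one code point in U+0000..U+FFFF as UTF-8 (illustrative only)."""
    if cp < 0:
        raise ValueError("code point must be non-negative")
    if cp < 0x80:
        # 1 byte: 0xxxxxxx (ASCII)
        return bytes([cp])
    if cp < 0x800:
        # 2 bytes: 110xxxxx 10xxxxxx (e.g. accented Latin letters)
        return bytes([0xC0 | (cp >> 6), 0x80 | (cp & 0x3F)])
    if cp <= 0xFFFF:
        # 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx (e.g. common Chinese characters)
        return bytes([
            0xE0 | (cp >> 12),
            0x80 | ((cp >> 6) & 0x3F),
            0x80 | (cp & 0x3F),
        ])
    raise ValueError("this sketch only handles U+0000..U+FFFF")


# Sample characters: ASCII 'A', U+00E9 (e with acute), U+4E2D (a Chinese character).
for ch in ["A", "\u00e9", "\u4e2d"]:
    encoded = utf8_encode_bmp(ord(ch))
    # Cross-check the hand-rolled result against Python's built-in encoder.
    assert encoded == ch.encode("utf-8")
    print(f"U+{ord(ch):04X}: {len(encoded)} byte(s): {encoded.hex(' ')}")
```

The loop checks each hand-encoded result against Python's own
str.encode("utf-8"), so the sketch can be verified for any other
character in that range by adding it to the list.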

> characters sequences for more complex characters. I hope you will find the
> UTF-8 encoding of Chinese characters, I have no idea where it can be....

The Unicode Standard, published by the Unicode Consortium
<http://www.unicode.org/>.