
4 Character Sets

--------------------------------------------------------------------------------------

The new approach ended up with four different character sets in Samba. We rely on being able to convert between them without loss of information.

Although four character sets may seem complex, this system has the huge advantage that it is immediately clear which character set every string is in. Internally, all char* strings are in the "Unix Charset" and all smb_ucs2_t* strings are in UCS2. The other two character sets are only accessed via isolated IO functions.
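
The typing convention above can be illustrated with a minimal sketch. It is not Samba's actual code: the typedef of smb_ucs2_t as a 16-bit unit, the choice of UTF-8 as the "Unix Charset", and the helper names unix_to_ucs2, ucs2_to_unix and roundtrips are all assumptions made for illustration. The point is only that the pointer type tells you the character set, and that a conversion pair can be checked for lossless round trips.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for Samba's type; assumed here to be one 16-bit UCS2 unit. */
typedef uint16_t smb_ucs2_t;

/* Hypothetical helper: char* (assumed UTF-8 "Unix Charset") -> UCS2.
 * Returns units written (excluding the terminator), or (size_t)-1 on
 * malformed input, a character outside UCS2, or too little space. */
size_t unix_to_ucs2(const char *src, smb_ucs2_t *dst, size_t dst_len)
{
    const unsigned char *s = (const unsigned char *)src;
    size_t n = 0;

    assert(src != NULL && dst != NULL);
    if (dst_len == 0)
        return (size_t)-1;

    while (*s) {
        uint32_t cp;
        if (s[0] < 0x80) {
            cp = s[0];
            s += 1;
        } else if ((s[0] & 0xE0) == 0xC0 && (s[1] & 0xC0) == 0x80) {
            cp = ((uint32_t)(s[0] & 0x1F) << 6) | (uint32_t)(s[1] & 0x3F);
            if (cp < 0x80)
                return (size_t)-1;      /* overlong form: not lossless */
            s += 2;
        } else if ((s[0] & 0xF0) == 0xE0 && (s[1] & 0xC0) == 0x80 &&
                   (s[2] & 0xC0) == 0x80) {
            cp = ((uint32_t)(s[0] & 0x0F) << 12) |
                 ((uint32_t)(s[1] & 0x3F) << 6) | (uint32_t)(s[2] & 0x3F);
            if (cp < 0x800 || (cp >= 0xD800 && cp <= 0xDFFF))
                return (size_t)-1;      /* overlong form or surrogate */
            s += 3;
        } else {
            return (size_t)-1;          /* malformed, or beyond UCS2 */
        }
        if (n + 1 >= dst_len)
            return (size_t)-1;          /* keep room for the terminator */
        dst[n++] = (smb_ucs2_t)cp;
    }
    dst[n] = 0;
    return n;
}

/* Hypothetical helper: UCS2 -> char* (assumed UTF-8 "Unix Charset").
 * Returns bytes written (excluding the terminator), or (size_t)-1 if
 * the output buffer is too small. */
size_t ucs2_to_unix(const smb_ucs2_t *src, char *dst, size_t dst_len)
{
    size_t n = 0;

    assert(src != NULL && dst != NULL);
    if (dst_len == 0)
        return (size_t)-1;

    for (; *src; src++) {
        uint32_t cp = *src;
        size_t need = cp < 0x80 ? 1 : cp < 0x800 ? 2 : 3;

        if (n + need >= dst_len)
            return (size_t)-1;          /* keep room for the terminator */
        if (need == 1) {
            dst[n++] = (char)cp;
        } else if (need == 2) {
            dst[n++] = (char)(0xC0 | (cp >> 6));
            dst[n++] = (char)(0x80 | (cp & 0x3F));
        } else {
            dst[n++] = (char)(0xE0 | (cp >> 12));
            dst[n++] = (char)(0x80 | ((cp >> 6) & 0x3F));
            dst[n++] = (char)(0x80 | (cp & 0x3F));
        }
    }
    dst[n] = '\0';
    return n;
}

/* Returns 1 if s survives char* -> smb_ucs2_t* -> char* unchanged. */
int roundtrips(const char *s)
{
    smb_ucs2_t wide[256];
    char back[768];

    if (unix_to_ucs2(s, wide, 256) == (size_t)-1)
        return 0;
    if (ucs2_to_unix(wide, back, sizeof(back)) == (size_t)-1)
        return 0;
    return strcmp(s, back) == 0;
}
```

Because the two string types are distinct C types, passing a char* where a smb_ucs2_t* is expected draws a compiler diagnostic, which is one way the "immediately clear" property can be enforced rather than merely documented. Under the assumptions above, a call such as roundtrips("caf\xC3\xA9") exercises the no-loss requirement for a single string.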

--------------------------------------------------------------------------------------

CIFS2001 Seattle
tridge@valinux.com