winapi encoding character-encoding hunspell

How do I switch from UTF-8 char * to a dynamic encoding with the Win32 API?


I'm currently working on a project that uses Hunspell inside Node. The goal is a cross-platform spell checker that handles encodings properly (node-spellchecker).

I have to support arbitrary dictionaries with different encodings. Most declare SET UTF-8 in the *.aff file, but others declare encodings such as SET ISO8859-1. I get UTF-8 from Node, but I need to convert it to the dictionary's encoding before checking a word, and then convert in the reverse direction to handle suggestions.

On Linux I can use iconv for the conversion, but I don't have that on Windows. I'd rather not require UTF-8 dictionaries, although that case already works.

Any suggestions or hints on where to start would be greatly appreciated. WideCharToMultiByte covers one step, but I couldn't find the MultiByteToMultiByte function I would expect to exist.

Things I Have

const char *from_encoding_name = "UTF-8"; // This can be swapped
const char *to_encoding_name = "ISO8859-1"; // This can be swapped
const char *word = /* möchtzn encoded in UTF-8 */;

Things I Want

const char *dictionaryWord = /* möchtzn encoded in ISO-8859-1 */;

Thank you.


Solution

  • I don't think a MultiByteToMultiByte analog exists in WinAPI. I'd use two calls: MultiByteToWideChar to convert from the source encoding to UTF-16, then WideCharToMultiByte to convert from UTF-16 to the target encoding.

    BTW, I looked at the source of the .NET method Encoding.Convert, and it also does the conversion through UTF-16.
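
A minimal sketch of the two-call approach, not a definitive implementation. `ConvertEncoding` and `CodePageFor` are hypothetical names, and the name-to-code-page table is an assumption that covers only the two encodings from the question (65001 for UTF-8, 28591 for ISO-8859-1). The non-Windows branch uses iconv, as the question describes for Linux, so the same function can be called on both platforms.

```cpp
#include <stdexcept>
#include <string>

#ifdef _WIN32
#include <windows.h>

// Hypothetical lookup from a Hunspell SET name to a Windows code page.
// Only two entries are shown; a real table would need one entry per
// encoding the dictionaries actually use.
static UINT CodePageFor(const std::string &name) {
  if (name == "UTF-8") return CP_UTF8;                            // 65001
  if (name == "ISO8859-1" || name == "ISO-8859-1") return 28591;  // Latin-1
  throw std::runtime_error("unsupported encoding: " + name);
}

std::string ConvertEncoding(const std::string &in, const std::string &from,
                            const std::string &to) {
  if (in.empty()) return std::string();
  const UINT fromCp = CodePageFor(from);
  const UINT toCp = CodePageFor(to);
  const int inLen = static_cast<int>(in.size());

  // Step 1: source encoding -> UTF-16. The first call only measures.
  int wideLen = MultiByteToWideChar(fromCp, 0, in.data(), inLen, nullptr, 0);
  if (wideLen <= 0) throw std::runtime_error("MultiByteToWideChar failed");
  std::wstring wide(wideLen, L'\0');
  MultiByteToWideChar(fromCp, 0, in.data(), inLen, &wide[0], wideLen);

  // Step 2: UTF-16 -> target encoding. The last two arguments stay null
  // because they must be null when the target is CP_UTF8.
  int outLen = WideCharToMultiByte(toCp, 0, wide.data(), wideLen,
                                   nullptr, 0, nullptr, nullptr);
  if (outLen <= 0) throw std::runtime_error("WideCharToMultiByte failed");
  std::string out(outLen, '\0');
  WideCharToMultiByte(toCp, 0, wide.data(), wideLen, &out[0], outLen,
                      nullptr, nullptr);
  return out;
}

#else
#include <iconv.h>

std::string ConvertEncoding(const std::string &in, const std::string &from,
                            const std::string &to) {
  iconv_t cd = iconv_open(to.c_str(), from.c_str());
  if (cd == (iconv_t)-1) throw std::runtime_error("iconv_open failed");

  // Assumption: four output bytes per input byte is enough for this sketch.
  std::string out(in.size() * 4 + 4, '\0');
  char *src = const_cast<char *>(in.data());
  char *dst = &out[0];
  size_t srcLeft = in.size();
  size_t dstLeft = out.size();
  size_t rc = iconv(cd, &src, &srcLeft, &dst, &dstLeft);
  iconv_close(cd);
  if (rc == (size_t)-1) throw std::runtime_error("iconv failed");

  out.resize(out.size() - dstLeft);  // Trim the unused tail of the buffer.
  return out;
}
#endif
```

For the word from the question, `ConvertEncoding(word, "UTF-8", "ISO-8859-1")` would produce the dictionary form, and swapping the last two arguments converts suggestions back. One thing to watch on the Windows side: with these arguments WideCharToMultiByte substitutes a default character for anything the target code page can't represent instead of failing, so the two branches may differ on such input.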