c++ utf-8 character-encoding iconv windows-1251

Is there a transliteration from UTF-8 to CP1251 in which one symbol is replaced by several symbols?


I use the iconv function with the //TRANSLIT option.

When transliterating from UTF-8 to CP1251, are there cases where one symbol is replaced by several symbols? Where can I find that information?


Solution

  • Yes, there are some, depending on the implementation and locale:

    $ echo '℀⇒½' | iconv -f UTF8 -t CP1251//TRANSLIT
    a/c=> 1/2 
    

    These are, respectively, U+2100 ACCOUNT OF transliterated as a/c, U+21D2 RIGHTWARDS DOUBLE ARROW transliterated as =>, and U+00BD VULGAR FRACTION ONE HALF transliterated as 1/2 (including spaces).

    I found these in the GNU libc source code, https://github.com/lattera/glibc/blob/master/locale/C-translit.h.in; different implementations may not transliterate these characters the same way if at all.
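
Since the question is tagged c++, here is a minimal sketch of the same conversion through the iconv() library call rather than the command-line tool. The function name utf8_to_cp1251_translit is made up for illustration, and the "//TRANSLIT" suffix is assumed to be supported (it is a GNU extension, so other iconv implementations may reject or ignore it). Because one input character can turn into several output bytes, the output is collected in chunks instead of assuming it fits in a buffer the size of the input.

```cpp
#include <iconv.h>

#include <cerrno>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: converts UTF-8 text to CP1251 and asks iconv to
// transliterate characters that have no CP1251 equivalent.
// Throws std::runtime_error if the conversion cannot be set up or fails.
std::string utf8_to_cp1251_translit(const std::string& utf8) {
    // Assumption: the iconv implementation understands "//TRANSLIT"
    // (GNU libc and GNU libiconv do; others may not).
    iconv_t cd = iconv_open("CP1251//TRANSLIT", "UTF-8");
    if (cd == reinterpret_cast<iconv_t>(-1)) {
        throw std::runtime_error("iconv_open failed");
    }

    std::string input = utf8;        // iconv() wants a non-const char**
    char* in_ptr = &input[0];
    std::size_t in_left = input.size();

    std::string result;
    char buffer[256];                // output chunk; refilled on E2BIG

    while (in_left > 0) {
        char* out_ptr = buffer;
        std::size_t out_left = sizeof buffer;

        const std::size_t rc =
            iconv(cd, &in_ptr, &in_left, &out_ptr, &out_left);
        const int err = errno;       // save before any other call

        // Keep whatever was produced, even if the call stopped early.
        result.append(buffer, sizeof buffer - out_left);

        // E2BIG only means the chunk is full, so loop again.
        // Anything else (EILSEQ, EINVAL) is a real failure.
        if (rc == static_cast<std::size_t>(-1) && err != E2BIG) {
            iconv_close(cd);
            throw std::runtime_error("iconv failed on the input");
        }
    }

    iconv_close(cd);
    return result;
}
```

Calling utf8_to_cp1251_translit("\xE2\x87\x92") passes U+21D2 through the same transliteration step as the shell example. What comes back depends on the implementation and locale, as noted above, so the result is worth checking on the target system rather than assuming.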