
Converting UTF-8 Greek letters to ASCII using ASCII//IGNORE//TRANSLIT


I am running this in a Linux terminal:

echo -n "ΒΓΔΕΖΗΘΙΚΛΜΝΞΠΡΣΤΥΦΧΨΩ" | iconv -f utf-8 -t ASCII//IGNORE//TRANSLIT

but it fails with an error: illegal sequence.

Expected result: something like Α=A, Γ=G, Δ=D, Ε=E, Λ=L, etc.

PS: this is similar to the PHP problem here, but utf8 is not accepted on the terminal.


Solution

  • First of all, the command should be echo -n "ΒΓΔΕΖΗΘΙΚΛΜΝΞΠΡΣΤΥΦΧΨΩ" | iconv -f UTF-8 -t ASCII//TRANSLIT. Note the capitalized UTF-8 and the absence of IGNORE.

    Second, glibc was missing a Greek transliteration table until recently (iconv is part of glibc). [1]

    If that bug still affects you, the output should be ????????????? rather than an error.

    In any case, it is fixed in glibc 2.31, which was released on 2020-02-01. [2]

    If you really need this before the new glibc makes it into your distribution, you could patch glibc/locale/C-translit.h.in and recompile.

    [1] https://sourceware.org/bugzilla/show_bug.cgi?id=12031

    [2] https://sourceware.org/legacy-ml/libc-announce/2020/msg00001.html
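To tell the two outcomes described above apart (real transliteration versus a run of question marks), something like the following sketch might help. The helper names to_ascii and only_question_marks are made up for illustration, and the short sample string ΒΓΔ is just an assumption. What the script prints depends on the glibc version installed, so no particular output is promised here.

```shell
#!/bin/sh
# Hypothetical helper: transliterate UTF-8 on stdin to ASCII,
# using the corrected invocation (capital UTF-8, no IGNORE).
to_ascii() {
    iconv -f UTF-8 -t ASCII//TRANSLIT
}

# Hypothetical helper: succeed only if "$1" is non-empty
# and consists entirely of '?' characters.
only_question_marks() {
    case $1 in
        ''|*[!?]*) return 1 ;;
        *) return 0 ;;
    esac
}

# iconv may exit non-zero when it has to substitute '?', so ignore its status.
result=$(printf '%s' 'ΒΓΔ' | to_ascii 2>/dev/null || true)

if [ -z "$result" ]; then
    echo "iconv produced no output (conversion failed or iconv is missing)"
elif only_question_marks "$result"; then
    echo "no Greek transliteration table available: got $result"
else
    echo "transliterated: $result"
fi
```

printf '%s' is used here instead of echo -n only because echo -n is not portable across all shells; on bash either should behave the same.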


    NOTES

    • Ubuntu and many other Linux distributions use glibc as their C library (libc)
    • on Ubuntu you can check the version with the command ldd --version or dpkg -l libc6
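As a rough sketch, the version check can also be scripted and compared against 2.31, the release that carries the fix. This uses getconf GNU_LIBC_VERSION, which on glibc systems prints a line such as "glibc 2.31"; that command is an alternative to the two mentioned above, not something the answer itself uses. The function name has_greek_translit is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the glibc version in "$1" (e.g. "2.31")
# is at least 2.31, the release that added the Greek transliteration table.
has_greek_translit() {
    major=${1%%.*}      # text before the first dot
    rest=${1#*.}        # text after the first dot
    minor=${rest%%.*}   # drop any further components such as ".9000"
    [ "$major" -gt 2 ] || { [ "$major" -eq 2 ] && [ "$minor" -ge 31 ]; }
}

# On glibc systems this prints e.g. "glibc 2.31"; elsewhere it may fail.
glibc_version=$(getconf GNU_LIBC_VERSION 2>/dev/null || true)
glibc_version=${glibc_version#glibc }   # strip the leading "glibc "

if [ -z "$glibc_version" ]; then
    echo "could not determine the glibc version (not a glibc system?)"
elif has_greek_translit "$glibc_version"; then
    echo "glibc $glibc_version: Greek transliteration should be available"
else
    echo "glibc $glibc_version: expect question marks for Greek input"
fi
```

Distributions sometimes backport fixes, so a version below 2.31 does not strictly prove the table is absent; running the iconv command itself remains the more direct check.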