Tags: command-line, utf-8, hex, octal

How do I convert among printed decimal/octal/hex/UTF-8 representations of a UTF-8 character from the command-line?


In another question someone suggested echo -e with \0<sequence> for octal, and \x<sequence> for hex. E.g.:

echo -e "\\0302\\0241" --> ¡
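For the forward direction, printf is a somewhat more portable sketch than echo -e, whose escape handling varies between shells. Octal escapes in the format string and the %b conversion are specified by POSIX; \x hex escapes are an extension that bash's builtin printf accepts, so treat the last line as bash-specific:

```shell
# Octal escapes written directly in the format string (POSIX printf).
printf '\302\241\n'

# %b expands the same \0NNN form that echo -e takes.
printf '%b\n' '\0302\0241'

# Hex escapes: accepted by bash's builtin printf, not required by POSIX.
printf '\xc2\xa1\n'
```

In bash with a UTF-8 terminal, each line should print ¡.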

Is there a simple way to convert in the other direction, from UTF-8 character to printed octal/hex sequence?


Solution

  • Yep - use hexdump, like this:

    $ echo -n i | hexdump
    

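    Which will output something like this (by default hexdump groups bytes into 16-bit words in host byte order, so on a little-endian machine the single byte 0x69 is shown zero-padded as 0069):

    0000000 0069
    0000001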
    

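    For something more formatted, you could do this (it swaps the halves of hexdump's little-endian words back into byte order and prints only the first three bytes of the input; substr() is indexed from 1, because a start of 0 behaves differently across awk implementations):

    $ echo ü | hexdump | awk '{print "\\x"toupper(substr($2,3,2)) "\\x"toupper(substr($2,1,2)) "\\x"toupper(substr($3,3,2))}' | head -1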
    

    which will print out this:

    \xC3\xBC\x0A
    

    Code taken from here: How do you echo a 4-digit Unicode character in Bash?
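
A byte-wise dump avoids the word swapping altogether and works for input of any length. The sketch below is one possible approach, not the method from the linked answer: it uses od with -An, -v, -tx1 and -to1, which are standard POSIX od options, and the function names to_hex_escapes and to_octal_escapes are made up for illustration.

```shell
# to_hex_escapes (hypothetical helper): print every byte of $1
# as an uppercase \xNN escape.
to_hex_escapes() {
    # od -An drops the offset column, -v disables folding of repeated lines,
    # -tx1 prints one byte per hex pair, in stream order.
    hex=$(printf '%s' "$1" | od -An -v -tx1 | tr -d '[:space:]' | tr 'a-f' 'A-F')
    # Prefix every pair of hex digits with a literal \x.
    printf '%s\n' "$hex" | sed 's/../\\x&/g'
}

# to_octal_escapes (hypothetical helper): same idea with 3-digit octal,
# in the \0NNN form that echo -e and printf %b accept.
to_octal_escapes() {
    oct=$(printf '%s' "$1" | od -An -v -to1 | tr -d '[:space:]')
    # Prefix every group of three octal digits with a literal \0.
    printf '%s\n' "$oct" | sed 's/.../\\0&/g'
}

to_hex_escapes 'ü'      # \xC3\xBC
to_octal_escapes '¡'    # \0302\0241
```

The octal form round-trips: printf '%b\n' "$(to_octal_escapes '¡')" should print ¡ again. Passing the argument through printf '%s' rather than echo keeps the trailing newline (the \x0A above) out of the result.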