Tags: c, character-encoding, internationalization, escaping, shift-jis

How does one use shift sequences to output a character from another character set?


While reading about how to use shift sequences to print characters from other character sets, I arrived at the following code. I'm sure the escape sequence is incorrect, but I don't know why:

#include <stdio.h>

int main(int argc, char *argv[])
{
    printf("\x1B\x28\x49\x0E\xB3"); /* Should print: ウ */
    return 0;
}

However, this does not work for me: it outputs a "?" in the terminal rather than the character "ウ". My font does support the character. If someone could explain what I'm doing incorrectly and how I would go about correcting it (still using shift sequences), that would be greatly appreciated.

Thank you


Solution

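  • You are using ISO-2022-JP-3, which is a 7-bit encoding: after ESC ( I the kana are written as 7-bit bytes (0x33 rather than 0xB3), and no shift-out (0x0E) is needed. Hence you need to write your program as follows: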

    #include <stdio.h>

    int main(void)
    {
        // switch to JIS X 0201-1976 Kana set (1 byte per character)
        printf("\x1B(I");
    
        // 7-bit code for the kana (0xB3 with the high bit cleared)
        printf("\x33"); /* ウ */
    
        // mandatory switch back to ASCII before end of line
        printf("\x1B(B");
    
        printf("\n");
    
        return 0;
    }
    

    Note, however, that this is unlikely to be the character set expected by the terminal (on Linux, that is most likely UTF-8). You can use iconv to perform the conversion:

    $ ./main | iconv -f ISO-2022-JP-3
    

    Alternatively you can use iconv(3) to perform the conversion inside your program.
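
As a rough illustration of the in-program route, here is a minimal sketch using iconv_open(3), iconv(3) and iconv_close(3). The helper name iso2022jp3_to_utf8 and its buffer handling are my own assumptions, not part of the answer above. Whether the name "ISO-2022-JP-3" is recognised depends on the iconv implementation (glibc and GNU libiconv know it, others may not), and on some systems you need to link with -liconv.

```c
#include <iconv.h>
#include <stddef.h>

/* Hypothetical helper (not from the original answer): converts `len` bytes
 * of ISO-2022-JP-3 text in `in` into a NUL-terminated UTF-8 string in `out`.
 * Returns 0 on success, -1 on any failure (unsupported encoding, invalid
 * input, or an output buffer that is too small). */
int iso2022jp3_to_utf8(const char *in, size_t len, char *out, size_t outsize)
{
    if (outsize == 0)
        return -1;

    /* Note the argument order: target encoding first, then source. */
    iconv_t cd = iconv_open("UTF-8", "ISO-2022-JP-3");
    if (cd == (iconv_t)-1)
        return -1; /* this iconv does not know one of the encodings */

    char *inp = (char *)in;       /* iconv's prototype takes non-const char ** */
    char *outp = out;
    size_t inleft = len;
    size_t outleft = outsize - 1; /* keep one byte for the terminating NUL */

    /* Convert the input; iconv consumes the escape sequences itself. */
    size_t rc = iconv(cd, &inp, &inleft, &outp, &outleft);

    /* A NULL input buffer asks iconv to flush any pending shift state. */
    if (rc != (size_t)-1)
        rc = iconv(cd, NULL, NULL, &outp, &outleft);

    iconv_close(cd);

    if (rc == (size_t)-1)
        return -1;

    *outp = '\0';
    return 0;
}
```

To use it, pass the same seven bytes the program above prints, for example iso2022jp3_to_utf8("\x1B(I\x33\x1B(B", 7, buf, sizeof buf), and write buf with fputs if the call returns 0. Since JIS X 0201 kana are the halfwidth forms, the result should be the halfwidth ｳ (U+FF73) rather than the fullwidth ウ.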