Wide character and Locale


#1

#include <stdio.h>
#include <locale.h>
#include <wchar.h>

int main()
{
    setlocale(LC_CTYPE,"C");
    wprintf(L"大\n");
    
    return 0;
}

//result : ?

#2

#include <stdio.h>
#include <locale.h>

int main()
{
    setlocale(LC_CTYPE,"C");
    printf("大\n");
    
    return 0;
}

//result : 大

The only difference between #1 and #2 is the printing function.

I expected that if a wide character cannot be printed in a given locale, then a multibyte character should not be printable in that locale either.

Why is the multibyte string printed (#2), whereas the wide character string is not (#1)?

I know that if the locale is not "C", the wide character is printed correctly. But why? What exactly does the locale do?

+) I thought multibyte character encoding was locale-dependent, but the multibyte character is printed correctly regardless of the locale. How does the computer determine the multibyte character encoding?


Solution

  • If you are working with the Windows console and want to print wide strings, use the _setmode function to change the default translation mode of stdout to Unicode.

    For example:

    #include <stdio.h>
    #include <wchar.h>
    #include <locale.h>
    #include <fcntl.h>
    #include <io.h>
    
    int main()
    {
        setlocale(LC_CTYPE,"C");
        _setmode(_fileno(stdout), _O_U16TEXT);
        wprintf(L"大\n");
        
        return 0;
    }
    

    https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/setmode?view=msvc-170