c, linux, widechar

Impossible to put stdout in wide char mode


On my system, a fairly standard Ubuntu 13.10, French accented characters ("éèàçù...") are handled correctly by every tool I use, even though the LC_* environment variables are set to en_US.UTF-8. In particular, command-line utilities like grep and cat always read and print these characters without a hitch.

Yet a program as small as

#include <stdio.h>

int main(void) {
  /* getchar() reads a single byte, not a whole multibyte character */
  printf("%c", getchar());
  return 0;
}

fails when the user enters "é".

From the man pages and a lot of googling, I found no standard way to close stdout and then reopen it. According to man fwide, once stdout is in byte mode I can't switch it to wide-character mode short of closing and reopening it, so I can't use getwchar() and wprintf().

I can't believe that every single utility like cat, grep, etc. reimplements its own way of handling wide characters, yet from my research I see no other way.

Is it my system that has a problem? I can't see how since every utility works flawlessly. What am I missing, please?


Solution

  • When a C program starts, stdout, stdin and stderr are neither byte nor wide-character oriented. fwide(stdin, 0) should return 0 at this point.

    If you expand your minimal program to:

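    #include <stdio.h>
    #include <locale.h>
    #include <wchar.h>
    
    int main(void)
    {
            /* Adopt the locale from the environment (e.g. en_US.UTF-8). */
            setlocale(LC_ALL, "");
            /* getwchar() reads one whole character; %lc converts it back to bytes. */
            printf("%lc\n", getwchar());
            return 0;
    }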
    

    Then it should work as you expect. (There is no need to explicitly set the orientation of stdin here - since the first operation on it is a wide-character operation, it will have wide-character orientation).

    You do need to use getwchar() instead of getchar() to read a whole wide character, though: getchar() returns a single byte, and "é" is encoded as two bytes in UTF-8.