On my system, a fairly standard Ubuntu 13.10, French accented characters such as "éèàçù..." are handled correctly by every tool I use, with the LC_ environment variables set to en_US.UTF-8. In particular, command-line utilities like grep and cat always read and print these characters without a hitch.
Yet a program as small as
#include <stdio.h>

int main() {
    printf("%c", getchar());
    return 0;
}
fails when the user enters "é".
From the man pages and a lot of googling, there seems to be no standard way to close stdout and then reopen it. According to man fwide, once stdout is in byte mode I can't switch it to wide-character mode short of closing and reopening it, so I can't use getwchar() and wprintf().
I can't believe that every single utility like cat, grep, etc. reimplements its own way of handling wide characters, yet from my research I see no other way.
Is it my system that has a problem? I can't see how since every utility works flawlessly. What am I missing, please?
When a C program starts, stdout, stdin and stderr are neither byte-oriented nor wide-character-oriented. fwide(stdin, 0) should return 0 at this point.
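To see the orientation for yourself, a small helper along these lines may be useful. This is only a sketch: orientation_name and stream_orientation are names made up for illustration, while fwide() itself is the standard function from <wchar.h>. Passing a mode of 0 only queries the stream and does not change its orientation.

```c
#include <stdio.h>
#include <wchar.h>

/* Hypothetical helper: map the return value of fwide() to a label. */
const char *orientation_name(int fwide_result)
{
    if (fwide_result > 0)
        return "wide";
    if (fwide_result < 0)
        return "byte";
    return "unoriented";
}

/* Hypothetical helper: query a stream's orientation without changing it.
 * A mode of 0 asks fwide() to report the current state only. */
const char *stream_orientation(FILE *stream)
{
    return orientation_name(fwide(stream, 0));
}
```

Called as the very first thing in main(), stream_orientation(stdin) should give "unoriented"; after a getwchar() it should give "wide", and after a getchar() it should give "byte".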
If you expand your minimal program to:
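#include <stdio.h>
#include <locale.h>
#include <wchar.h>

int main()
{
    /* Adopt the locale from the environment (e.g. en_US.UTF-8). */
    setlocale(LC_ALL, "");
    /* getwchar() returns a wint_t, which is what %lc expects. */
    printf("%lc\n", getwchar());
    return 0;
}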
then it should work as you expect. (There is no need to explicitly set the orientation of stdin here: since the first operation on it is a wide-character operation, it will have wide-character orientation.)
You do need to use getwchar() instead of getchar() if you want to read a wide character with it, though.
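The reason getchar() is not enough is that it returns a single byte, whereas in UTF-8 "é" is encoded as two bytes (0xC3 0xA9). As a rough illustration, the sketch below counts characters rather than bytes using the standard mbrlen() function; count_multibyte_chars is a hypothetical name, and the result depends on the locale in effect when it is called.

```c
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: count the characters in a multibyte string
 * according to the current locale. Returns (size_t)-1 if the bytes
 * do not form valid characters in that locale. */
size_t count_multibyte_chars(const char *s)
{
    mbstate_t state;
    size_t remaining = strlen(s);
    size_t count = 0;

    memset(&state, 0, sizeof state);   /* start in the initial shift state */
    while (remaining > 0) {
        /* mbrlen() reports how many bytes the next character occupies. */
        size_t n = mbrlen(s, remaining, &state);
        if (n == (size_t)-1 || n == (size_t)-2)
            return (size_t)-1;         /* invalid or truncated sequence */
        if (n == 0)
            break;                     /* null character, not expected here */
        s += n;
        remaining -= n;
        count++;
    }
    return count;
}
```

After setlocale(LC_ALL, "") in a UTF-8 locale, and assuming the source file itself is saved as UTF-8, count_multibyte_chars("é") should report one character while strlen("é") reports two bytes, which is why a single getchar() call only picks up half of it.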