
Multiple calls to setlocale


I am trying to figure out how Unicode is supported in C++.

When I want to output multilingual text to the console, I call std::setlocale. However, I noticed that the result depends on prior calls to setlocale.

Consider the following example. If run without arguments, it calls setlocale once. Otherwise, it first calls setlocale(LC_ALL, NULL) to get the current locale, prints it, and restores it at the end of the function.

#include <iostream>
#include <locale>

using namespace std;

int main(int argc, char** argv)
{
    char *current_locale = 0;

    if (argc > 1) {
        current_locale = setlocale(LC_ALL, NULL);
        wcout << L"Current output locale: " << current_locale << endl;
    }

    char* new_locale = setlocale(LC_ALL, "ru_RU.UTF8");
    if (! new_locale)
        wcout << L"failed to set new locale" << endl;
    else
        wcout << L"new locale: " << new_locale << endl;


    wcout << L"Привет!" << endl;

    if (current_locale) setlocale(LC_ALL, current_locale);
    return 0;
}

The output is different:

:~> ./check_locale 
new locale: ru_RU.UTF8
Привет!
:~> ./check_locale 1
Current output locale: C
new locale: ru_RU.UTF8
??????!

Is there something that setlocale(LC_ALL, NULL) does that needs to be taken care of in future setlocale calls?

The compiler is g++ 7.5.0 or clang++ 7.0.1, and the console is a Linux console in a graphical terminal.

More details on the system configuration: openSUSE 15.1, Linux 4.12, glibc 2.26, libstdc++6-10.2.1


Solution

  • Is there something that setlocale(LC_ALL, NULL) does that needs to be taken care of in future setlocale calls?

    No, setlocale(..., NULL) does not modify the current locale. The following code is fine:

    setlocale(LC_ALL, NULL);
    setlocale(LC_ALL, "ru_RU.UTF8");
    wprintf(L"Привет!\n");
    

    However, the following code will fail:

    wprintf(L"anything"); // or even just `fwide(stdout, 1);`
    setlocale(LC_ALL, "ru_RU.UTF8");
    wprintf(L"Привет!\n");
    

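    The problem is that a stream has its own locale, which is determined at the point where the stream's orientation changes to wide. In the question's program, the first wcout output plays the same role as the first wprintf here, because by default wcout writes through stdout.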

    // here stdout has no orientation and no locale associated with it
    wprintf(L"anything");
       // `stdout` stream orientation switches to wide stream
       // current locale is used - `stdout` has C locale
    
    setlocale(LC_ALL, "ru_RU.UTF8");
    wprintf(L"Привет!\n");
       // `stdout` is wide oriented
       // current locale is ru_RU.UTF-8
       // __but__ the locale of `stdout` is still C and cannot be changed!
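
    The orientation change itself can be observed with fwide, which reports the current orientation when called with mode 0. The following is a minimal sketch rather than a definitive test: the names OrientationTrace and trace_orientation are made up for illustration, and it assumes the stream passed in has not been used yet.

```cpp
#include <cstdio>
#include <cwchar>

// Hypothetical helper type: records what fwide reports at each step.
struct OrientationTrace {
    int before;       // orientation before any I/O on the stream
    int after_write;  // orientation after the first wide write
    int after_retry;  // orientation after trying to switch to byte mode
};

// Assumes `stream` is freshly opened, so it is still unoriented.
OrientationTrace trace_orientation(std::FILE* stream) {
    OrientationTrace t{};
    // Mode 0 only queries: 0 = unoriented, > 0 = wide, < 0 = byte.
    t.before = std::fwide(stream, 0);
    // The first wide operation orients the stream as wide.
    std::fwprintf(stream, L"anything");
    t.after_write = std::fwide(stream, 0);
    // Asking for byte orientation has no effect once the stream is oriented.
    t.after_retry = std::fwide(stream, -1);
    return t;
}
```

    Called on a fresh stream such as the result of std::tmpfile(), this should report 0 before the write and a positive value afterwards, including after the attempt to switch back.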
    

    The only documentation I found of this is on gnu.org, Streams and I18N (emphasis mine):

    Since a stream is created in the unoriented state it has at that point no conversion associated with it. The conversion which will be used is determined by the LC_CTYPE category selected at the time the stream is oriented. If the locales are changed at the runtime this might produce surprising results unless one pays attention. This is just another good reason to orient the stream explicitly as soon as possible, perhaps with a call to fwide.
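
    The advice in the quote could be followed with something like the sketch below: select the locale first, then orient the stream explicitly with fwide. The helper name init_wide_stream is hypothetical, and the sketch assumes the stream has not been used for any I/O yet.

```cpp
#include <clocale>
#include <cstdio>
#include <cwchar>

// Hypothetical helper: set the locale, then orient `stream` as wide so that
// the conversion chosen for it comes from the locale that was just selected.
// Returns false if the locale is unavailable or the stream is not wide
// afterwards (for example because it was already byte oriented).
bool init_wide_stream(std::FILE* stream, const char* locale_name) {
    if (std::setlocale(LC_ALL, locale_name) == nullptr) {
        return false;  // locale not installed; stream is left untouched
    }
    // A positive mode requests wide orientation; the return value reports
    // the orientation the stream actually has after the call.
    return std::fwide(stream, 1) > 0;
}
```

    In the question's program this would be the first statement of main, for example init_wide_stream(stdout, "ru_RU.UTF8"), before any wcout or wprintf output.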

    You can:

    • Use a separate locale for the C++ stream and the C FILE (see here):

    std::ios_base::sync_with_stdio(false);
    std::wcout.imbue(std::locale("ru_RU.utf8"));
    
    • Reopen stdout:

    wprintf(L""); // stdout has C locale
    char* new_locale = setlocale(LC_ALL, "ru_RU.UTF8");
    freopen("/dev/stdout", "w", stdout); // stdout has no stream orientation
    wprintf(L"Привет!\n"); // stdout is wide and ru_RU locale 
    
    • I think (untested) that in glibc you can even reopen stdout with an explicit character set, using the ccs= mode option (see GNU opening streams):

    freopen("/dev/stdout", "w,ccs=UTF-8", stdout);
    std::wcout << L"Привет!\n"; // fine
    
    • In any case, try to set the locale as early as possible, before any input or output is performed.
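
    Since the question's program also restores the previous locale at the end, here is a hedged sketch of a save-and-restore guard that fits the last point. ScopedLocale is a made-up name, not a standard facility. It copies the string returned by setlocale, because that pointer may be overwritten by a later setlocale call. Restoring the locale this way does not change the conversion of a stream that is already oriented.

```cpp
#include <clocale>
#include <string>

// Hypothetical RAII helper: switches the locale on construction and
// restores the previous one on destruction.
class ScopedLocale {
public:
    ScopedLocale(int category, const char* name) : category_(category) {
        // Query only; a null second argument does not modify the locale.
        const char* current = std::setlocale(category, nullptr);
        if (current != nullptr) {
            saved_ = current;  // copy: the buffer may be reused by setlocale
        }
        ok_ = std::setlocale(category, name) != nullptr;
    }

    ~ScopedLocale() {
        if (!saved_.empty()) {
            std::setlocale(category_, saved_.c_str());
        }
    }

    ScopedLocale(const ScopedLocale&) = delete;
    ScopedLocale& operator=(const ScopedLocale&) = delete;

    // True if the requested locale was available and is now active.
    bool ok() const { return ok_; }

private:
    int category_;
    std::string saved_;
    bool ok_ = false;
};
```

    To follow the advice above, the guard would be created at the very top of main, for example ScopedLocale guard(LC_ALL, "ru_RU.UTF8");, with guard.ok() checked before the first wide output.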