c++ unicode c++17 visual-studio-2019 wstring

What is the best way to output Unicode to console?


I am working with C++17 in Visual Studio 2019. I have read a fair bit about encodings, but I am still not very comfortable with them. I want to output Unicode characters to the console. For that, I am using the following code:

#include <cstdio>   // stdout
#include <fcntl.h>  // _O_WTEXT, O_TEXT
#include <io.h>     // _setmode, _fileno
#include <iostream>
#include <string>

int main()
{
    std::wstring symbol{ L"♚" };

    _setmode(_fileno(stdout), _O_WTEXT);
    std::wcout << symbol; // This works fine
    std::cout << "Hello"; // This gives "Debug Assertion Failed! Expression: buffer_size % 2 == 0"
    _setmode(_fileno(stdout), O_TEXT); // Need this to make std::cout work normally
    std::cout << "World"; // This works fine
}

So I could call _setmode with _O_WTEXT and then switch back to O_TEXT every time I need to output a std::wstring. However, I am worried that this may be an inefficient way to do things. Is there a better way to do it? I have read about something called native wide-character support in C++, but I found it hard to understand. Could anyone enlighten me?

EDIT

To add to the above, using _setmode(_fileno(stdout), _O_U16TEXT) leads to the same behaviour as described above when I try to use std::cout without setting the mode back. If I use _setmode(_fileno(stdout), _O_U8TEXT) instead, my code fails to compile with the errors "'symbol': redefinition; different basic types" and "'<<': illegal for class" when I use std::cout on std::string symbol = <any of the possibilities I tried in the snippet below>.

It has been suggested that I use a UTF-8 std::string so that I can use std::cout and avoid having to switch to wide mode. Could anyone give me a hand with how to achieve this? I have tried:

std::string symbol = "\u265A"; //using std::cout gives "?" and triggers warning *
std::string symbol = "♚"; //Same as above

std::string symbol = u8"\u265A"; //using std::cout gives "ÔÖÜ"
std::string symbol = u8"♚"; //Same as above

* Warning C4566: character represented by universal-character-name '\u265A' cannot be represented in the current code page (1252)

I have read it may be possible to convert from std::wstring to UTF-8 std::string using WideCharToMultiByte() from the header <Windows.h>. Would that work? Could anyone offer any help?


Solution

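  • The clue is in the warning message: "...cannot be represented in the current code page (1252)". So the code page needs to be changed. The code page identifier for UTF-8 is 65001 (available as the constant CP_UTF8). To change the console's output code page, call SetConsoleOutputCP, which is declared in <Windows.h>, before writing UTF-8 encoded text with std::cout.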