I have an wide-character string (std::wstring) in my code, and I need to search wide character in it.
I use find() function for it:
wcin >> str;
wcout << ((str.find(L'ф') != wstring::npos)? L"EXIST":L"NONE");
L'ф'
is a Cyrillic letter.
But find() in same call always returns npos
. In a case with Latin letters find() works correctly.
It is a problem of this function? Or I incorrectly do something?
UPD
I use MinGW and save source in UTF-8.
I also set locale with setlocale(LC_ALL, "");
.
Code same wcout << L'ф';
works coorectly.
But same
wchar_t w;
wcin >> w;
wcout << w;
works incorrectly.
It is strange. Earlier I had no problems with the encoding, using setlocale ().
The encoding of your source file and the execution environment's encoding may be wildly different. C++ makes no guarantees about any of this. You can check this by outputting the hexadecimal value of your string literal:
std::wcout << std::hex << L"ф";
Before C++11, you could use non-ASCII characters in source code by using their hex values:
"\x05" "five"
C++11 adds the ability to specify their Unicode value, which in your case would be
L"\u03A6"
If you're going full C++11 (and your environment ensures these are encoded in UTF-*), you can use any of char
, char16_t
, or char32_t
, and do:
const char* phi_utf8 = "\u03A6";
const char16_t* phi_utf16 = u"\u03A6";
const char32_t* phi_utf16 = U"\u03A6";