c++ unicode farsi

Right to left isolate string. C++


Does anyone have experience with Unicode?

I am facing a tough problem with Farsi Unicode text.
I have an std::wstring s = (L"\u0634\u0646\u0628\u0647"); which is a Farsi word. When I debug it, I see that the underlying word is exactly what I want, but reversed. So I did some research and found that U+2067 (RIGHT-TO-LEFT ISOLATE) marks the text that follows it as right-to-left.
NOTE: I cannot reverse the string manually, because Farsi characters change their shape depending on their position in the string.

So I added U+2067 at the beginning and got
std::wstring s = (L"\u2067\u0634\u0646\u0628\u0647");
But now the underlying string is the same, just with a square added at the beginning of the string, instead of being reversed.
Does anyone have experience with this stuff? Please suggest a solution. Thanks!


Solution

  • The underlying string will be the same. You haven't changed the order of the characters stored in it, which is written right there in the code; you have only added one more character, U+2067, at the front. A renderer that understands Unicode bidirectional text should take that stored sequence and display the characters right-to-left. That's a visual thing. It has nothing to do with the encoding or the order in memory. From your question, it's not entirely clear what else you expected. It may be that you are viewing the string in a debugger, and the debugger does not support this feature of Unicode; the square is probably its placeholder for the invisible control character. If you try outputting the string to a console or text control that supports right-to-left text, you ought to see it as you expect.
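To make the point about storage versus display concrete, here is a minimal sketch. The helper names isolate_rtl and letters_unchanged are made up for illustration, and closing the run with U+2069 (POP DIRECTIONAL ISOLATE) is an addition that the question did not use. It only shows that the control characters are extra elements around the same letters in the same order; it does not render anything.

```cpp
#include <cassert>
#include <string>

// U+2067 RIGHT-TO-LEFT ISOLATE, the character from the question.
constexpr wchar_t RLI = L'\u2067';
// U+2069 POP DIRECTIONAL ISOLATE; closing the isolate is an addition
// in this sketch, the question only prepended U+2067.
constexpr wchar_t PDI = L'\u2069';

// Hypothetical helper: wraps text in RLI ... PDI.
// The letters themselves are copied in their original (logical) order.
std::wstring isolate_rtl(const std::wstring& text) {
    std::wstring out;
    out.reserve(text.size() + 2);
    out += RLI;
    out += text;
    out += PDI;
    return out;
}

// Hypothetical helper: true if `wrapped` is exactly `plain` with one
// extra character in front and one behind, i.e. nothing was reordered.
bool letters_unchanged(const std::wstring& plain, const std::wstring& wrapped) {
    return wrapped.size() == plain.size() + 2
        && wrapped.compare(1, plain.size(), plain) == 0;
}
```

Calling isolate_rtl(L"\u0634\u0646\u0628\u0647") gives a string of six elements instead of four, and letters_unchanged reports true for it. That is what the debugger shows: the same word plus control characters. Whether it appears right-to-left depends entirely on the program that draws it.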