I tested the Unicode conversion with a UNICODE MFC dialog app, where I can input some Chinese in the edit box. After reading in the characters using
DDX_Text(pDX, IDC_EDIT1, m_strUnicode);
UpdateData(TRUE);
the m_pszdata of m_strUnicode shows "e0 65 2d 4e 1f 75 09 67". Then I used the following code to convert it to char*:
char *psText;
psText = new char[dwMinSize];
WideCharToMultiByte(CP_OEMCP, 0, m_strUnicode, -1, psText, dwMinSize, NULL, NULL);
psText then contains "ce de d6 d0 c9 fa d3 d0", which looks nothing like the m_pszdata of m_strUnicode. Could anyone please explain why?
ce de d6 d0 c9 fa d3 d0 is 无中生有 in GBK. Are you sure you're manipulating Unicode?
CP_OEMCP instructs the API to use the system's current default OEM code page.
So my guess here is that you're on a Chinese PC with GBK as the default code page.
无中生有 in UTF-16LE is e0 65 2d 4e 1f 75 09 67, so basically you are converting a UTF-16LE string to GBK.