c++ c winapi unicode byte-order-mark

Simple file reading using ReadFile()


Why doesn't this code output anything (except the word "info")? The file exists.

hReadFile = CreateFile(L"indexing.xml",GENERIC_READ | GENERIC_WRITE, FILE_SHARE_READ |FILE_SHARE_WRITE, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
    wchar_t *wchr = new wchar_t[20];
    DWORD dw;
    ReadFile(hReadFile, wchr, sizeof(wchar_t) * 5, &dw, NULL);
    CloseHandle(hReadFile);
    wchr[dw/sizeof(wchar_t)] = L'\0';
    std::wcout << L"info " << wchr << L"     " << dw << std::endl;

Solution

  • A Unicode file might start with an optional Byte Order Mark (BOM).

    For UTF-16 the BOM tells which endianness is used in the file.

    The BOM can also be used to distinguish between different Unicode encodings.

    The OP's file evidently starts with such a BOM in its first two bytes: advancing the pointer by one element (wchar_t is 2 bytes on Windows) skips it, and the data is printed. Most likely std::wcout cannot convert the BOM character (U+FEFF) in the default locale, which puts the stream into an error state and suppresses everything after "info".

    std::wcout << L"info " << (wchr+1) << L" " << dw << std::endl;
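
Rather than hard-coding `wchr+1`, the BOM check described above can be sketched as follows. This is a minimal, portable sketch and not the answerer's code: `Bom`, `BomInfo`, and `detect_bom` are hypothetical names, and it assumes the raw bytes have already been read (for example by `ReadFile`) into a byte buffer. It deliberately ignores UTF-32, whose little-endian BOM begins with the same two bytes as the UTF-16 one.

```cpp
#include <cstddef>

// Hypothetical helper types for illustration; not part of the Windows API.
enum class Bom { None, Utf8, Utf16LE, Utf16BE };

struct BomInfo {
    Bom kind;            // which BOM was found, if any
    std::size_t length;  // number of bytes the BOM occupies (0 if none)
};

// Inspect the first bytes of a buffer and report which BOM it starts with.
// Assumption: UTF-32 input is out of scope for this sketch.
inline BomInfo detect_bom(const unsigned char* data, std::size_t size) {
    // UTF-8 BOM: EF BB BF
    if (size >= 3 && data[0] == 0xEF && data[1] == 0xBB && data[2] == 0xBF)
        return {Bom::Utf8, 3};
    // UTF-16 little-endian BOM: FF FE (U+FEFF stored low byte first)
    if (size >= 2 && data[0] == 0xFF && data[1] == 0xFE)
        return {Bom::Utf16LE, 2};
    // UTF-16 big-endian BOM: FE FF
    if (size >= 2 && data[0] == 0xFE && data[1] == 0xFF)
        return {Bom::Utf16BE, 2};
    return {Bom::None, 0};
}
```

With the buffer from the question, a call such as `detect_bom(reinterpret_cast<const unsigned char*>(wchr), dw)` would report how many bytes to skip before printing, and whether the file is little-endian UTF-16 at all (the only case in which the bytes can be printed directly as wchar_t on Windows).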