I want to use the buffer from a UNICODE_STRING, but it seems I cannot use it directly by just copying the pointer, because sometimes there are null characters in the middle of the string, and Length is greater than what I see in the debugger. So if I do this:
UNICODE_STRING testStr;
//after being used by some function it has data like this 'bad丣\0more_stuff\0'
wchar_t * wStr = testStr.Buffer;
I end up with wStr = "bad丣".
Is there a way to convert this to a valid, null-terminated wchar_t*?
A wchar_t* is just a pointer. Unless you tell the debugger (or any function you pass the wchar_t* to) exactly how many wchar_t characters are actually being pointed at, it has to stop somewhere, so it stops on the first null character it encounters.
UNICODE_STRING::Buffer is not guaranteed to be null-terminated, and it can contain embedded nulls. You have to use the UNICODE_STRING::Length field, which is expressed in bytes, to know how many WCHAR elements are in the Buffer (Length / sizeof(WCHAR)), including embedded nulls but not counting a trailing null terminator if one is present. If you need a null terminator, copy the Buffer data to your own buffer and append a terminator.
The easiest way to do that is to use std::wstring, e.g.:
#include <string>
UNICODE_STRING testStr;
// fill testStr as needed...
std::wstring wStrBuf(testStr.Buffer, testStr.Length / sizeof(WCHAR));
const wchar_t *wStr = wStrBuf.c_str();
The embedded nulls will still be present, but c_str() returns a pointer to data that is followed by a trailing null terminator. The debugger will still display the data only up to the first null, unless you tell it how many WCHAR elements the data actually contains.
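To make the difference between the stored length and the "visible" length concrete, here is a small sketch. makeSample and visibleLength are hypothetical helper names, and the sample data only imitates the string from the question.

```cpp
#include <cstddef>
#include <cwchar>
#include <string>

// Length as seen by anything that treats the data as a null-terminated string.
std::size_t visibleLength(const std::wstring &s)
{
    return std::wcslen(s.c_str()); // stops at the first null character
}

// Builds a std::wstring holding "bad", a null, then "more_stuff".
std::wstring makeSample()
{
    const wchar_t data[] = L"bad\0more_stuff";             // array also has an implicit terminator
    const std::size_t count = sizeof(data) / sizeof(data[0]) - 1; // drop that terminator
    return std::wstring(data, count);                       // the count keeps the embedded null
}
```

For this sample, size() reports all 14 characters while visibleLength() reports only 3, which matches what the debugger shows. The second substring is still reachable, for example through c_str() + 4.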
Alternatively, if you know the Buffer data contains multiple substrings separated by nulls, you could split the Buffer data into an array of strings instead, e.g.:
#include <string>
#include <vector>
UNICODE_STRING testStr;
// fill testStr as needed...
std::vector<std::wstring> wStrArr;
std::wstring wStr(testStr.Buffer, testStr.Length / sizeof(WCHAR));
std::wstring::size_type startidx = 0;
do
{
    // look for the next null separator
    std::wstring::size_type idx = wStr.find(L'\0', startidx);
    if (idx == std::wstring::npos)
    {
        // no more separators: keep whatever is left, if anything
        if (startidx < wStr.size())
            wStrArr.push_back(wStr.substr(startidx));
        break;
    }
    wStrArr.push_back(wStr.substr(startidx, idx - startidx));
    startidx = idx + 1;
}
while (true);
// use wStrArr as needed...
Or:
#include <algorithm>
#include <string>
#include <vector>
UNICODE_STRING testStr;
// fill testStr as needed...
std::vector<std::wstring> wStrArr;
WCHAR *pStart = testStr.Buffer;
WCHAR *pEnd = pStart + (testStr.Length / sizeof(WCHAR));
do
{
    // look for the next null separator
    WCHAR *pFound = std::find(pStart, pEnd, L'\0');
    if (pFound == pEnd)
    {
        // no more separators: keep whatever is left, if anything
        if (pStart < pEnd)
            wStrArr.push_back(std::wstring(pStart, pEnd - pStart));
        break;
    }
    wStrArr.push_back(std::wstring(pStart, pFound - pStart));
    pStart = pFound + 1;
}
while (true);
// use wStrArr as needed...
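If the split is needed in more than one place, the pointer-based loop can be wrapped in a function. The sketch below uses a hypothetical name, splitOnNulls, and takes a plain pointer and element count rather than a UNICODE_STRING so that it stays portable.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Splits `count` characters starting at `data` on null characters.
// Consecutive nulls yield empty strings; a trailing null adds no extra entry.
std::vector<std::wstring> splitOnNulls(const wchar_t *data, std::size_t count)
{
    std::vector<std::wstring> parts;
    const wchar_t *pStart = data;
    const wchar_t *pEnd = data + count;
    while (true)
    {
        const wchar_t *pFound = std::find(pStart, pEnd, L'\0');
        if (pFound == pEnd)
        {
            if (pStart < pEnd)
                parts.push_back(std::wstring(pStart, pEnd)); // final piece, no separator after it
            break;
        }
        parts.push_back(std::wstring(pStart, pFound));       // piece before the separator
        pStart = pFound + 1;                                  // skip the separator
    }
    return parts;
}
```

With a real UNICODE_STRING, the call would be along the lines of splitOnNulls(testStr.Buffer, testStr.Length / sizeof(WCHAR)).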