BSTR is this weird Windows data type with a few specific uses, such as COM functions. According to MSDN, it contains a WCHAR string and some other stuff, like a length descriptor. Windows is also nice enough to give us the _bstr_t class, which encapsulates BSTR; it takes care of the allocation and deallocation and gives you some extra functionality. It has four constructors, including one that takes in a char* and one that takes in a wchar_t*. MSDN's description of the former: "Constructs a _bstr_t object by calling SysAllocString to create a new BSTR object and then encapsulates it. This constructor first performs a multibyte to Unicode conversion."
It also has operators that can extract a pointer to the string as any of char*, const char*, and wchar_t*, and I'm pretty sure those are null-terminated, which is cool.
I've spent a while reading up on how to convert between multibyte and Unicode, and I've seen a lot of talk about how to use mbstowcs and wcstombs, and how MultiByteToWideChar and WideCharToMultiByte are better because encodings may differ, and blah blah blah. It all kind of seems like a headache, so I'm wondering whether I can just construct a _bstr_t and use the operators to access the strings, which would be... a lot fewer lines of code:
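#include <comutil.h> // _bstr_t
const char* multi = "asdf";
_bstr_t bs(multi); // performs the multibyte to Unicode conversion internally
const wchar_t* wide = (const wchar_t*)bs; // read-only; valid only while bs is alive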
I guess my intuitive answer to this is that we don't know what Windows is doing behind the scenes, so if I have a problem using mbstowcs/wcstombs (I guess I really mean mbstowcs_s/wcstombs_s) rather than MultiByteToWideChar/WideCharToMultiByte, I shouldn't risk it because it's possible that Windows uses those. (It's almost certainly not using the latter, since I'm not specifying a "code page" here, whatever that is.) Honestly I'm not sure yet whether I consider the mbstowcs_s and wcstombs_s functions OK for my purposes, because I don't really have a grasp on all of the different encodings and stuff, but that's a whole different question and it seems to be addressed all over the Internet.
Sooooo, is there anything wrong with doing this, aside from that potential concern?
Using _bstr_t::_bstr_t(const char*) is not exactly a good idea in production code:
"Constructs a _bstr_t object by calling SysAllocString to create a new BSTR object and encapsulate it. This constructor first performs a multibyte to Unicode conversion. If s2 is too large, you [sic] may generate a stack overflow error. In such a situation, convert your char* to a wchar_t with MultiByteToWideChar and then call the wchar_t * constructor."
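As a rough sketch of what that advice could look like (this is not taken from MSDN): the helper below, with the made-up name widen, calls MultiByteToWideChar twice, once to ask for the required length and once to convert into a heap-allocated std::wstring, so nothing large lands on the stack. The CP_ACP default and the MB_ERR_INVALID_CHARS flag are my assumptions; use whatever code page your data is actually in. It only builds on Windows, hence the guard.

```cpp
#ifdef _WIN32  // Windows-only sketch
#include <windows.h>
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical helper (not part of any Windows API): widen a
// null-terminated multibyte string using an explicit code page.
std::wstring widen(const char* narrow, UINT codePage = CP_ACP) {
    assert(narrow != nullptr);

    // First call: output size 0 means "tell me how many wchar_t I need".
    // Because the input length is -1, the count includes the terminator.
    const int needed = MultiByteToWideChar(codePage, MB_ERR_INVALID_CHARS,
                                           narrow, -1, nullptr, 0);
    if (needed == 0) {
        throw std::runtime_error("MultiByteToWideChar failed (size query)");
    }

    // The buffer lives on the heap, so a long input cannot overflow the stack.
    std::wstring wide(static_cast<std::size_t>(needed), L'\0');

    // Second call: do the actual conversion into our own buffer.
    const int written = MultiByteToWideChar(codePage, MB_ERR_INVALID_CHARS,
                                            narrow, -1, &wide[0], needed);
    if (written == 0) {
        throw std::runtime_error("MultiByteToWideChar failed (conversion)");
    }

    // Drop the terminator that was counted; std::wstring keeps its own.
    wide.resize(static_cast<std::size_t>(written) - 1);
    return wide;
}
#endif  // _WIN32
```

The result can then be handed to the wchar_t* constructor, for example _bstr_t bs(widen(multi).c_str());, which copies the string into a new BSTR.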
Besides that, _bstr_t::operator wchar_t*() const throw() seems barely useful. It's just for struct member extraction, so you're constrained to a const:
"These operators can be used to extract raw pointers to the encapsulated Unicode or multibyte BSTR object. The operators return the pointer to the actual internal buffer, so the resulting string cannot be modified."
So _bstr_t is just a helper object for encapsulating BSTRs, and a mediocre one at that. Conversion using MultiByteToWideChar and WideCharToMultiByte is a much better choice, for multiple reasons:
You don't get a const buffer in return, because you provide your own.
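To illustrate the "you provide your own" point, here is a minimal sketch of the opposite direction using WideCharToMultiByte. The helper name narrow_utf8 and the choice of CP_UTF8 are assumptions made for the example; the output buffer is a std::string that the caller owns and is free to modify afterwards. Again this is Windows-only, so it sits behind a guard.

```cpp
#ifdef _WIN32  // Windows-only sketch
#include <windows.h>
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical helper (not part of any Windows API): convert a
// null-terminated wide string to UTF-8 in a caller-owned std::string.
std::string narrow_utf8(const wchar_t* wide) {
    assert(wide != nullptr);

    // Size query: with input length -1 the result counts the terminator.
    // For CP_UTF8 the last two parameters must be null.
    const int needed = WideCharToMultiByte(CP_UTF8, 0, wide, -1,
                                           nullptr, 0, nullptr, nullptr);
    if (needed == 0) {
        throw std::runtime_error("WideCharToMultiByte failed (size query)");
    }

    // Our own heap buffer; nothing here is const or owned by a BSTR.
    std::string narrow(static_cast<std::size_t>(needed), '\0');

    const int written = WideCharToMultiByte(CP_UTF8, 0, wide, -1,
                                            &narrow[0], needed,
                                            nullptr, nullptr);
    if (written == 0) {
        throw std::runtime_error("WideCharToMultiByte failed (conversion)");
    }

    // Drop the counted terminator; std::string adds its own.
    narrow.resize(static_cast<std::size_t>(written) - 1);
    return narrow;
}
#endif  // _WIN32
```

Since the returned std::string is an ordinary value, something like std::string s = narrow_utf8(L"asdf"); s[0] = 'A'; is fine, which is exactly what the _bstr_t operators don't allow on their internal buffer.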