When maintaining a COM interface, should an empty BSTR be treated the same way as NULL?
In other words, should these two function calls produce the same result?
// Empty BSTR
CComBSTR empty(L""); // Or SysAllocString(L"")
someObj->Foo(empty);
// NULL BSTR
someObj->Foo(NULL);
Yes, a NULL BSTR is the same as an empty one. I remember we had all sorts of bugs that were uncovered when we switched from VS6 to 2003: the CComBSTR default constructor changed to initialise the string to NULL rather than to an empty string. Bugs like these show up when you, for example, treat a BSTR as a regular C-style string and pass it to some function like strlen, or try to initialise a std::string with it.
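To illustrate the kind of guard that avoids those bugs, here is a minimal, portable sketch. It uses a plain `const wchar_t*` as a stand-in for a BSTR so that it compiles outside Windows; the helper names `safe_length` and `to_wstring_safe` are hypothetical and not part of any COM header. In real COM code you would normally call `SysStringLen`, which returns 0 for a NULL BSTR.

```cpp
#include <cstddef>
#include <cwchar>
#include <string>

// Stand-in for a BSTR: a pointer to wide characters that may be NULL.
// (Assumption for portability; a real BSTR also carries a length prefix.)

// Hypothetical helper: length in characters, treating NULL as empty.
std::size_t safe_length(const wchar_t* s) {
    return s ? std::wcslen(s) : 0;
}

// Hypothetical helper: copy into a std::wstring, treating NULL as empty.
// Constructing std::wstring directly from a null pointer is undefined behaviour.
std::wstring to_wstring_safe(const wchar_t* s) {
    return s ? std::wstring(s) : std::wstring();
}
```

With helpers like these, a callee gets the same result whether the caller passed NULL or an empty string.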
Eric Lippert discusses BSTRs in great detail in "Eric's Complete Guide To BSTR Semantics":
Let me list the differences first and then discuss each point in excruciating detail.
A BSTR must have identical semantics for NULL and for "". A PWSZ frequently has different semantics for those.
A BSTR must be allocated and freed with the SysAlloc* family of functions. A PWSZ can be an automatic-storage buffer from the stack or allocated with malloc, new, LocalAlloc or any other memory allocator.
A BSTR is of fixed length. A PWSZ may be of any length, limited only by the amount of valid memory in its buffer.
A BSTR always points to the first valid character in the buffer. A PWSZ may be a pointer to the middle or end of a string buffer.
When allocating an n-byte BSTR you have room for n/2 wide characters. When you allocate n bytes for a PWSZ you can store n / 2 - 1 characters -- you have to leave room for the null.
A BSTR may contain any Unicode data including the zero character. A PWSZ never contains the zero character except as an end-of-string marker. Both a BSTR and a PWSZ always have a zero character after their last valid character, but in a BSTR a valid character may be a zero character.
A BSTR may actually contain an odd number of bytes -- it may be used for moving binary data around. A PWSZ is almost always an even number of bytes and used only for storing Unicode strings.
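The point about embedded zero characters can be shown with a small portable sketch. It uses `std::wstring` as a stand-in for a length-counted string, which is an assumption made so the example runs anywhere; a real BSTR keeps its length in a prefix read by `SysStringLen` and would be created with `SysAllocStringLen`. The function names are hypothetical.

```cpp
#include <cstddef>
#include <cwchar>
#include <string>

// Build a counted string of 5 characters: 'a', 'b', zero, 'c', 'd'.
// The explicit count keeps the embedded zero as data.
std::wstring make_counted() {
    return std::wstring(L"ab\0cd", 5);
}

// Length as a counted string sees it (analogous to a BSTR's stored length).
std::size_t counted_length(const std::wstring& s) {
    return s.size();
}

// Length as a zero-terminated string (PWSZ) sees it: stops at the first zero.
std::size_t terminated_length(const std::wstring& s) {
    return std::wcslen(s.c_str());
}
```

Here the counted length is 5 while the zero-terminated view stops after 2 characters, which is why passing a BSTR to C-style string functions can silently truncate data.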