
Should there be a difference between an empty BSTR and a NULL BSTR?


When maintaining a COM interface, should an empty BSTR be treated the same way as a NULL BSTR? In other words, should these two function calls produce the same result?

 // Empty BSTR
 CComBSTR empty(L""); // Or SysAllocString(L"")
 someObj->Foo(empty);

 // NULL BSTR
 someObj->Foo(NULL);     

Solution

  • Yes - a NULL BSTR is the same as an empty one. I remember we uncovered all sorts of bugs when we switched from VS6 to VS2003: the CComBSTR default constructor had changed to initialise the wrapped BSTR to NULL rather than to an allocated empty string. Bugs like these show up when, for example, you treat a BSTR as a regular C-style wide string and pass it to a function like wcslen, or try to initialise a std::wstring with it, because such code does not expect a NULL pointer.

    Eric Lippert discusses BSTRs in great detail in Eric's Complete Guide To BSTR Semantics, where he compares a BSTR with a PWSZ (a pointer to a zero-terminated wide-character string):

    Let me list the differences first and then discuss each point in excruciating detail.

    1. A BSTR must have identical semantics for NULL and for "". A PWSZ frequently has different semantics for those.

    2. A BSTR must be allocated and freed with the SysAlloc* family of functions. A PWSZ can be an automatic-storage buffer from the stack or allocated with malloc, new, LocalAlloc or any other memory allocator.

    3. A BSTR is of fixed length. A PWSZ may be of any length, limited only by the amount of valid memory in its buffer.

    4. A BSTR always points to the first valid character in the buffer. A PWSZ may be a pointer to the middle or end of a string buffer.

    5. When allocating an n-byte BSTR you have room for n/2 wide characters. When you allocate n bytes for a PWSZ you can store n / 2 - 1 characters -- you have to leave room for the null.

    6. A BSTR may contain any Unicode data including the zero character. A PWSZ never contains the zero character except as an end-of-string marker. Both a BSTR and a PWSZ always have a zero character after their last valid character, but in a BSTR a valid character may be a zero character.

    7. A BSTR may actually contain an odd number of bytes -- it may be used for moving binary data around. A PWSZ is almost always an even number of bytes and used only for storing Unicode strings.
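
Points 1 and 6 can be illustrated with a minimal sketch. The helper name BstrToWString is hypothetical and not part of any library. It takes a plain const wchar_t* plus a length so that it compiles anywhere. On Windows you would presumably pass the BSTR itself together with SysStringLen(bstr), which returns 0 for a NULL BSTR.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (an assumption for illustration, not a library API):
// copies a BSTR-like (pointer, length) pair into a std::wstring.
// A NULL pointer is treated exactly like an empty string.
std::wstring BstrToWString(const wchar_t* data, std::size_t length)
{
    if (data == nullptr || length == 0) {
        // NULL and "" both map to the empty string.
        return std::wstring();
    }
    // Copy by explicit length rather than scanning for a terminator,
    // so embedded zero characters survive the copy.
    return std::wstring(data, length);
}
```

With a helper like this, Foo(NULL) and Foo(empty) take the same code path on the callee side, and the callee never hands a NULL pointer to wcslen or to a std::wstring constructor.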