
Exactly what are these _B_var_Xxxxx and _B_str_Xxxxx members in the VBA COM libraries?


Imagine the following function call:

foo = UCase("bar")

I'm parsing this code and have determined that UCase is a function call; now I want to resolve that call to the function's declaration in the COM library that defines it.

The idea is to implement a code inspection that flags uses of a Variant-returning built-in function where a String-returning counterpart exists; here, for example, UCase$ could be used instead.

It seems the functions I'm after are declared as _B_var_UCase and _B_str_UCase in the COM library. I'm also picking up a UCase member in the VBA.Strings module, but its return type is VT_VOID; in other words, it's a procedure, not a function.

I could hard-code some logic specific to that group of functions: when my resolver code encounters UCase, it prepends _B_var_ (and for UCase$, _B_str_) to the identifier it's trying to resolve, and if I'm lucky the reference gets assigned to the correct built-in function declaration.

But it's pure guesswork.
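
To make the heuristic concrete, here is a minimal sketch in C. The helper name `hidden_member_name` is hypothetical, and the rule it encodes (a trailing `$` selects the `_B_str_` prefix, anything else selects `_B_var_`) is only the guess described above, not a documented naming scheme.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: builds the hidden-member name to look up for a
 * built-in call such as UCase or UCase$. Returns 0 on success, or -1 if
 * the identifier is empty or the output buffer is too small. */
int hidden_member_name(const char *identifier, char *out, size_t out_size)
{
    size_t len = strlen(identifier);
    const char *prefix = "_B_var_";      /* assumed Variant-returning form */

    if (len > 0 && identifier[len - 1] == '$') {
        prefix = "_B_str_";              /* assumed String-returning form */
        len--;                           /* drop the $ type hint */
    }
    if (len == 0)
        return -1;

    int written = snprintf(out, out_size, "%s%.*s", prefix, (int)len, identifier);
    return (written < 0 || (size_t)written >= out_size) ? -1 : 0;
}
```

With this sketch, `UCase` would be looked up as `_B_var_UCase` and `UCase$` as `_B_str_UCase`.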

I know _var_ stands for Variant and _str_ stands for String, but the part I'm missing is exactly how to relate these hidden functions to, say, that UCase function call in the VBA code I'm parsing.

Is UCase("bar") a call to the VBA.Strings.UCase procedure? Then how does it work as a function? And if not, then how does VBA know to interpret the UCase token as a call to _B_var_UCase? Is there a consistent naming scheme I can rely on, or is there a relationship between UCase and _B_var_UCase that I'm not seeing? And what's the B for anyway?

The web is outrageously silent about the innards of the VBA standard library; I hope someone here might know something about it.


Solution

  • Take a look at the TYPEATTR structure, and in particular, the typekind and tdescAlias members.

    typedef struct tagTYPEATTR {
      GUID     guid;
      LCID     lcid;
      DWORD    dwReserved;
      MEMBERID memidConstructor;
      MEMBERID memidDestructor;
      LPOLESTR lpstrSchema;
      ULONG    cbSizeInstance;
      TYPEKIND typekind;
      WORD     cFuncs;
      WORD     cVars;
      WORD     cImplTypes;
      WORD     cbSizeVft;
      WORD     cbAlignment;
      WORD     wTypeFlags;
      WORD     wMajorVerNum;
      WORD     wMinorVerNum;
      TYPEDESC tdescAlias;
      IDLDESC  idldescType;
    } TYPEATTR, *LPTYPEATTR;
    

    tdescAlias

    If typekind is TKIND_ALIAS, specifies the type for which this type is an alias.

    So, UCase(String) is an alias for _B_var_UCase(String) (where the parameter and the return value are both implicitly Variant).

    And UCase$(String As String) As String is an alias for _B_str_UCase(String As String) As String.

    You can call these underlying functions directly from VB/VBA, but you must wrap the names in square brackets, because identifiers that begin with an underscore, such as _B_str_UCase and _B_var_UCase, aren't valid in VB/VBA.

    Dim s As String
    Dim v As Variant
    
    s = [_B_str_UCase]("abc")
    v = [_B_var_UCase](Null)
    
    's = "ABC"
    'v = Null
    

    The outrageous silence of the web is tempered by the Wayback Machine's cache of an informative and amusing series of articles by Sean Baxter, which is also preserved elsewhere with its formatting intact.

    The TKIND_ALIAS type kind indicates description of a typedef. The name of the typedef is the name of the type (acquired through GetDocumentation), and the alias type is TYPEATTR::tdescAlias.

    As for the B in _B_str_UCase and _B_var_UCase, it's hard to say exactly, but it's worth remembering that, in COM, a string is actually a BSTR:

    A BSTR (Basic string or binary string) is a string data type that is used by COM, Automation, and Interop functions

    This MSDN article goes further and states:

    COM code uses the BSTR to store a Unicode string, short for “Basic String”. (So called because this method of storing strings was developed for OLE Automation, which was at the time motivated by the development of the Visual Basic language engine.)

    and then goes on to say:

    If you write a function which takes an argument of type BSTR then you are required to accept NULL as a valid BSTR and treat it the same as a pointer to a zero-length BSTR. COM uses this convention, as do both Visual Basic and VBScript, so if you want to play well with others you have to obey this convention. If a string variable in VB happens to be an empty string then VB might pass it as NULL or as a zero-length buffer — it is entirely dependent on the internal workings of the VB program.

    So maybe the B is to remind the authors of those functions that the return type should be a BSTR, or maybe the B just means base function. I doubt it is important to anyone other than the original developers.

    EDIT

    OleView gives these details:

    module Strings{ 
        [entry(527), helpcontext(0x000f6eab)]
        BSTR _stdcall _B_str_UCase([in] BSTR String);
    
        [entry(528), helpcontext(0x000f659b)]
        VARIANT _stdcall _B_var_UCase([in] VARIANT* String);
    }
    
    interface _HiddenInterface {
        [restricted, helpcontext(0x000f659b)]
        void _stdcall UCase();
    }
    

    The entry ordinals (527 and 528) can be found among the DLL's exports:

    ordinal hint RVA      name
        ...
        527   C8 0005E8A0 rtcUpperCaseBstr
        528   C9 00070418 rtcUpperCaseVar
        ....
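
    Putting the two listings together, the number in each entry(...) attribute is the ordinal of an exported routine. A minimal sketch of that lookup in C follows; the struct and function names are hypothetical, and the table holds only the two rows quoted above, not the full export list.

    ```c
    #include <stddef.h>

    /* One row of the export listing: ordinal and exported name. */
    struct export_entry {
        unsigned ordinal;
        const char *name;
    };

    /* Only the two rows quoted above; the real export table is much longer. */
    static const struct export_entry exports[] = {
        { 527, "rtcUpperCaseBstr" },   /* entry(527): _B_str_UCase */
        { 528, "rtcUpperCaseVar"  },   /* entry(528): _B_var_UCase */
    };

    /* Hypothetical helper: returns the export name for an entry() ordinal,
     * or NULL if the ordinal is not in the table. */
    const char *export_name_for_entry(unsigned ordinal)
    {
        for (size_t i = 0; i < sizeof exports / sizeof exports[0]; i++) {
            if (exports[i].ordinal == ordinal)
                return exports[i].name;
        }
        return NULL;
    }
    ```

    So _B_str_UCase (entry 527) lands on rtcUpperCaseBstr, and _B_var_UCase (entry 528) lands on rtcUpperCaseVar.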