
CLR implementation of virtual method calls via pointer to base class


I can't sleep until I understand how C#/the CLR finds the right virtual method to call. In his book, Richter says no more and no less than that the CLR determines the actual type of the object and calls the appropriate method. In C++, for example, each instance of a polymorphic class stores a pointer to a virtual table. In C#, however, instead of a pointer to a virtual table, a SyncBlk index and a TypeHandle are stored alongside the instance data. I don't understand how the TypeHandle differs from that v-table pointer, or what role it plays. For example, in C++ we have:

class A
{
   int a;
public:
   virtual void show() {}
};

class B : public A
{
   int b;
public:
   virtual void show() {}   // overrides A::show()
};

This is how instances of classes A and B look in memory, written in pseudocode:

A:
{
  vtptr;   // pointer to  A vt
  a;
}


B: 
{
  vtptr;  // pointer to B's vt (A's slots, with show() replaced by B::show)
  a;
  b;
}
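
The layout sketched above can be checked roughly at compile time. This is a minimal sketch that assumes a mainstream ABI (MSVC or Itanium) in which every polymorphic object carries one hidden v-table pointer; the C++ standard itself does not promise any particular layout. The name `kMinSizeOfB` is invented for this illustration.

```cpp
#include <cstddef>
#include <type_traits>

class A
{
    int a = 0;
public:
    virtual void show() {}
};

class B : public A
{
    int b = 0;
public:
    void show() override {}
};

// Assumption: one hidden pointer plus the two int fields, before any padding.
constexpr std::size_t kMinSizeOfB = sizeof(void*) + 2 * sizeof(int);

// Both classes have a virtual function, so both are polymorphic types.
static_assert(std::is_polymorphic<A>::value, "A declares a virtual function");
static_assert(std::is_polymorphic<B>::value, "B inherits and overrides it");

// B must have room for the hidden pointer, a and b (padding may add more).
static_assert(sizeof(B) >= kMinSizeOfB, "B holds a v-table pointer, a and b");
```

On a 32-bit build this lower bound is 12 bytes.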

Then, in C++, we run this code:

A* pa = new B(); 
pa->show();  

It is clear that we create a B instance and convert the pointer to type A*, but we don't lose the address of the overriding show(), and thanks to that B::show() gets called. I really need to understand, for a similar example, how C#/the CLR casts to a base type and resolves virtual method calls. Please help! I would be glad to learn all the technical details.
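
The C++ behaviour described above can be observed with a small sketch. It departs from the classes in the question in a few ways, all of them assumptions made for the illustration: show() returns a string so the result can be inspected, the int fields are left out, a virtual destructor is added so the object can be deleted through the base pointer, and the helpers `show_through_base` and `check_dispatch` are hypothetical names.

```cpp
#include <cassert>
#include <memory>
#include <string>

class A
{
public:
    virtual ~A() = default;
    virtual std::string show() const { return "A::show"; }
};

class B : public A
{
public:
    std::string show() const override { return "B::show"; }
};

// Creates a B, looks at it only through a pointer to A, and calls show().
std::string show_through_base()
{
    std::unique_ptr<A> pa(new B());   // the static type is A, the object is a B
    return pa->show();                // resolved at run time from the object itself
}

// Tiny self-check: the override is the one that runs.
void check_dispatch()
{
    assert(show_through_base() == "B::show");
}
```

The conversion to A* changes only the static type the compiler sees; the object still carries whatever it needs to reach B's override.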


Solution

  • "TypeHandle" is a popular misnomer. It is actually the MethodTable pointer, which is how it is named in the CLR source. The MethodTable of a managed object is functionally very similar to a C++ v-table: a table of pointers to the methods of the class. A small difference is that it doesn't contain just the virtual methods; the table is also used to just-in-time compile a method.

    So it works the exact same way: a simple indirect call through an entry in that table. The jitter knows the offset of the method pointer in the table, just like the C++ compiler does. This runs exactly as fast as native C++ code.

    Your snippet written in C# and used like this:

        A obj = new B();
        obj.show();
    

    Produces this 32-bit machine code at runtime:

    00000003  mov         ecx,51B034Ch            ; typeref for class B           
    00000008  call        FBB90BF4                ; call operator new
    
    0000000d  mov         ecx,eax                 ; setup this pointer
    0000000f  mov         eax,dword ptr [ecx]     ; obtain methodtable pointer from object
    00000011  call        dword ptr [eax+38h]     ; indirect call to show()
    

    As you can tell from the offset (0x38), the method table contains more than just the method pointers. You can find details about it in the SSCLI20 source code.

    The snippet in native C++ and used like this:

    int main()
    {
        A* obj = new B;
        obj->show();
        return 0;
    }
    

    Produces this 32-bit machine code:

    01361010  push        0Ch                     ; size of B object 
    01361012  call        operator new (136104Ch) ; call operator new
    01361017  add         esp,4                   ; __cdecl stack cleanup
    0136101A  test        eax,eax                 ; handle null pointer
    0136101C  je          main+1Fh (136102Fh) 
    0136101E  mov         dword ptr [eax],offset B::`vftable' (1362100h)  ; initialize v-table 
    
    01361024  mov         edx,dword ptr [eax]     ; obtain v-table pointer from object
    01361026  mov         ecx,eax                 ; setup this pointer
    01361028  mov         eax,dword ptr [edx]     ; get method pointer
    0136102A  call        eax                     ; indirect call to show()
    

    The constructor call is a bit more elaborate due to C++ language rules. The method call is the same, just different code-gen. Slightly more efficient on early generation Pentium processors.
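
The indirect call in both listings can be imitated by hand in C++. This is only a sketch of the idea, not the CLR's real data structures: `MethodTable`, `Object`, `ObjectB`, `kShowSlot` and all function names are invented for the illustration, and the actual MethodTable holds far more than a name and a slot array.

```cpp
#include <cassert>
#include <string>

struct Object;  // forward declaration for the slot signature

// Every "virtual method" takes the object as an explicit this pointer.
using Method = std::string (*)(const Object*);

struct MethodTable
{
    const char* type_name;  // stands in for the type data stored before the slots
    Method slots[1];        // slot 0 holds show()
};

struct Object
{
    const MethodTable* mt;  // first field, like the pointer read by "mov eax,[ecx]"
    int a;
};

struct ObjectB
{
    Object base;            // A's part comes first, so the table pointer stays at offset 0
    int b;
};

std::string a_show(const Object*) { return "A.show"; }
std::string b_show(const Object*) { return "B.show"; }

const MethodTable kTableA = { "A", { a_show } };
const MethodTable kTableB = { "B", { b_show } };  // same slot, different target

constexpr int kShowSlot = 0;  // fixed offset known at the call site

// The call site: load the table pointer, index the slot, call indirectly.
std::string call_show(const Object* obj)
{
    return obj->mt->slots[kShowSlot](obj);
}

// Builds a "B", views it as its base part, and calls show() through the table.
std::string call_b_through_base_view()
{
    ObjectB b = { { &kTableB, 1 }, 2 };
    const Object* as_base = &b.base;  // the "cast": same address, same table pointer
    return call_show(as_base);
}

// Tiny self-check of both tables.
void check_tables()
{
    Object a = { &kTableA, 1 };
    assert(call_show(&a) == "A.show");
    assert(call_b_through_base_view() == "B.show");
}
```

The call site never asks what type the object is. It only follows the pointer stored in the object, which is why viewing a B through a base-typed reference still reaches B's method.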