Confusion on assembly output of virtual table in Visual C++ 2015


I'm confused by the assembly output of Visual C++ 2015 (x86).

I wanted to see how VC lays out the virtual table, so I wrote the following simple class with a single virtual function.

#include <stdio.h>
#include <stdint.h>   // uintptr_t

struct Foo
{
    virtual int GetValue()
    {
        uintptr_t vtbl = *(uintptr_t *)this;
        uintptr_t slot0 = ((uintptr_t *)vtbl)[0];
        uintptr_t slot1 = ((uintptr_t *)vtbl)[1];

        printf("vtbl = 0x%08X\n", vtbl);
        printf("  [0] = 0x%08X\n", slot0);
        printf("  [1] = 0x%08X\n", slot1);

        return 0xA11BABA;
    }
};

extern "C" void Check();

int main()
{
    Foo *pFoo = new Foo;
    int x = pFoo->GetValue();
    printf("x = 0x%08X\n", x);
    printf("\n");
    Check();
}

To check the layout, I implemented a function in assembly. The magic name is the mangled form of Foo::GetValue, taken from vtab.asm, the assembly listing generated for vtab.cpp.

.model flat

extern _printf : proc
extern ?GetValue@Foo@@UAEHXZ : proc

.const
FUNC_ADDR db "Address of Foo::GetValue = 0x%08X", 10, 0

.code
_Check proc
    push ebp
    mov ebp, esp

    push offset ?GetValue@Foo@@UAEHXZ
    push offset FUNC_ADDR
    call _printf
    add esp, 8

    pop ebp
    ret
_Check endp
end

Then I compiled and ran it.

ml /c check.asm
cl /Fa vtab.cpp check.obj
vtab

This produced the following output on my computer.

vtbl = 0x00FF2174
  [0] = 0x00FE1300
  [1] = 0x6C627476
x = 0x0A11BABA

Address of Foo::GetValue = 0x00FE1300

This clearly shows that the virtual function GetValue is at offset 0 of the virtual table. But the assembly output for vtab.cpp seems to imply that GetValue is at offset 4 (see the comments below that start with three semicolons).

;   COMDAT ??_7Foo@@6B@
CONST   SEGMENT
??_7Foo@@6B@ DD FLAT:??_R4Foo@@6B@          ; Foo::`vftable'
    DD  FLAT:?GetValue@Foo@@UAEHXZ         ;;; GetValue at offset 4
CONST   ENDS

; Function compile flags: /Odtp
;   COMDAT ??0Foo@@QAE@XZ
_TEXT   SEGMENT
_this$ = -4                     ; size = 4
??0Foo@@QAE@XZ PROC                 ; Foo::Foo, COMDAT
; _this$ = ecx
    push    ebp
    mov ebp, esp
    push    ecx
    mov DWORD PTR _this$[ebp], ecx
    mov eax, DWORD PTR _this$[ebp]
    mov DWORD PTR [eax], OFFSET ??_7Foo@@6B@    ;;; Init ptr to virtual table
    mov eax, DWORD PTR _this$[ebp]
    mov esp, ebp
    pop ebp
    ret 0
??0Foo@@QAE@XZ ENDP                 ; Foo::Foo

Thanks for your answers!

Update

@Hans Passant This seems to be a bug. I assembled the listing vtab.asm with ml /c (after deleting a few symbols) and linked it with check.obj to produce vtab2.exe, but vtab2.exe does not run correctly. I then changed the following code

;   COMDAT ??_7Foo@@6B@
CONST   SEGMENT
??_7Foo@@6B@ DD FLAT:??_R4Foo@@6B@          ; Foo::`vftable'
    DD  FLAT:?GetValue@Foo@@UAEHXZ
CONST   ENDS

to

;   COMDAT ??_7Foo@@6B@
CONST   SEGMENT
__NOT_USED_ DD  FLAT:??_R4Foo@@6B@          ; Foo::`vftable'
??_7Foo@@6B@    DD  FLAT:?GetValue@Foo@@UAEHXZ
CONST   ENDS

then assembled and linked again to get vtab3.exe. Now vtab3.exe runs correctly and produces output similar to that of vtab.exe.


Solution

  • I don't think Microsoft would consider this a bug. Yes, the assembly output should put the vtable symbol on the second entry of the table, so that the RTTI entry appears at offset -4 relative to the symbol. However, the table should also be in a COMDAT section, and the assembly output only has a comment (; COMDAT) to indicate this. That's because, while the PE/COFF object file format supports COMDAT sections, the assembler (MASM, invoked as ml) doesn't. There's no way for the compiler to generate an assembly file that actually corresponds to the contents of the object file it creates.

    Or to put it another way, the assembly output isn't meant to be assembled. It's just meant to be informative. Even with your fix applied, the assembly output doesn't generate the same object file the compiler does. If you did this in a more realistic project where Foo was used in more than one object file, you'd get multiple definition errors when linking. If you want to see the real output of the compiler, you need to look at the object file.

    For example, if you run dumpbin /all vtab.obj and go through its output, you'll see something like:

    SECTION HEADER #C
      .rdata name
    ...
    40301040 flags
             Initialized Data
             COMDAT; sym= "const Foo::`vftable'" (??_7Foo@@6B@)
             4 byte align
             Read Only
    
    RAW DATA #C
      00000000: 00 00 00 00 00 00 00 00                          ........
    
    RELOCATIONS #C
                                                    Symbol    Symbol
     Offset    Type              Applied To         Index     Name
     --------  ----------------  -----------------  --------  ------
     00000000  DIR32                      00000000        34  ??_R4Foo@@6B@ (const Foo::`RTTI Complete Object Locator')
     00000004  DIR32                      00000000        1F  ?GetValue@Foo@@UAEHXZ (public: virtual int __thiscall Foo::GetValue(void))
    
    ...
    
    COFF SYMBOL TABLE
    ...
    026 00000000 SECTC  notype       Static       | .rdata
        Section length    8, #relocs    2, #linenums    0, checksum        0, selection    6 (pick largest)
    028 00000004 SECTC  notype       External     | ??_7Foo@@6B@ (const Foo::`vftable')
    

    It's not easy to read, but all the information about the actual layout of the vtable is there. The symbol for the vtable, ??_7Foo@@6B@ (const Foo::`vftable'), is at offset 00000004 of SECTC, that is, section number 0xC. Section #C is 8 bytes long and has two relocations, one for the RTTI locator applied at offset 00000000 and one for Foo::GetValue applied at offset 00000004. So in the object file the vtable symbol does in fact point to the entry containing the pointer to the first virtual method, and the RTTI locator sits immediately before it.
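The layout just described can be modelled in portable C++. This is only an illustrative sketch, not what the compiler emits: RttiLocator, kFooLocator, GetValueStub, SectionStart and VftableSymbol are hypothetical names standing in for the locator, Foo::GetValue, the start of section #C and the ??_7Foo@@6B@ symbol. Each entry is pointer-sized, so the symbol lands 4 bytes into the section on a 32-bit build (it would be 8 on a 64-bit build).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// One pointer-sized table entry.
using Slot = std::uintptr_t;

// Hypothetical stand-in for the RTTI Complete Object Locator (??_R4Foo@@6B@).
struct RttiLocator { int signature; };
const RttiLocator kFooLocator = { 0 };

// Hypothetical stand-in for Foo::GetValue.
int GetValueStub() { return 0xA11BABA; }

// Models the whole section: the locator entry first, then virtual slot 0.
const Slot* SectionStart() {
    static const Slot section[2] = {
        reinterpret_cast<Slot>(&kFooLocator),   // relocation at section offset 0
        reinterpret_cast<Slot>(&GetValueStub),  // relocation at the next entry
    };
    return section;
}

// Models the vftable symbol: it names the SECOND entry, not the first.
const Slot* VftableSymbol() { return SectionStart() + 1; }

// Byte offset of the modelled symbol from the start of the modelled section.
std::ptrdiff_t SymbolByteOffset() {
    const Slot* start = SectionStart();
    const Slot* symbol = VftableSymbol();
    assert(symbol > start);
    return (symbol - start) * static_cast<std::ptrdiff_t>(sizeof(Slot));
}
```

Indexing the modelled symbol with [0] gives the function entry, which matches what the question's program printed for slot 0, while [-1] gives the locator entry that the listing appears to put first.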

    Open Watcom has a utility that can show you the contents of an object file in a more assembly-like fashion, though notably not in the syntax that MASM uses. Running wdis t279.obj shows:

                    .new_section .rdata, "dr2"
    0000    00 00 00 00                                     .long   ??_R4Foo@@6B@
    0004                          ??_7Foo@@6B@:
    0004    00 00 00 00                                     .long   ?GetValue@Foo@@UAEHXZ
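
To make the earlier COMDAT point more concrete, here is a toy C++ model of why the COMDAT flag matters once the same vtable is emitted by more than one object file. It is a deliberately simplified sketch under stated assumptions: Contribution and ToyLink are hypothetical names, only the "pick largest" selection seen in the dumpbin output is modelled, and a real linker does far more than this.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// One definition of a symbol coming from one object file (toy model).
struct Contribution {
    std::string symbol;  // e.g. the mangled vftable name
    std::size_t size;    // section length in bytes
    bool comdat;         // true if the section carries the COMDAT flag
};

// Returns the size kept for each symbol.
// Duplicate non-COMDAT definitions are treated as an error; duplicate COMDAT
// definitions are folded into one, keeping the largest.
std::map<std::string, std::size_t> ToyLink(const std::vector<Contribution>& inputs) {
    std::map<std::string, Contribution> kept;
    for (const Contribution& c : inputs) {
        assert(c.size > 0);  // an empty contribution is meaningless in this model
        auto it = kept.find(c.symbol);
        if (it == kept.end()) {
            kept.emplace(c.symbol, c);
            continue;
        }
        if (!c.comdat || !it->second.comdat) {
            throw std::runtime_error("multiple definition of " + c.symbol);
        }
        if (c.size > it->second.size) {
            it->second = c;  // "pick largest"
        }
    }
    std::map<std::string, std::size_t> result;
    for (const auto& entry : kept) {
        result[entry.first] = entry.second.size;
    }
    return result;
}
```

In this model, two object files that both contribute an 8-byte COMDAT vtable link to a single copy, whereas the same two contributions without the flag, which is all that assembling the listing can produce, are rejected as a duplicate definition.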