I was trying to learn some more about the inner workings of vtables and vpointers, so I decided to try to access the vtable directly using some tricks. I created two classes, Base and Derv, each having two virtual functions (Derv overriding those of Base).
#include <iostream>
using namespace std;

class Base
{
    int x;
    int y;
public:
    Base(int x_, int y_) : x(x_), y(y_) {}
    virtual void foo() { cout << "Base::foo(): x = " << x << '\n'; }
    virtual void bar() { cout << "Base::bar(): y = " << y << '\n'; }
};

class Derv: public Base
{
    int x; // separate members that shadow Base::x and Base::y
    int y;
public:
    Derv(int x_, int y_) : Base(x_, y_), x(x_), y(y_) {}
    virtual void foo() { cout << "Derived::foo(): x = " << x << '\n'; }
    virtual void bar() { cout << "Derived::bar(): y = " << y << '\n'; }
};
Now, the compiler adds a vtable pointer to each object of these classes, occupying the first 4 bytes (32 bits) of the object in a 32-bit build. I accessed this pointer by casting the address of an object to a size_t*, since the vtable pointer points to a table of function pointers, each of size sizeof(size_t). The virtual functions can now be accessed by indexing the vpointer and casting the result to a function pointer of the appropriate type. I encapsulated these steps in a function:
template <typename T>
void call(T *ptr, size_t num)
{
    typedef void (*FunPtr)();
    size_t *vptr = *reinterpret_cast<size_t**>(ptr);
    FunPtr fun = reinterpret_cast<FunPtr>(vptr[num]);
    //setThisPtr(ptr); added later, see below!
    fun();
}
When one of the member functions is called this way, e.g. call(new Base(1, 2), 0) to call Base::foo(), it is hard to predict what will happen, since it is called without a this-pointer. I solved this by adding a small function template, knowing that g++ stores the this-pointer in the ecx register (this, however, forces me to compile with the -m32 compiler flag):
template <typename T>
void setThisPtr(T *ptr)
{
    asm ( "mov %0, %%ecx;" :: "r" (ptr) );
}
Uncommenting the setThisPtr(ptr) line in the snippet above now makes it a working program:
int main()
{
    Base* base = new Base(1, 2);
    Base* derv = new Derv(3, 4);
    call(base, 0); // "Base::foo(): x = 1"
    call(base, 1); // "Base::bar(): y = 2"
    call(derv, 0); // "Derived::foo(): x = 3"
    call(derv, 1); // "Derived::bar(): y = 4"
}
I decided to share this because writing this little program gave me more insight into how vtables work, and it might help others understand this material a little better.
However I still have some questions:
1. Which register is used (gcc 4.x) to store the this-pointer when compiling a 64-bit binary? I tried all 64-bit registers as documented here: http://developers.sun.com/solaris/articles/asmregs.html
2. When/how is the this-pointer set? I suspect that the compiler sets the this-pointer on each function call through an object, in a way similar to what I just did. Is this how polymorphism actually works (by setting the this-pointer first, then calling the virtual function from the vtable)?
On Linux x86_64, and I believe other UNIX-like OSes, function calls follow the System V ABI (AMD64), which itself follows the IA-64 (Itanium) C++ ABI for C++. Depending on the method's return type, the this pointer is passed implicitly as either the first or the second argument (when the return type has a non-trivial copy constructor or destructor, the return value must live as a temporary on the stack, and the first argument is implicitly a pointer to that space). In the common case, then, this arrives in %rdi. Otherwise, virtual method calls are identical to function calls in C (integer/pointer arguments in %rdi, %rsi, %rdx, %rcx, %r8, %r9, overflowing to the stack; integer/pointer return in %rax; floating-point arguments in %xmm0-%xmm7; etc.). Virtual method dispatch works by looking up a pointer in the vtable and then calling it just like a non-virtual method.
I'm less familiar with Windows x64 conventions, but I believe it to be similar in that C++ method calls follow the exact same structure as C function calls (which use different registers than on Linux), just with an implicit this argument first.