dlopen, factory pattern and the virtual method table


I'm trying to wrap my head around how the factory pattern works internally when using dlopen in C++. Sorry for the long post.

tl;dr: the question is under "Here is the question" below.

Snippets from http://www.tldp.org/HOWTO/C++-dlopen/thesolution.html with error checking removed to save space:

main.cpp

#include "polygon.hpp"
#include <iostream>
#include <dlfcn.h>

int main()
{
    using std::cout;

    // load the triangle library
    void* triangle = dlopen("./triangle.so", RTLD_LAZY);

    // load the symbols
    // create function pointers
    // (Exposed with extern "C")
    create_t* create_triangle = (create_t*) dlsym(triangle, "create");
    destroy_t* destroy_triangle = (destroy_t*) dlsym(triangle, "destroy");

    // create an instance of the class
    polygon* poly = create_triangle();

    // use the class
    poly->set_side_length(7);
    cout << "The area is: " << poly->area() << '\n';

    destroy_triangle(poly); // destroy the class

    dlclose(triangle);     // unload the triangle library

}

polygon.hpp

#ifndef POLYGON_HPP
#define POLYGON_HPP

class polygon
{
protected:
    double side_length_;

public:
    polygon()
        : side_length_(0) {}

    virtual ~polygon() {}

    void set_side_length(double side_length)
    {
        side_length_ = side_length;
    }

    virtual double area() const = 0;
};

// the types of the class factories
typedef polygon* create_t();
typedef void destroy_t(polygon*);

#endif

triangle.hpp

#include "polygon.hpp"
#include <cmath>

class triangle : public polygon
{
public:
    virtual double area() const
    {
        return side_length_ * side_length_ * sqrt(3) / 4; // equilateral triangle: s^2 * sqrt(3) / 4
    }
};


// the class factories

extern "C" polygon* create()
{
    return new triangle;
}

extern "C" void destroy(polygon* p)
{
    delete p;
}

So looking at the main() function what I see is this.

  1. dlopen loads the library and returns a handle to it.
  2. dlsym looks up the "create" and "destroy" symbols, and we store the results as function pointers so that we can create a new triangle object and destroy it. (dlsym is giving us a memory location to branch to.)
  3. create_triangle() returns a pointer to a triangle, typed as a polygon* (since we only know the methods of polygon).
  4. We set the internal member side_length_ using the base class's set_side_length method.
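
The steps above can be sketched with the error checks put back and the function pointers typed through create_t and destroy_t. This is only a minimal sketch, not the HOWTO's code: polygon_plugin and load_polygon_plugin are hypothetical names, polygon is merely forward-declared here, and a POSIX system is assumed (older glibc versions also need -ldl when linking).

```cpp
#include <cassert>
#include <dlfcn.h>
#include <string>

// Forward declaration only; the real definition lives in polygon.hpp.
class polygon;
typedef polygon* create_t();
typedef void destroy_t(polygon*);

// Hypothetical helper type: everything main() needs from one plugin.
struct polygon_plugin {
    void*      handle  = nullptr;
    create_t*  create  = nullptr;
    destroy_t* destroy = nullptr;
};

// Hypothetical helper: returns true on success, otherwise fills `error`.
bool load_polygon_plugin(const char* path, polygon_plugin& out, std::string& error)
{
    assert(path != nullptr);

    // Step 1: load the library and get a handle.
    out.handle = dlopen(path, RTLD_LAZY);
    if (!out.handle) {
        const char* msg = dlerror();
        error = msg ? msg : "dlopen failed";
        return false;
    }

    // Step 2: look up the two extern "C" factory functions by name.
    dlerror(); // clear any stale error message
    out.create  = reinterpret_cast<create_t*>(dlsym(out.handle, "create"));
    out.destroy = reinterpret_cast<destroy_t*>(dlsym(out.handle, "destroy"));
    if (!out.create || !out.destroy) {
        const char* msg = dlerror();
        error = msg ? msg : "missing create/destroy symbol";
        dlclose(out.handle);
        out = polygon_plugin{};
        return false;
    }
    return true;
}
```

With a plugin loaded this way, steps 3 and 4 are just plugin.create() followed by ordinary calls through the returned polygon*, and plugin.destroy(poly) before dlclose(plugin.handle).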

Here is the question:

When poly->area() is called how is this found in the triangle object?

  • We know where in memory the base class has its "virtual area()" method.
  • Since "triangle.so" is loaded dynamically, the compiler can't say "I recognize that triangle's area() overrides polygon's area()" when it builds the program.
  • The names are completely mangled at this point, and the .so could have been compiled with clang++ while the program was compiled with g++, so there is potentially no hope of matching them up by name.

Does the virtual method table keep the virtual methods in the order in which they are declared? If so, would this code break it?

polygon.hpp

...
virtual double area() const = 0;
virtual double parameter() const = 0;
...

triangle.hpp

...
double parameter() const { ... } // implementing and defining parameter first
double area() const { ... } // implementing and defining area second
...

I know you would want to keep them in order... but let's say we subclass a couple more times and they get defined in a different order...

Any help on this would be great. I just can't visualize what is going on in the memory here to make this actually work.

Thanks! And sorry for the long post.


Solution

  • When poly->area() is called how is this found in the triangle object?

    All initialization of poly happens inside the library (including setting the vptr). The only thing the caller (i.e. the executable) has to do is load the virtual method pointer from poly's vtable at a particular index and call through it. Both the executable and the shared library are compiled against the same declaration of polygon, so they agree on which vtable element corresponds to area.

    Note that the compiler does not need to know anything about how the implementation behind poly overrides the base class.
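
    A hand-written model may help visualize this lookup. The sketch below is not what any compiler literally emits; it only mimics the usual layout, with a hidden vptr stored in the object and pointing at a per-class table of function pointers. The names polygon_vtable, polygon_obj, make_triangle and call_area are hypothetical. The "library" part fills in the vptr; the "executable" part knows only the table layout from the shared declarations.

```cpp
#include <cassert>
#include <cmath>

// --- What both sides agree on (the role of polygon.hpp) ---
struct polygon_obj;

struct polygon_vtable {
    void   (*destroy)(polygon_obj*);      // slot 0
    double (*area)(const polygon_obj*);   // slot 1
};

struct polygon_obj {
    const polygon_vtable* vptr;   // set when the object is created
    double side_length;
};

// --- "Library" side (the role of triangle.so) ---
static double triangle_area(const polygon_obj* self)
{
    // Equilateral triangle: s^2 * sqrt(3) / 4
    return self->side_length * self->side_length * std::sqrt(3.0) / 4.0;
}

static void triangle_destroy(polygon_obj* self) { delete self; }

static const polygon_vtable triangle_vtable = { triangle_destroy, triangle_area };

polygon_obj* make_triangle()
{
    // The creator, not the caller, decides which table the object points at.
    return new polygon_obj{ &triangle_vtable, 0.0 };
}

// --- "Executable" side ---
double call_area(const polygon_obj* p)
{
    assert(p != nullptr && p->vptr != nullptr);
    // Roughly what poly->area() does: load the agreed slot, call through it.
    return p->vptr->area(p);
}
```

    The caller never names triangle_area. It only needs the slot position, which comes from the shared declaration, so name mangling of the override plays no part in the call.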

    the .so could have been compiled with clang++ and the program could have been compiled with g++

    Clang tries very hard to be ABI-compatible with GCC on Linux platforms (and with Visual Studio on Windows platforms) precisely to make this kind of interoperability possible. On Linux both compilers follow the Itanium C++ ABI, which specifies the vtable layout.

    I know you would want to keep them in order... but let's say we subclass a couple more times and they get defined in a different order...

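    Subclassing doesn't change the structure of the base class vtable, which is fixed at this point (otherwise polymorphism would break as well). The slot order comes from the order of the virtual declarations in the base class, not from the order in which a derived class defines its overrides. With the common ABIs, a derived class reuses the base class slots for its overrides and appends any new virtual methods after them.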
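
    A small experiment can confirm this. The sketch below assumes a base class shaped like the one in the question, with the second method spelled perimeter() here (parameter() in the question), plus a hypothetical second-level subclass scaled_triangle. The overrides are deliberately defined in a different order from the base class declarations, and calls through the base interface still reach the right functions.

```cpp
#include <cassert>
#include <cmath>

class polygon {
protected:
    double side_length_;
public:
    polygon() : side_length_(0) {}
    virtual ~polygon() {}
    void set_side_length(double s) { side_length_ = s; }

    // The declaration order here decides the slot order.
    virtual double area() const = 0;
    virtual double perimeter() const = 0;
};

// Overrides defined in the opposite order from the base declarations.
class triangle : public polygon {
public:
    double perimeter() const override { return 3 * side_length_; }
    double area() const override
    {
        return side_length_ * side_length_ * std::sqrt(3.0) / 4.0;
    }
};

// Hypothetical further subclass: adds a new virtual, overrides only area().
class scaled_triangle : public triangle {
public:
    virtual double scale() const { return 2.0; }
    double area() const override { return scale() * scale() * triangle::area(); }
};

// The caller sees only the base interface, as main.cpp does.
double area_of(const polygon& p)      { return p.area(); }
double perimeter_of(const polygon& p) { return p.perimeter(); }

// Small self-check: dispatch goes through a polygon reference.
void check_dispatch()
{
    triangle t;
    t.set_side_length(2.0);
    assert(perimeter_of(t) == 6.0);
}
```

    What would break things is changing polygon.hpp itself (reordering, adding or removing virtual methods) and rebuilding only one side, because the executable and the library would then disagree about the slots.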