c++ c++17 static-libraries virtual-functions vtable

How to debug missing functions in vtable of base class


I am using objects with the following structure (some bits simplified):

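  #include <vector>

  class RafkoAgent {
  public:
    virtual ~RafkoAgent() = default;
    // Implemented by derived classes.
    virtual std::vector<double> solve(const std::vector<double>& in) = 0;
    // Convenience overload that forwards to the virtual one.
    void solve(const std::vector<double>& in, std::vector<double>& out) {
      out = solve(in);
    }
  };

  class SolutionSolver : public RafkoAgent {
  public:
    using RafkoAgent::solve;  // keep the two-argument overload visible
    std::vector<double> solve(const std::vector<double>& in) override { /* ... */ }
  };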

With this structure, a segfault occurs in some cases.

As far as I understand, this structure should be adequate, but in the example I tried it in (here) it segfaults, and I can't get to the bottom of it.

When I inspect the vtable of the same object in gdb, once as a SolutionSolver and once as a RafkoAgent, I see the following difference:

150   solver->solve({1.0, 2.0});
(gdb) info vtbl solver
This object does not have a virtual function table
(gdb) info vtbl *solver
vtable for 'rafko_net::SolutionSolver' @ 0x555555692bf0 (subobject @ 0x5555556ad710):
[0]: 0x55555559ff42 <rafko_gym::RafkoAgent::get_step_sources[abi:cxx11]() const>
[1]: 0x5555555a016c <rafko_gym::RafkoAgent::get_step_names[abi:cxx11]() const>
[2]: 0x5555555a030a <rafko_gym::RafkoAgent::get_input_shapes() const>
[3]: 0x5555555a0502 <rafko_gym::RafkoAgent::get_output_shapes() const>
[4]: 0x5555555a074a <rafko_gym::RafkoAgent::get_solution_space() const>
[5]: 0x5555555a4c8e <rafko_net::SolutionSolver::~SolutionSolver()>
[6]: 0x5555555a4cdc <rafko_net::SolutionSolver::~SolutionSolver()>
[7]: 0x5555555a0a80 <rafko_net::SolutionSolver::set_eval_mode(bool)>
[8]: 0x55555559e42a <rafko_net::SolutionSolver::solve(std::vector<double, std::allocator<double> > const&, rafko_utilities::DataRingbuffer<std::vector<double, std::allocator<double> > >&, std::vector<std::reference_wrapper<std::vector<double, std::allocator<double> > >, std::allocator<std::reference_wrapper<std::vector<double, std::allocator<double> > > > > const&, unsigned int, unsigned int) const>
(gdb) info vtbl *(rafko_gym::RafkoAgent*)solver
vtable for 'rafko_gym::RafkoAgent' @ 0x555555692bf0 (subobject @ 0x5555556ad710):
[0]: 0x55555559ff42 <rafko_gym::RafkoAgent::get_step_sources[abi:cxx11]() const>
[1]: 0x5555555a016c <rafko_gym::RafkoAgent::get_step_names[abi:cxx11]() const>

This suggests to me that something is not quite right with the vtable for RafkoAgent. The actual segfault happens when the code tries to call the virtual function: instead of solve, gdb seems to step into get_step_names, which is the last entry in RafkoAgent's displayed vtable.

An additional detail is that the whole project is a static library plus tests. When I run the same code inside the CMake project of the library (i.e. where RafkoAgent and SolutionSolver are defined), the segfault does not occur. It does occur, however, in the linked example file, where I link the classes from the generated static library file (librafko.a).

The focus of the question is:

Is it by design that a base class's own vtable does not list the virtual functions it defines, and only the derived class's vtable does, even if the derived class does not introduce additional virtual functions?

If what I'm seeing is a fault, how can I debug its root cause?


Solution

  • The root cause of the problem was that compile-time macros were used in the exported header files.

    The main hint was that everything worked flawlessly inside the repository itself, but failed when using the exported library.

    The faulty structure was as follows:

    class A {};
    
    class B 
    #if(compile_time_macro)
    : public A
    #endif
    {};
    

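    So essentially the library object was built with the inheritance included, but the exported headers did not carry the macro definition, so code that included them saw the class without the inheritance. The library and its client therefore disagreed about the layout and vtable of the same class.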

    This made the problem very hard to trace.

    Whenever you use macros in exported headers, beware.
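
To make the mismatch concrete, here is a minimal single-file sketch of what the two builds effectively see. The namespaces `library_view` and `consumer_view`, the member `payload` and the function `id()` are hypothetical stand-ins invented for illustration. In the real project both views share one class name, which is exactly why the compiler and linker cannot notice the disagreement.

```cpp
#include <cstddef>
#include <type_traits>

// Hypothetical stand-ins for ONE header seen under two macro settings.

namespace library_view {  // what the static library was compiled against
struct A {
  virtual ~A() = default;
  virtual int id() const { return 1; }
};
struct B : public A {  // macro defined: inheritance present
  int payload = 0;
};
}  // namespace library_view

namespace consumer_view {  // what client code sees in the exported header
struct A {
  virtual ~A() = default;
  virtual int id() const { return 1; }
};
struct B {  // macro undefined: inheritance silently dropped
  int payload = 0;
};
}  // namespace consumer_view

// Compile-time facts that each side could report and compare.
constexpr bool library_b_is_an_a =
    std::is_base_of<library_view::A, library_view::B>::value;
constexpr bool consumer_b_is_an_a =
    std::is_base_of<consumer_view::A, consumer_view::B>::value;
constexpr std::size_t library_b_size = sizeof(library_view::B);
constexpr std::size_t consumer_b_size = sizeof(consumer_view::B);

static_assert(library_b_is_an_a, "library build: B derives from A");
static_assert(!consumer_b_is_an_a, "client build: B does not derive from A");
static_assert(library_b_size != consumer_b_size, "object layouts disagree");
```

The checks run at compile time, so the snippet only needs to compile. In a real project a similar comparison is possible, assuming you can add a small function to the library that returns its own `sizeof(B)`: if the value the client computes from the exported header differs, the two sides were compiled against different class definitions.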