Search code examples
c++c++11polymorphismvtable

Undefined behavior in replacement for polymorphism with out Pointers


I've recently asked about this on this question. And I've gone with this generic approach:

#define COFFEE_GEN_GENERIC_VALUE(CLASS_TYPE) using GenericValue##CLASS_TYPE = coffee::utils::GenericValue<CLASS_TYPE, COFFEE_GENERIC_VALUE_CONFIG_MAX_DERIVED_SIZE_OF_##CLASS_TYPE>

template<typename Base, size_t MaxDerivedSize>
class GenericValue
{
private:
    char buffer[MaxDerivedSize];

public:
    GenericValue() = default;

    template<typename Derived, typename... ConstructorArgs>
    static GenericValue<Base, MaxDerivedSize> create(ConstructorArgs... args)
    {
        static_assert(std::is_base_of<Base, Derived>::value, "Derived must derive from Base.");
        static_assert(sizeof(Derived) <= MaxDerivedSize, "The size specified with the macro COFFEE_GenericValueMaxSize is to small.");
        GenericValue<Base, MaxDerivedSize> result{};
        new ((Derived*) result.buffer) Derived{args...};
        return result;
    }

    template<typename Derived>            
    static GenericValue<Base, MaxDerivedSize> from(Derived* toCopy)
    {
        static_assert(std::is_base_of<Base, Derived>::value, "Derived must derive from Base.");
        static_assert(sizeof(Derived) <= sizeof(MaxDerivedSize), "The size specified with the macro COFFEE_GenericValueMaxSize is to small.");
        GenericValue<Base, MaxDerivedSize> result{};
        memcopy(result.buffer, toCopy, sizeof(Derived));
        return result;
    }
    
    GenericValue(const GenericValue<Base, MaxDerivedSize>& other)
    {
        memcopy(buffer, other.buffer, MaxDerivedSize);
    }
    
    GenericValue(GenericValue<Base, MaxDerivedSize>&& other)
    {
        memcopy(buffer, other.buffer, MaxDerivedSize);
    }

    GenericValue<Base, MaxDerivedSize>& operator=(const GenericValue<Base, MaxDerivedSize>& other)
    {
        memcopy(buffer, other.buffer, MaxDerivedSize);
        return *this;
    }
    
    GenericValue<Base, MaxDerivedSize>& operator=(GenericValue<Base, MaxDerivedSize>&& other)
    {
        coffee::utils::copy(buffer, other.buffer, MaxDerivedSize);
        return *this;
    }

    Base& operator*()
    {
        return *(Base*)buffer;
    }

    Base* operator->()
    {
        return (Base*)buffer;
    }
}

I had the idea for this form this website and its kind of the same as the awnser from @Matthias Grün on my old question. You use the COFFEE_GEN_GENERIC_VALUE macro to create a typedef for GenericValue<Base, MaxDerivedSize> the size will be passed to it with a second macro. This is so that my library can generate the appropriate GenericValue types that will be used for the specific base class. The basic idea behind it is that you store the whole instance of the derived type with it's vtable in buffer. You can than copy it just like a regular rvalue.

Unfortunately doe copying only some times works and I have no idea why, based on my knowledge this would not be a problem. If I remove the copy operators and constructor and just use the class as a wrapper for new it works great (except the objects aren't realy copied).

Does any body know what the problem could be?

I will post a specific example when I found an edge case for which I don't have to post my whole project.


Solution

  • memcpy of data that isn't "plain old data" is undefined behavior. When you are doing shenanigans, avoid undefined behavior.

    So the first thing you need to do this is to store a pointer-to-copy operation beside your buffer. Then when you copy your buffer, instead of memcpy, you call that pointer-to-copy operation.

    A simple way to do this is with a void(*)(void const* src, void* dst) function pointer that copies from src to dst.

    Casting memory to a type it isn't and using that pointer is undefined behavior.

    So the second thing you need to do is to store a way to get the pointer-to-base from the memory buffer. A simple way to do this is to store a Base*(*)(void*) function pointer, and pass it a pointer to the buffer.

    For the cost of a bit of indirection, you can reduce the per-instance overhead to a single pointer like this.

    struct BasicVTable {
      void(* destroy)(void*) = 0;
      template<class T>
      static BasicVTable make() {
        return {[](void* ptr){ static_cast<T*>(ptr)->~T(); }};
      }
      template<class T>
      static auto const* get() {
        static const auto vtable = make<T>();
        return &vtable;
      }
    };
    struct RegularVTable : BasicVTable {
      void(*assign)(void const* src, void* dst) = 0;
      void(*assign_move)(void* src, void* dst) = 0;
      void(*copy)(void const* src, void* dst) = 0;
      void(*move)(void* src, void* dst) = 0;
    
      
      template<class T>
      static RegularVTable make() {
        return {
          BasicVTable::template make<T>(),
          [](void const* src, void* dst) {
            *static_cast<T*>(dst) = *static_cast<T const*>(src);
          },
          [](void* src, void* dst) {
            *static_cast<T*>(dst) = std::move(*static_cast<T*>(src));
          },
          [](void const* src, void* dst) {
            ::new(dst) T(*static_cast<T const*>(src));
          },
          [](void* src, void* dst) {
            ::new(dst) T(std::move(*static_cast<T*>(src)));
          }
        };
      }
      template<class T>
      static auto const* get() {
        static const auto vtable = make<T>();
        return &vtable;
      }
    };
    
    template<class Base>
    struct PolyBaseVTable : RegularVTable {
      Base*(*GetBase)(void* src) = 0;
      Base const*(*cGetBase)(void const* src) = 0;
    
      template<class T>
      static PolyBaseVTable make() {
        return {
          RegularVTable::make<T>(),
          [](void* src)->Base* {
            return static_cast<T*>(src);
          },
          [](void const* src)->Base const* {
            return static_cast<T const*>(src);
          }
        };
      }
    
      template<class T>
      static auto const* get() {
        static const auto vtable = make<T>();
        return &vtable;
      }
    };
    

    now we modify your GenericValue:

    template<typename Base, size_t MaxDerivedSize>
    class GenericValue
    {
    private:
      PolyBaseVTable<Base> const* vtable = nullptr;
      char buffer[MaxDerivedSize];
    

    to store an extra pointer.

    We do some slight modifications:

    template<typename Derived, typename... ConstructorArgs>
    static GenericValue<Base, MaxDerivedSize> create(ConstructorArgs... args)
    {
        static_assert(std::is_base_of<Base, Derived>::value, "Derived must derive from Base.");
        static_assert(sizeof(Derived) <= MaxDerivedSize, "The size specified with the macro COFFEE_GenericValueMaxSize is to small.");
        GenericValue<Base, MaxDerivedSize> result{};
        ::new ((void*) result.buffer) Derived{args...}; // void* here
        result.vtable = PolyBaseVTable<Base>::template get<Derived>();
        return result;
    }
    

    where we initialize the vtable after we construct the object.

    Your from is nonsense. It is just another create.

    Copy construct is then:

    GenericValue(const GenericValue<Base, MaxDerivedSize>& other)
    {
      if (!other.vtable) return;
      other.vtable->copy( &other.buffer, &buffer );
      vtable = other.vtable;
    }
    

    with the other assignment/move operations working similarly. Assignment is a bit of a pain:

    GenericValue& operator=(const GenericValue& other)
    {
      if (vtable != other.vtable)
      {
        if (vtable) vtable->destroy(&buffer);
        vtable = nullptr;
        if (other.vtable)
        {
            other.vtable->copy(&other.buffer, &buffer);
            vtable=other.vtable;
            return *this;
        }
      }
      if (!vtable) return *this;
      vtable->assign( &other.buffer, &buffer );
      return *this;
    }
    

    as you have to handle a type-change possibility.

    For base access:

    Base* operator->()
    {
      if (!vtable) return nullptr;
      return vtable->GetBase( &buffer );
    }
    Base const* operator->() const
    {
      if (!vtable) return nullptr;
      return vtable->cGetBase( &buffer );
    }
    

    easy peasy.

    Now, this code did use extended aggregate rules from after , so the static make functions in the vtable will have to be modified to be more verbose if you are restricted to . But there is nothing really tricky there.

    Also, the automatic return type deduction (my use of auto in a few spots) may have to be replaced with actual types. The intended type isn't that hard to work out, I just didn't, because I didn't want to repeat myself there.

    You also need to invoke vtable->destroy on ~GenericValue of course.


    Now, you may note I made the table of operations a vtable. This is because I'm reimplementing a simple version of classic C++ based polymorphism in a C-with-enhanced-C++-code-generation way.

    You can actually use this to implement operations other than cast-to-base, to avoid an extra vtable hit. I'd only consider this in cases where the performance of the extra vtable hit is detected, or when I want to store hedrogenous objects (like how std::function works) and support polymorphism on them.

    Live example of this in operation.

    C++11 version.