Search code examples
c++templatesc++14type-erasure

C++ Type-erasure of a function template using lambdas


I'm trying to type erase an object and ran into a bit of an issue that I'm hoping someone here may be have expertise in.

I haven't had a problem type-erasing arbitrary non-templated functions; so far what I have been doing is creating a custom static "virtual table"-esque collection of function pointers. This is all managed with non-capturing lambdas, since they decay into free-function pointers:

template<typename Value, typename Key>
class VTable {
    Value (*)(const void*, const Key&) at_function_ptr = nullptr;
    // ...

    template<typename T>
    static void build_vtable( VTable* table ) {
        // normalizes function into a simple 'Value (*)(const void*, const Key&)'' type
        static const auto at_function = []( const void* p, const Key& key ) {
            return static_cast<const T*>(p)->at(key);
        }
        // ...
        table->at_function_ptr = +at_function;
    }
    // ...
}

(There are more helper functions/aliases that are omitted for brevity)

Sadly this same approach does not work with a function template.

My desire is for the type-erased class to have something akin to the following:

template<typename U>
U convert( const void* ptr )
{
    return cast<U>( static_cast<const T*>( ptr ) );
}

where:

  • cast is a free function,
  • U is the type being casted to,
  • T is the underlying type erased type being casted from, and
  • ptr is the type-erased pointer that follows the same idiom above for the type erasure.

[Edit: The issue above is that T isn't known from the function convert; the only function that knows of T's type in the example is build_vtable. This may just require a design change]

The reason this has become challenging is that there does not appear to be any simple way to type erase both types independently. The classical/idiomatic type-erasure technique of a base-class doesn't work here, since you can't have a virtual template function. I have experimented with a visitor-like pattern with little success for similar reasons to the above.

Does anyone with experience in type-erasure have any suggestions or techniques that can be used to achieve what I'm trying to do? Preferably in standards-conforming c++14 code. Or, perhaps is there design change that might facilitate the same concept desired here?

I've been searching around for this answer for a little while now, and haven't had much luck. There are a few cases that are similar to what I'm trying to do, but often with enough differences that the solutions don't seem to apply to the same problem (Please let me know if I'm wrong!).

It appears most readings/blogs on these topics tend to cover the basic type-erasure technique, but not what I'm looking for here!

Thanks!

Note: please do not recommend Boost. I am in an environment where I am unable to use their libraries, and do not wish to introduce that dependency to the codebase.


Solution

  • Each distinct convert<U> is a distinct type erasure.

    You can type erase a list of such functions, storing the method of doing it in each case. So suppose you have Us..., type erase all of convert<Us>....

    If Us... is short this is easy.

    If it is long, this is a pain.

    It is possible that the majority of these may be null (as in operation is illegal), so you can implement a sparse vtable that takes this into account, so your vtable isn't large and full of zeros. This can be done by type erasing a function (using the standard vtable technique) that returns a reference (or a type-erased accessor) to said sparse vtable that maps from std::typeindex to U-placement-constructor converter (that writes to a void* in the signature). You then run that function, extract the entry, create a buffer to store the U in, call the U-placement-constructor converter passing in that buffer.

    This all occurs in your type_erased_convert<U> function (which itself is not type-erased) so end users don't have to care about the internal details.

    You know, simple.

    The restriction is that the list of possible convert-to types U that are supported needs to be located prior to the location of type erasure. Personally, I would restrict type_erased_convert<U> to only being called on the same list of types U, and accept that this list must be fundamentally short.


    Or you could create some other conversion graph that lets you plug a type into it and determine how to reach another type possibly through some common intermediary.

    Or you could use a scripting or bytecode language that includes a full compiler during the execution phase, permitting the type-erased method to be compiled against a new completely independant type when called.


    std::function< void(void const*, void*) > constructor;
    
    std::function< constructor( std::typeindex ) > ctor_map;
    
    template<class...Us>
    struct type_list {};
    
    using target_types = type_list<int, double, std::string>;
    
    template<class T, class U>
    constructor do_convert( std::false_type ) { return {}; }
    template<class T, class U>
    constructor do_convert( std::true_type ) {
      return []( void const* tin, void* uout ) {
        new(uout) U(cast<U>( static_cast<const T*>( ptr ) ));
      };
    }
    
    template<class T, class...Us>
    ctor_map get_ctor_map(std::type_list<Us...>) {
      std::unordered_map< std::typeindex, constructor > retval;
      using discard = int[];
      (void)discard{0,(void(
        can_convert<U(T)>{}?
          (retval[typeid(U)] = do_convert<T,U>( can_convert<U(T)>{} )),0
        : 0
      ),0)...};
      return [retval]( std::typeindex index ) {
        auto it = retval.find(index);
        if (it == retval.end()) return {};
        return it->second;
      };
    }
    
    template<class T>
    ctor_map get_ctor_map() {
      return get_ctor_map<T>(target_types);
    }
    

    You can replace the unordered_map with a compact stack-based one when it is small. Note that std::function in MSVC is limited to about 64 bytes or so?


    If you don't want a fixed list of source/dest types, we can decouple this.

    • Expose the typeindex of the type stored within the type erasure container, and the ability to get at the void const* that points at it.

    • Create a type trait that maps a type T to the list of types Us... it supports conversion-to. Use the above technique to store these conversion functions in a (global) map. (Note that this map can be placed in static storage, as you can deduce the size of the buffer required etc. But using an static unordered_map is easier).

    • Create a second type trait that maps a type U to a list of types Ts... it supports conversion-from.

    • In both cases, a function convert_construct( T const* src, tag_t<U>, void* dest ) is called to do the actual conversion.

    You'd start with a set of universal targets type_list<int, std::string, whatever>. A particular type would augment it by having a new list.

    For a type T building its sparse conversion table we would attempt each target type. If an overload of convert_construct fails to be found, the map would not be populated for that case. (Generating compile time errors for types added explicitly to work with T is an option).

    On the other end, when we call the type_erased_convert_to<U>( from ), we look for a different table that maps the type U cross typeindex to a U(*)(void const* src) converter. Both the from-T map gotten from the type-erased T and the to-U gotten in the wrapping code are consulted to find a converter.

    Now, this doesn't permit certain kinds of conversion. For example, a type T that converts-from anything with a .data() -> U* and .size() -> size_t method needs to explicitly list every type it converts-from.

    The next step would be to admit a multi-step conversion. A multi-step conversion is where you teach your T to convert-to some (set of) famous types, and we teach U to convert-from a similar (set of) famous types. (The fame of these types is optional, I'll admit; all you need to know is how to create and destroy them, what storage you need, and a way to match up the T-to and U-from options, to use them as an intermediary.)

    This may seem over engineered. But the ability to convert-to std::int64_t and convert-from that to any signed integral type is an example of this (and similarly for uint64_t and unsigned).

    Or the ability to convert-to a dictionary of key-value pairs, and then examine this dictionary on the other side to determine if we can convert-from it.

    As you go down this path, you'll want to examine loose typing systems in various scripting and bytecode languages to pick up how they did it.