Tags: c++, c++11, lambda, delegates, std-function

FastDelegate and lambdas - can't get them to work (Don Clugston's fastest possible delegates)


I'm trying to create a C++11 implementation of Don Clugston's Member Function Pointers and the Fastest Possible C++ Delegates, and make it work as a drop-in std::function replacement.

This is what I got so far.

I construct lambda FastDelegates like this:

// FastFunc is my name for FastDelegate
template<typename LambdaType> FastFunc(LambdaType lambdaExpression)
{
    this->m_Closure.bindmemfunc(&lambdaExpression, &LambdaType::operator());
}

Now, some tests:

FastFunc<void()> test = []{ std::cout << "hello" << std::endl; };
test();
// Correctly prints "hello"

bool b{false};
FastFunc<void()> test2 = [&b]{ std::cout << b << std::endl; };
test2();
// Crash!

As you can see, when the lambda is "trivial" (no captures), copying it by value and taking its address works. But when the lambda stores some kind of state (captures), I cannot just copy it by value into the FastFunc.

I tried getting the lambda by reference, but I cannot do that when it's a temporary like in the example.

I have to somehow store the lambda inside the FastFunc, but I don't want to use std::shared_ptr because it's slow (I tried a different fastdelegate implementation that used it, and its performance was comparable to std::function).

How can I make my implementation of Don Clugston's fastest possible C++ delegates work with lambdas that capture state, preserving the amazing performance of fastdelegates?


Solution

  • You have diagnosed the situation well: you need to store the state.

    Since the lambda is a temporary object, you are allowed to move from it, which should be preferred to a copy where possible (a move also works for move-only callables, and for most types it is no more expensive than a copy).
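
    To illustrate why moving is the more general choice, here is a minimal sketch with a callable that cannot be copied at all. The names OwningCallable, Holder and demo_move_only are made up for this illustration and are not part of FastDelegate.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>
#include <utility>

// Hypothetical move-only callable: it owns its state through std::unique_ptr,
// so it can be moved but not copied.
struct OwningCallable {
    std::unique_ptr<int> value;
    int operator()() const { return *value; }
};

static_assert(!std::is_copy_constructible<OwningCallable>::value,
              "OwningCallable is move-only");

// Hypothetical holder: takes the callable by value and moves it into place.
// Copying f into f_ instead would not compile for OwningCallable.
template <typename F>
class Holder {
public:
    explicit Holder(F f) : f_(std::move(f)) {}
    int call() { return f_(); }

private:
    F f_;
};

// Builds a holder from a temporary and invokes the stored callable.
inline int demo_move_only() {
    Holder<OwningCallable> holder(OwningCallable{std::unique_ptr<int>(new int(42))});
    return holder.call();
}
```

    A constructor that takes the callable by value and then moves it accepts both temporaries and copyable lvalues, which is the shape used in the code further down.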

    Now, all you need to do is reserve some storage for it. If this requires a dynamic allocation, you might indeed get a performance degradation. On the other hand, an object must have a fixed footprint, so where should the closure go?

    One possible solution is to offer a configurable (but limited) storage capacity:

    static size_t const Size = 32;
    static size_t const Alignment = alignof(std::max_align_t);
    
    typedef std::aligned_storage<Size, Alignment>::type Storage;
    Storage storage;
    

    Now you can (using reinterpret_cast as necessary) store your lambda within storage, provided its size fits (which can be checked at compile time using static_assert).
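
    As a rough sketch of that idea, the helper below places a callable into a fixed-size buffer with placement new and rejects callables that do not fit at compile time. InlineBuffer and demo_inline_buffer are hypothetical names chosen for this example. Note that std::aligned_storage is deprecated as of C++23, where an alignas-qualified unsigned char array is the usual replacement.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <type_traits>
#include <utility>

// Hypothetical helper (not part of FastDelegate): a fixed-size buffer that
// holds one callable, with size and alignment checked at compile time.
template <std::size_t Size>
class InlineBuffer {
public:
    using Storage =
        typename std::aligned_storage<Size, alignof(std::max_align_t)>::type;

    // Move-constructs f inside the buffer and returns a typed pointer to it.
    template <typename F>
    F* emplace(F f) {
        static_assert(sizeof(F) <= sizeof(Storage),
                      "callable too large for the buffer");
        static_assert(alignof(F) <= alignof(Storage),
                      "callable over-aligned for the buffer");
        return new (&storage_) F(std::move(f));
    }

    // Destroys the callable previously created by emplace<F>().
    template <typename F>
    void destroy() {
        reinterpret_cast<F*>(&storage_)->~F();
    }

private:
    Storage storage_;
};

// Stores a capturing lambda in the buffer and calls it twice.
inline int demo_inline_buffer() {
    int counter = 0;
    auto increment = [&counter] { return ++counter; };
    using Lambda = decltype(increment);

    InlineBuffer<32> buffer;
    Lambda* stored = buffer.emplace(increment);
    (*stored)();
    int result = (*stored)();  // second call: counter is now 2
    buffer.destroy<Lambda>();
    return result;
}
```

    The buffer itself does not remember the type it holds, so the caller has to supply it again on destruction. That is the gap the handler in the full example fills.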

    I finally managed to get a working example (I had to restart from scratch because that fast delegate code is really verbose!); the code is below.

    I have only scratched the surface, notably because it lacks copy and move operations. To support them properly, those operations need to be added to the handler, following the same pattern as the two existing operations (apply and destroy).

    Code:

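    #include <cstddef>      // std::size_t, std::max_align_t
    #include <iostream>
    #include <new>          // placement new
    #include <type_traits>  // std::aligned_storage
    #include <utility>      // std::move, std::forward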
    
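    template <typename, std::size_t> class FastFunc;
    
    template <typename R, typename... Args, std::size_t Size>
    class FastFunc<R(Args...), Size> {
    public:
        // Note: copy and move are not handled yet (see the remark before the code).
        template <typename F>
        FastFunc(F f): handler(&Get<F>()) {
            // The callable must fit into the fixed-size buffer.
            static_assert(sizeof(F) <= sizeof(Storage), "callable too large for storage");
            static_assert(alignof(F) <= alignof(Storage), "callable over-aligned for storage");
            new (&storage) F(std::move(f));
        }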
    
        ~FastFunc() {
            handler->destroy(&storage);
        }
    
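        // Take the arguments as declared in the signature (so lvalues are
        // accepted too) and forward them to the stored callable.
        R operator()(Args... args) {
            return handler->apply(&storage, std::forward<Args>(args)...);
        }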
    
    private:
        using Storage = typename std::aligned_storage<Size, alignof(std::max_align_t)>::type;
    
        struct Handler {
            R (*apply)(void*, Args&&...);
            void (*destroy)(void*);
        }; // struct Handler
    
        template <typename F>
        static R Apply(void* f, Args&&... args) {
            // Return the result; this is also valid when R is void.
            return (*reinterpret_cast<F*>(f))(std::forward<Args>(args)...);
        }
    
        template <typename F>
        static void Destroy(void* f) {
            reinterpret_cast<F*>(f)->~F();
        }
    
        template <typename F>
        static Handler const& Get() {
            // One handler table per stored callable type.
            static Handler const H = { &Apply<F>, &Destroy<F> };
            return H;
        } // Get
    
        Handler const* handler;
        Storage storage;
    }; // class FastFunc
    
    int main() {
        FastFunc<void(), 32> stateless = []() { std::cout << "stateless\n"; };
        stateless();
    
        bool b = true;
        FastFunc<void(), 32> stateful = [&b]() { std::cout << "stateful: " << b << "\n"; };
        stateful();
    
        b = false;
        stateful();
    
        return 0;
    }
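
    The copy and move support mentioned before the code could be sketched roughly as follows: the handler gains two more function pointers, and the copy and move constructors dispatch through them. This is only one possible shape, not a tested drop-in. The class name CopyableFunc, the copy and move slots, and the demo function are assumptions made for this illustration, and assignment is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <type_traits>
#include <utility>

template <typename, std::size_t> class CopyableFunc;

// Sketch only: extends the Handler pattern with copy and move slots.
template <typename R, typename... Args, std::size_t Size>
class CopyableFunc<R(Args...), Size> {
public:
    template <typename F>
    CopyableFunc(F f) : handler(&Get<F>()) {
        static_assert(sizeof(F) <= sizeof(Storage), "callable too large");
        static_assert(alignof(F) <= alignof(Storage), "callable over-aligned");
        new (&storage) F(std::move(f));
    }

    // Copy and move go through the handler, which knows the real type.
    CopyableFunc(CopyableFunc const& other) : handler(other.handler) {
        handler->copy(&storage, &other.storage);
    }
    CopyableFunc(CopyableFunc&& other) : handler(other.handler) {
        handler->move(&storage, &other.storage);
    }

    // Assignment is omitted from this sketch.
    CopyableFunc& operator=(CopyableFunc const&) = delete;

    ~CopyableFunc() { handler->destroy(&storage); }

    R operator()(Args... args) {
        return handler->apply(&storage, std::forward<Args>(args)...);
    }

private:
    using Storage =
        typename std::aligned_storage<Size, alignof(std::max_align_t)>::type;

    struct Handler {
        R (*apply)(void*, Args&&...);
        void (*copy)(void* dst, void const* src);
        void (*move)(void* dst, void* src);
        void (*destroy)(void*);
    };

    template <typename F>
    static R Apply(void* f, Args&&... args) {
        return (*static_cast<F*>(f))(std::forward<Args>(args)...);
    }
    template <typename F>
    static void Copy(void* dst, void const* src) {
        new (dst) F(*static_cast<F const*>(src));
    }
    template <typename F>
    static void Move(void* dst, void* src) {
        new (dst) F(std::move(*static_cast<F*>(src)));
    }
    template <typename F>
    static void Destroy(void* f) { static_cast<F*>(f)->~F(); }

    template <typename F>
    static Handler const& Get() {
        static Handler const H = {&Apply<F>, &Copy<F>, &Move<F>, &Destroy<F>};
        return H;
    }

    Handler const* handler;
    Storage storage;
};

// Copies and moves a delegate, then calls both results.
inline int demo_copyable_func() {
    int base = 10;
    CopyableFunc<int(int), 32> add = [base](int x) { return base + x; };
    CopyableFunc<int(int), 32> duplicate = add;         // uses Handler::copy
    CopyableFunc<int(int), 32> moved = std::move(add);  // uses Handler::move
    return duplicate(1) + moved(2);                     // (10 + 1) + (10 + 2)
}
```

    A moved-from delegate still holds a (moved-from) callable here, so its destructor stays valid. A fuller version might instead mark the source as empty and add assignment on top of the same four operations.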