Limits to inlining function wrappers in C++


My question concerns how inlining optimisations apply to function wrappers in C++. Consider the following code: the WorkerStructure object is initialised with a function wrapper that encapsulates some chunk of functionality. The function wrapper is then used when the WorkerStructure::doSomeWork method is invoked.

Will the functionality encapsulated by the workerFunction object be inlined when it is invoked in the WorkerStructure::doSomeWork method? Obviously, if the functionality is defined in some other translation unit, the workerFunction object only encapsulates a function pointer. Are there any other circumstances where inlining will not be possible?

When a lambda function defined in a different translation unit is passed via the function wrapper, is it effectively equivalent to passing a function pointer?

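#include <functional>

struct WorkerStructure
{
    // Taking a const reference lets callers pass temporaries such as lambdas.
    WorkerStructure(const std::function<bool(float)> &f) : workerFunction(f) {}

    void doSomeWork(float inputValue)
    {
        // Call through the type-erased wrapper.
        if (workerFunction(inputValue))
        {
            // do some conditional operation
        }
    }

    std::function<bool(float)> workerFunction;
};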
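
To make the last question concrete, here is a minimal sketch showing that a lambda and a plain function pointer both end up behind the same static type, std::function<bool(float)>. The names CountingWorker, isPositive and countAccepted are hypothetical and not part of the original code; CountingWorker is a variant of WorkerStructure that counts accepted inputs so the effect is observable.

```cpp
#include <functional>
#include <initializer_list>
#include <utility>

// Hypothetical variant of WorkerStructure: counts how often the wrapper returns true.
struct CountingWorker
{
    explicit CountingWorker(std::function<bool(float)> f) : workerFunction(std::move(f)) {}

    void doSomeWork(float inputValue)
    {
        // The static type here is std::function<bool(float)> regardless of
        // whether a lambda or a function pointer was stored.
        if (workerFunction(inputValue))
        {
            ++accepted;
        }
    }

    std::function<bool(float)> workerFunction;
    int accepted = 0;
};

// A plain function, which could just as well live in another translation unit.
bool isPositive(float x) { return x > 0.0f; }

// Feeds three fixed inputs through a worker and reports how many were accepted.
int countAccepted(std::function<bool(float)> f)
{
    CountingWorker worker(std::move(f));
    for (float v : {-1.0f, 0.5f, 2.0f})
    {
        worker.doSomeWork(v);
    }
    return worker.accepted;
}
```

Calling countAccepted with a lambda such as [](float x) { return x > 0.0f; } or with &isPositive goes through exactly the same doSomeWork code, so nothing in the type at the call site tells the two cases apart.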

Solution

  • The polymorphic nature of std::function inherently makes it very difficult to actually inline the call. Since a std::function can store any callable entity, how would you write the inlining code?

    It's somewhat like inlining virtual functions that are called through a base pointer with no other information available (that is, no assignment from a derived pointer to the base pointer prior to the invocation, which the compiler might use to enable inlining).
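
    The virtual-function analogy can be sketched as follows. Base, Derived, callThroughBase and callWithKnownType are hypothetical names chosen for illustration; whether a given compiler actually devirtualizes the second case depends on the compiler and optimization level.

    ```cpp
    struct Base
    {
        virtual ~Base() = default;
        virtual int value() const { return 1; }
    };

    struct Derived : Base
    {
        int value() const override { return 2; }
    };

    // Nothing is known about the dynamic type of *b here, so the compiler
    // generally has to emit an indirect call through the vtable.
    int callThroughBase(const Base* b)
    {
        return b->value();
    }

    // The derived-to-base assignment is visible right before the call, so the
    // compiler may be able to prove the dynamic type and inline Derived::value.
    int callWithKnownType()
    {
        Derived d;
        const Base* b = &d;
        return b->value();
    }
    ```

    A call through std::function usually resembles callThroughBase: the call site sees only the erased interface.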

    Most of the time, std::function is implemented with a void* pointer and a function pointer to a specialization of a templated function that does the actual casting and invocation. There are of course variants that use virtual functions to do this, and with them it's clearer why inlining is astonishingly hard. Even link-time optimization won't be able to do anything: seeing other translation units doesn't matter, because you already have all the information you can get at the call site (which isn't much).

    Here's a very crude version of std::function using the pointer to template function version, dealing only with the store and call aspect (leaving out memory management, copying, moving, resetting, space optimization etc.):

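    template<class Sig>
    class function;  // primary template, intentionally left undefined

    template<class R, class... Args>
    class function<R(Args...)>{
      // Signature of the trampoline that knows the real type of the stored object.
      typedef R (*call_type)(void*, Args...);
      void* _obj;         // type-erased copy of the callable (never freed in this sketch)
      call_type _caller;  // casts _obj back to F* and invokes it

    public:
      template<class F>
      function(F f)
        : _obj(new F(f))
        // Explicit "-> R" keeps the lambda convertible to call_type even when
        // F's call operator returns a type that is only convertible to R.
        , _caller([](void* p, Args... args) -> R { return (*static_cast<F*>(p))(args...); })
      {}

      R operator()(Args... args) const{
        return _caller(_obj, args...);
      }
    };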
    

    I think it would be very hard for the compiler to check what is actually inside _obj and _caller at the point where the function is invoked.

    Just for reference, here's the version with virtual functions.
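
    The linked code is not reproduced here. As a stand-in, the following is a minimal sketch of what a virtual-function-based variant might look like, under the same simplifications as above (store and call only). The names virtual_function, callable_base and callable_impl are hypothetical, and std::unique_ptr is used only to avoid leaking the stored object, which makes this sketch move-only.

    ```cpp
    #include <memory>
    #include <utility>

    template<class Sig>
    class virtual_function;  // hypothetical name; primary template left undefined

    template<class R, class... Args>
    class virtual_function<R(Args...)>{
      // Abstract interface: all the call site ever knows about the stored callable.
      struct callable_base{
        virtual ~callable_base() = default;
        virtual R invoke(Args... args) = 0;
      };

      // Concrete holder that remembers the real type F of the callable.
      template<class F>
      struct callable_impl : callable_base{
        explicit callable_impl(F f) : _f(std::move(f)) {}
        R invoke(Args... args) override{
          return _f(std::forward<Args>(args)...);
        }
        F _f;
      };

      std::unique_ptr<callable_base> _impl;

    public:
      template<class F>
      virtual_function(F f)
        : _impl(new callable_impl<F>(std::move(f)))
      {}

      // Virtual dispatch: the target is only known at run time.
      R operator()(Args... args) const{
        return _impl->invoke(std::forward<Args>(args)...);
      }
    };
    ```

    Here operator() sees only a callable_base pointer, so the compiler faces the same problem as with any virtual call through a base pointer whose dynamic type is unknown.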