
Reducing closure overhead in Task.Run/Factory.StartNew with predefined object


This is purely an experiment and a learning exercise. I'd like to see whether I can reduce the footprint of the closure created by Task.Run(() => Func<>()) by using a class that I initialize only once. The objective is to avoid creating a new instance on every run, which I imagine would be less efficient than the closure itself (though that is speculation on my part). Creating a basic class to do this is simple enough, and there are examples of it here on Stack Overflow.

Where I run into trouble is using members and functions from another class. If I have to encapsulate them, or inject them into the class the task runs on, the result may hold less data than the original class, but it is probably not much of an improvement.

So say, I have something along the lines of:

internal async Task<PathObject> PopulatePathObjectAsync(Vector3Int origin, Vector3Int destination, PathObject path)
{
    return await Task.Factory.StartNew(() => PopulatePathObject(origin, destination, path));
}

/// Not sure if we want to make this a task or not because we may just parallelize and await the outer task.
/// We'll have to decide when we get down to finalization of the architecture and how it's used.
internal PathObject PopulatePathObject(Vector3Int origin, Vector3Int destination, PathObject path)
{
    Debug.Log($"Pathfinding Search On Thread: ({System.Threading.Thread.CurrentThread.ManagedThreadId})");

    if (!TryVerifyPath(origin, destination, ref path, out PathingNode currentNode))
        return path;

    var openNodes = m_OpenNodeHeap;

    m_ClosedNodes.Clear();

    openNodes.ClearAndReset();
    openNodes.AddNode(currentNode);

    for (int i = CollectionBufferSize; openNodes.Count > 0 && i >= 0; i--)
    {
        currentNode = ProcessNextOpenNode(openNodes);

        if (NodePositionMatchesVector(currentNode, destination))
        {
            return path.PopulatePathBufferFromOriginToDestination(currentNode, origin, PathState.CompletePath);
        }

        ProcessNeighboringNodes(currentNode, destination);
    }

    return path.PopulatePathBufferFromOriginToDestination(currentNode, origin, PathState.IncompletePath);
}

To ditch the lambda, the closure, and the creation (or perhaps cast?) of the delegate, I would need a class that encapsulates the PopulatePathObject function in its entirety, either by literally copying the necessary members or by passing them as arguments. That seems likely to negate any benefit gained. So is there a way I could have something like this:

private class PopulatePathObjectTask
{
    private readonly Vector3Int m_Origin;
    private readonly Vector3Int m_Destination;
    private readonly PathObject m_Path;

    public PopulatePathObjectTask(Vector3Int origin, Vector3Int destination, PathObject path)
    {
        m_Origin = origin;
        m_Destination = destination;
        m_Path = path;
    }

    public PathObject PopulatePathObject(Vector3Int origin, Vector3Int destination, PathObject path)
    {
        // Obviously here, without access to the actual AStar class responsible for the search,
        // I don't have access to the functions or the class members such as the heap or the hashset
        // that represents the closed nodes, or the buffer size calculated from the state-space
        // dimensions. With that, I'd just be recreating the class and not avoiding much, if any,
        // of the overhead created by the closure capturing the class in the first place.
        throw new System.NotImplementedException(); // placeholder so the sketch compiles
    }
}

...that I could use to access the function that already exists? I've been toying with the idea of creating a static member and using dependency injection for the open/closed node collections, but I hoped someone might have more insight than "it's pointless, and any overhead reduction or performance gain will be minimal". You're probably right about that, but I'm doing this as an exercise and I'd like to be able to measure the differences. I'm probably not even going to use it, and might ditch A* for JPS instead, but I would like to know before moving on. I'm not entirely sure, but it seems the closure would have to capture the entire AStar object, one would hope by reference.


Solution

  • Fortunately, you can generalize the concept mentioned by JonasH:

    Task ExecuteActionAsync<TState>(
        Action<TState> callback, 
        TState state)
    {
        return Task.Factory.StartNew(static args => 
        {
            var local = (ValueTuple<TState, Action<TState>>)args;
            local.Item2(local.Item1);
        }, (state, callback));
    }
    
    Task<TResult> ExecuteFuncAsync<TState, TResult>(
        Func<TState, TResult> callback,
        TState state)
    {
        return Task.Factory.StartNew<TResult>(static args => 
        {
            var local = (ValueTuple<TState, Func<TState, TResult>>)args;
            return local.Item2(local.Item1);
        }, (state, callback));
    }
    

    Notes:

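    • To avoid closure overhead, the lambda passed to StartNew is declared static, which gives a compile-time guarantee that it captures no locals, parameters, or this.
    • You need to cast args back to a ValueTuple because the StartNew overloads take the state parameter as Object. Passing the tuple as Object boxes it, so one small heap allocation remains.
    • The cast above uses the unnamed ValueTuple<...> form, which is why the actual method call (local.Item2(local.Item1)) looks this ugly. Casting to a tuple type with element names, e.g. ((TState State, Action<TState> Callback))args, is also possible and gives readable member names.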

    You can add an extra line there to deconstruct the ValueTuple if you wish:

    var local = (ValueTuple<TState, Func<TState, TResult>>)args;
    var (localCallback, localState) = (local.Item2, local.Item1);
    return localCallback(localState);
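
As a variation on the helper above, here is a minimal sketch that casts the state to a tuple type with element names instead of the unnamed ValueTuple form. The class name TaskStateHelpers and the method name ExecuteFuncNamedAsync are made up for this illustration; the element names exist only at compile time, and the boxed object is still a ValueTuple<TState, Func<TState, TResult>>.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical helper class; the names are illustrative, not from the answer.
public static class TaskStateHelpers
{
    public static Task<TResult> ExecuteFuncNamedAsync<TState, TResult>(
        Func<TState, TResult> callback,
        TState state)
    {
        return Task.Factory.StartNew<TResult>(static args =>
        {
            // Unbox the state into a tuple with readable element names.
            // The runtime type is still ValueTuple<TState, Func<TState, TResult>>.
            var local = ((TState State, Func<TState, TResult> Callback))args;
            return local.Callback(local.State);
        }, (state, callback)); // the tuple is boxed once when converted to object
    }
}
```

Usage is the same as for ExecuteFuncAsync, for example `await TaskStateHelpers.ExecuteFuncNamedAsync(static x => x + 1, 41)`.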
    

    Here is a sample usage

    await ExecuteActionAsync(Console.WriteLine, 42);
    
    var res = await ExecuteFuncAsync(_ => _, 42);
    Console.WriteLine(res);
    

    Dotnet fiddle: https://dotnetfiddle.net/aHyTMA


    To apply this to your use case

    record struct PopulatePathArgs(Vector3Int origin, Vector3Int destination, PathObject path);
    
    ...
    public PathObject PopulatePathObject(PopulatePathArgs args)
    
    ...
    
    var result = await ExecuteFuncAsync(PopulatePathObject, new PopulatePathArgs(...));
    //or more verbosely 
    var result = await ExecuteFuncAsync(_ => PopulatePathObject(_), new PopulatePathArgs(...));
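
The question also asked how the worker can reach the search class's own members (the heap, the closed set) without a closure. One option, sketched here as an assumption rather than a definitive design, is to pass this inside the state tuple so that a static lambda can call the existing instance method. AStarSearch, PathRequest, and the int-based "path" are hypothetical stand-ins for the question's AStar class, Vector3Int arguments, and PathObject.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical stand-in for the (origin, destination, path) arguments.
public readonly record struct PathRequest(int Origin, int Destination);

// Hypothetical stand-in for the question's AStar class.
public sealed class AStarSearch
{
    // Stands in for instance state such as the open-node heap or closed set.
    private int m_SearchCount;

    public int SearchCount => m_SearchCount;

    // Ordinary instance method with full access to the class members.
    public int PopulatePath(PathRequest request)
    {
        m_SearchCount++;
        return request.Destination - request.Origin;
    }

    public Task<int> PopulatePathAsync(PathRequest request)
    {
        // 'this' travels inside the state tuple, so the static lambda captures nothing.
        // The tuple is still boxed once when it is converted to object.
        return Task.Factory.StartNew<int>(static state =>
        {
            var (self, req) = ((AStarSearch Self, PathRequest Request))state;
            return self.PopulatePath(req);
        }, (this, request));
    }
}
```

The instance is passed by reference inside the tuple, so nothing from the search class is copied. The usual caveat applies: the instance state is then touched from a thread-pool thread.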
    

    UPDATE #1

    "I was not aware we could apply a static modifier to a lambda function"

    This feature was introduced in C# 9: https://learn.microsoft.com/en-us/dotnet/csharp/language-reference/proposals/csharp-9.0/static-anonymous-functions

    "However, everything I can come up with or have seen always has 'new' somewhere."

    Since you are working in an object-oriented programming language, creating new objects should not be a big deal. Even a plain call from methodA to methodB consumes memory, because a new stack frame is pushed onto the stack.

    If a variable's scope is the function, it is released when you leave the function. With a closure, you want to access a variable outside the scope of the original function. That is why the compiler generates a class on your behalf to capture those variables. So a closure extends the lifetime of the variables, and the instance of the generated class is allocated on the heap. That object is cleaned up later by the GC.
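
To make the capture concrete, here is a rough sketch of what the compiler does with a capturing lambda. Pathfinder, CapturedState, and the int-based Populate method are hypothetical; the real generated class has an unspeakable compiler-chosen name and differs in detail. Note that the owning object is stored as a reference, not copied.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical stand-in for the search class.
public sealed class Pathfinder
{
    public int Populate(int origin, int destination) => destination - origin;

    // What you write: the lambda captures origin, destination and 'this'.
    public Task<int> WithLambda(int origin, int destination)
    {
        return Task.Factory.StartNew(() => Populate(origin, destination));
    }

    // Approximately what the compiler generates for the lambda above.
    private sealed class CapturedState
    {
        private readonly Pathfinder m_Owner; // reference to 'this', not a copy
        private readonly int m_Origin;
        private readonly int m_Destination;

        public CapturedState(Pathfinder owner, int origin, int destination)
        {
            m_Owner = owner;
            m_Origin = origin;
            m_Destination = destination;
        }

        public int Invoke() => m_Owner.Populate(m_Origin, m_Destination);
    }

    // Hand-written equivalent of WithLambda.
    public Task<int> Lowered(int origin, int destination)
    {
        var state = new CapturedState(this, origin, destination);  // heap allocation: capture object
        return Task.Factory.StartNew(new Func<int>(state.Invoke)); // heap allocation: delegate
    }
}
```

Both methods return the same result; the second just spells out the two allocations that the lambda syntax hides.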

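    Both JonasH's solution and mine avoid the closure, so no compiler-generated capture object is created. Some allocations remain (the Task itself, and the state boxed as Object when it is a value type), but they are small and short-lived, so don't worry too much about memory allocation.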

    The drawback of the proposed solutions, on the other hand, is that you have to define an enclosing data structure (record, class, struct, whatever).

    "I've also read the discard operator can prevent memory allocation, but I've yet to measure that effect."

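    In _ => _ the underscore is an ordinary parameter name rather than a discard, so it has no effect on allocation. You can mark that lambda static as well if you wish: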

    var res = await ExecuteFuncAsync(static _ => _, 42);
    

    You can also define an Identity function if you wish, something like this:

    public static class HelperFunctions
    {
        public static T Identity<T>(T value) => value;
    }
    ...
    
    var res = await ExecuteFuncAsync(HelperFunctions.Identity, 42);