Tags: c#, async-await, parallel-processing

Trying to run multiple tasks in parallel with .WhenAll but tasks aren't being run in parallel


I have a method that converts a CSV file into a particular model, and I want to split the work across multiple tasks because there are 700k+ records. The method uses .Skip and .Take, so each invocation knows where to start and how many records to take. I have a list of the numbers 1-10 that I iterate over to create the tasks, using the iterator and some math to determine how many records to skip.

Here's how I'm creating the tasks:

var numberOfTasksList = Enumerable.Range(1, 10).ToList();
// I left out the math to determine rowsPerTask (used as a parameter below) for brevity's sake
var tasks = numberOfTasksList.Select(i =>
    ReadRowsList<T>(props, fields, csv, context, zohoEntities, i, i * rowsPerTask, (i - 1) * rowsPerTask));

await Task.WhenAll(tasks);

The ReadRowsList method used looks like this (without the parameters):

public static async Task<string> ReadRowsList<T>(...parameters) where T : class, new()
{
    //work to run
    return $"added rows for task {i}";
}

The string that the method returns is just a simple line, $"added rows for task {i}", that reports when that iteration is done, so it isn't really proper async/await.

However, when I run the program, the method waits for the first iteration (where i=1) to complete before starting the second, so the tasks are not running in parallel. I'm not the best when it comes to async/parallel programming, but is there something obvious going on that would cause each task to wait until the previous iteration finishes before it gets started? From my understanding, using the above code to create tasks and calling .WhenAll(tasks) would create a new thread for each iteration, but I must be missing something.


Solution

  • In short:

    1. async does not equal multiple threads; and
    2. making a function async Task does not make it asynchronous

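    An async method runs synchronously on the calling thread until it reaches its first await on an operation that has not yet completed. When Task.WhenAll is given pretend async code that has no awaits, each call therefore runs to completion before it returns its Task: the current thread cannot 'let go' of the task at hand, so it cannot start processing a different task.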

    As was pointed out in the comments, the compiler warns you about this (CS1998): "This async method lacks 'await' operators and will run synchronously. Consider using the 'await' operator to await non-blocking API calls, or 'await Task.Run(...)' to do CPU-bound work on a background thread."
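    The Task.Run suggestion in that warning can be sketched roughly as follows. This is a minimal illustration, not the asker's actual code: ProcessChunk is a hypothetical stand-in for the synchronous work, and Thread.Sleep merely simulates it. Because Task.Run queues each call to the thread pool and returns immediately, the chunks can overlap instead of running one after another.

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for one synchronous chunk of work (not from the question).
static string ProcessChunk(int taskId)
{
    Thread.Sleep(TimeSpan.FromMilliseconds(500)); // simulates blocking, synchronous work
    return $"added rows for task {taskId}";
}

var stopwatch = Stopwatch.StartNew();

// Task.Run hands each chunk to the thread pool, so the lambda returns a Task right away.
// ToList() forces the tasks to be created once, before WhenAll is called.
var tasks = Enumerable.Range(1, 4)
    .Select(i => Task.Run(() => ProcessChunk(i)))
    .ToList();

// WhenAll returns the results in the same order as the input tasks.
string[] results = await Task.WhenAll(tasks);
stopwatch.Stop();

foreach (var line in results)
{
    Console.WriteLine(line);
}
Console.WriteLine($"Elapsed: {stopwatch.ElapsedMilliseconds} ms");
```

    How much the chunks actually overlap depends on the number of cores and available thread-pool threads, so the elapsed time will vary between machines.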

    Trivial example

    Let's consider two functions with identical signatures, one with async code and one without.

    static async Task DoWorkPretendAsync(int taskId)
    {
        Console.WriteLine($"Thread: {Thread.CurrentThread.ManagedThreadId} -> task:{taskId} > start");
        Thread.Sleep(TimeSpan.FromSeconds(1));
        Console.WriteLine($"Thread: {Thread.CurrentThread.ManagedThreadId} -> task:{taskId} > done");
    }
    
    static async Task DoWorkAsync(int taskId)
    {
        Console.WriteLine($"Thread: {Thread.CurrentThread.ManagedThreadId} -> task:{taskId} > start");
        await Task.Delay(TimeSpan.FromSeconds(1));
        Console.WriteLine($"Thread: {Thread.CurrentThread.ManagedThreadId} -> task:{taskId} > done");
    }
    

    Let's test them with the following snippet:

    await DoItAsync(DoWorkPretendAsync);
    Console.WriteLine();
    await DoItAsync(DoWorkAsync);
    
    async Task DoItAsync(Func<int, Task> f)
    {
        var tasks = Enumerable.Range(start: 0, count: 3).Select(i => f(i));
        Console.WriteLine("Before WhenAll");
        await Task.WhenAll(tasks);
        Console.WriteLine("After WhenAll");
    }
    

    We can see that with DoWorkPretendAsync the tasks are executed sequentially, whereas with DoWorkAsync all three tasks start before any of them finishes.

    Before WhenAll
    Thread: 1 -> task:0 > start
    Thread: 1 -> task:0 > done
    Thread: 1 -> task:1 > start
    Thread: 1 -> task:1 > done
    Thread: 1 -> task:2 > start
    Thread: 1 -> task:2 > done
    After WhenAll
    
    Before WhenAll
    Thread: 1 -> task:0 > start
    Thread: 1 -> task:1 > start
    Thread: 1 -> task:2 > start
    Thread: 5 -> task:0 > done
    Thread: 5 -> task:2 > done
    Thread: 7 -> task:1 > done
    After WhenAll
    

    Things to note:

    • even with real async, all tasks are started by the same thread;
    • in this particular run, two of the tasks are finished by the same thread (id: 5). This is not guaranteed at all: a task can be started on one thread, and after the await another thread in the pool can pick it up later.
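
    Applied to the question's Skip/Take approach, a sketch might look like the one below. Everything here is an assumption for illustration: rows is a small in-memory list standing in for the parsed CSV, ConvertChunk is a hypothetical synchronous converter, and the chunk math uses a zero-based index rather than the question's 1-10 list. It assumes the rows are already loaded and are only read, not modified, while the tasks run.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical stand-in for the parsed CSV: 25 rows instead of 700k+.
List<string> rows = Enumerable.Range(1, 25).Select(n => $"row {n}").ToList();

const int numberOfTasks = 10;
// Ceiling division, so no rows are lost when the count does not divide evenly.
int rowsPerTask = (rows.Count + numberOfTasks - 1) / numberOfTasks;

// Hypothetical synchronous converter: here it only counts the rows in its slice.
static int ConvertChunk(IEnumerable<string> chunk) => chunk.Count();

// Each chunk is wrapped in Task.Run so it executes on a thread-pool thread.
var chunkTasks = Enumerable.Range(0, numberOfTasks)
    .Select(i => Task.Run(() => ConvertChunk(rows.Skip(i * rowsPerTask).Take(rowsPerTask))))
    .ToList();

int[] counts = await Task.WhenAll(chunkTasks);
Console.WriteLine($"Converted {counts.Sum()} rows in {counts.Length} chunks");
```

    With 25 rows and 10 tasks, rowsPerTask is 3, so the last chunks are short or empty; Take simply returns fewer rows when the end of the list is reached.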