c# asynchronous async-await concurrency parallel-processing

Task.Run in combination with parallel execution


I know that Task.Run is used for CPU-bound operations, but I find it hard to understand when to use it in a scenario that mixes CPU-bound and I/O-bound operations. For example:

This is a small helper method that does some CPU work and also makes an HTTP request:

// Reuse a single HttpClient; creating a new one per call can exhaust sockets under load.
private static readonly HttpClient httpClient = new HttpClient();

public async Task<string> MakeHttpRequest(HttpRequestMessage request)
{
   //before that do some CPU bound operations for example preparing the 
   // request before sending it out or validating the request object
   var response = await httpClient.SendAsync(request);
   var responseString = await response.Content.ReadAsStringAsync();
   return responseString;
}

Now I need to call that method multiple times, and I want to parallelize the calls to get better performance out of my application. For that I generate a list of tasks from the method calls, like this:

//Variant 1
public List<Task<string>> GenerateTasks()
{
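   // One message per call: an HttpRequestMessage instance cannot be sent twice.
   HttpRequestMessage request1 = new HttpRequestMessage(); //...
   HttpRequestMessage request2 = new HttpRequestMessage(); //...
   List<Task<string>> taskList = new()
   {
      MakeHttpRequest(request1),
      MakeHttpRequest(request2)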
   };
   return taskList;
}

//Variant 2
public List<Task<string>> GenerateTasks2()
{
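   // One message per call: an HttpRequestMessage instance cannot be sent twice.
   HttpRequestMessage request1 = new HttpRequestMessage(); //...
   HttpRequestMessage request2 = new HttpRequestMessage(); //...
   List<Task<string>> taskList = new()
   {
      Task.Run(() => MakeHttpRequest(request1)),
      Task.Run(() => MakeHttpRequest(request2))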
   };
   return taskList;
}

//Variant 3 - There is even this:
public List<Task<string>> GenerateTasks3()
{
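   // One message per call: an HttpRequestMessage instance cannot be sent twice.
   HttpRequestMessage request1 = new HttpRequestMessage(); //...
   HttpRequestMessage request2 = new HttpRequestMessage(); //...
   List<Task<string>> taskList = new()
   {
      Task.Run(async () => await MakeHttpRequest(request1)),
      Task.Run(async () => await MakeHttpRequest(request2))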
   };
   return taskList;
}

At the end I would just do an await Task.WhenAll(GenerateTasks()). Please note that exception handling etc. is omitted; it is just an example.

What is the difference between Variants 1, 2, and 3? Since I am doing CPU-bound operations before the I/O operation, would it be OK to use Task.Run or not? If it is not OK to use Task.Run like this, what are the negative side effects?


Solution

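  • Variant 2 and Variant 3 are practically identical: both bind to the Task.Run overload that accepts an asynchronous delegate and returns an unwrapped Task<string>, and the async/await in Variant 3 only adds a redundant state machine. So I'll compare only Variant 1 (without Task.Run) versus Variant 2 (with Task.Run).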

    1. Without Task.Run, the MakeHttpRequest method is invoked synchronously on the current thread, and it runs there until its first await of an incomplete task. The two Task<string> tasks are created sequentially, one after the other. In other words, the creation of the tasks is not parallelized.

    2. With Task.Run, the MakeHttpRequest method is invoked on ThreadPool threads. The two Task<string> tasks are created concurrently, and potentially in parallel.
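
    As a rough illustration of the two points above, here is a minimal sketch that records which thread runs the synchronous part of each call. FakeRequestAsync is a hypothetical stand-in for MakeHttpRequest (it simulates the I/O with Task.Delay instead of sending a real request), and the class and method names are assumptions made for this example only.

    ```csharp
    using System;
    using System.Threading;
    using System.Threading.Tasks;

    public static class StartThreadDemo
    {
        // Hypothetical stand-in for MakeHttpRequest. It returns the ID of the thread
        // that ran its synchronous part, i.e. the code before the first await.
        public static async Task<int> FakeRequestAsync()
        {
            int startThreadId = Environment.CurrentManagedThreadId;
            Thread.SpinWait(1_000_000);   // simulated CPU-bound preparation
            await Task.Delay(100);        // simulated I/O, no real HTTP call
            return startThreadId;
        }

        public static async Task<(int Caller, int[] Direct, int[] Offloaded)> CompareAsync()
        {
            int caller = Environment.CurrentManagedThreadId;

            // Variant 1: both synchronous parts run on the caller's thread, one after the other.
            Task<int>[] direct = { FakeRequestAsync(), FakeRequestAsync() };

            // Variant 2: each call is queued to the ThreadPool, so the synchronous parts may overlap.
            Task<int>[] offloaded =
            {
                Task.Run(() => FakeRequestAsync()),
                Task.Run(() => FakeRequestAsync())
            };

            return (caller, await Task.WhenAll(direct), await Task.WhenAll(offloaded));
        }

        public static async Task Main()
        {
            var (caller, direct, offloaded) = await CompareAsync();
            Console.WriteLine($"Caller thread:        {caller}");
            Console.WriteLine($"Variant 1 started on: {string.Join(", ", direct)}");
            Console.WriteLine($"Variant 2 started on: {string.Join(", ", offloaded)}");
        }
    }
    ```

    The IDs printed for Variant 1 always match the caller's thread. The IDs printed for Variant 2 depend on ThreadPool scheduling, so they vary from run to run.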

    The first variant is what Stephen Cleary calls asynchronous concurrency, distinguishing it from parallelism, which is what the second variant does. The term "asynchronous concurrency" is not well defined though. It might mean that we avoid scheduling work on the ThreadPool (without actively enforcing a no-parallelization policy), or it might mean that we take measures to ensure that only one thread will be running code at any time, by using either a SynchronizationContext or an exclusive TaskScheduler. Variant 1 implements the first interpretation. Although the code before the await HttpClient.SendAsync(request) will be serialized (because it is called synchronously), any CPU-bound operation after the await might be parallelized.
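
    To make the last point concrete, the following sketch starts four operations in the Variant 1 style and counts how many of them are in their post-await phase at the same time. It assumes an environment without a SynchronizationContext, such as a console application. WorkAsync and the counters are hypothetical, and Thread.Sleep merely stands in for CPU-bound work after the await.

    ```csharp
    using System;
    using System.Linq;
    using System.Threading;
    using System.Threading.Tasks;

    public static class ContinuationDemo
    {
        private static int _running;     // continuations currently in their post-await phase
        private static int _maxRunning;  // highest overlap observed so far

        // Hypothetical worker: simulated I/O first, then simulated post-processing.
        private static async Task WorkAsync()
        {
            await Task.Delay(100);                         // simulated I/O
            int now = Interlocked.Increment(ref _running);
            int seen;
            // Lock-free "store the maximum": retry until _maxRunning is at least 'now'.
            while (now > (seen = Volatile.Read(ref _maxRunning)))
                Interlocked.CompareExchange(ref _maxRunning, now, seen);
            Thread.Sleep(200);                             // stand-in for CPU work after the await
            Interlocked.Decrement(ref _running);
        }

        public static async Task<int> MeasureOverlapAsync()
        {
            _running = 0;
            _maxRunning = 0;
            // Variant 1 style: the calls are made directly, with no Task.Run.
            await Task.WhenAll(Enumerable.Range(0, 4).Select(_ => WorkAsync()));
            return _maxRunning;
        }

        public static async Task Main()
        {
            int overlap = await MeasureOverlapAsync();
            Console.WriteLine($"Max overlapping continuations: {overlap}");
        }
    }
    ```

    A result greater than 1 means that the continuations ran in parallel even though no Task.Run was used. The exact number is not guaranteed, because it depends on how the ThreadPool schedules the continuations.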

    It might be useful to compare the two variants with the Parallel.ForEachAsync API that was introduced in .NET 6. This method behaves similarly to your Variant 2 (with Task.Run). A proposal to enable the Variant 1 behavior (without Task.Run) has been rejected by Microsoft.
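
    For comparison, a minimal Parallel.ForEachAsync sketch might look like the one below. It requires .NET 6 or later. The item names, the simulated request, and the MaxDegreeOfParallelism value are assumptions chosen for illustration; the dictionary only records the thread on which each body started.

    ```csharp
    using System;
    using System.Collections.Concurrent;
    using System.Threading.Tasks;

    public static class ForEachAsyncDemo
    {
        // Hypothetical helper: maps each item to the thread that started processing it.
        public static async Task<ConcurrentDictionary<string, int>> ProcessAsync(string[] items)
        {
            var startThreads = new ConcurrentDictionary<string, int>();
            var options = new ParallelOptions { MaxDegreeOfParallelism = 2 };

            await Parallel.ForEachAsync(items, options, async (item, cancellationToken) =>
            {
                // The body, including the part before the first await, runs on a ThreadPool thread.
                startThreads[item] = Environment.CurrentManagedThreadId;
                await Task.Delay(50, cancellationToken);   // simulated I/O, no real HTTP call
            });

            return startThreads;
        }

        public static async Task Main()
        {
            Console.WriteLine($"Caller thread: {Environment.CurrentManagedThreadId}");
            var result = await ProcessAsync(new[] { "a", "b", "c" });
            foreach (var pair in result)
                Console.WriteLine($"Item {pair.Key} started on thread {pair.Value}");
        }
    }
    ```

    Unlike the hand-written Variant 2, this API also limits how many bodies are in flight at once, through the MaxDegreeOfParallelism option.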

    Both variants are valid, and in different scenarios one might be more suitable than the other. For example, in ASP.NET applications Variant 1 (without Task.Run) might be preferable, because the ASP.NET infrastructure uses the same ThreadPool for scheduling web requests, so we want to avoid putting additional stress on the ThreadPool by scheduling "parasitic" work to it. Quoting from an MSDN article:

    You can kick off some background work by awaiting Task.Run, but there’s no point in doing so. In fact, that will actually hurt your scalability by interfering with the ASP.NET thread pool heuristics. If you have CPU-bound work to do on ASP.NET, your best bet is to just execute it directly on the request thread. As a general rule, don’t queue work to the thread pool on ASP.NET.