c# asynchronous async-await concurrency parallel.foreach

Parallel.ForEach or Task.WhenAll when involving async operations?


I've read the following closely related thread, but I'd like to ask about a more specific thing.

If we need to run Tasks/methods asynchronously, and those tasks themselves run other tasks/await other tasks, which variant is preferred - Parallel.ForEach, or Task.WhenAll? I will demonstrate with some code below:

public async Task SomeWorker(string param1, HttpClient client,
    List<FillMeUp> emptyCollection)
{
    HttpRequestMessage message = new HttpRequestMessage();
    message.Method = HttpMethod.Get;
    message.Headers.Add("someParam", param1);
    message.RequestUri = new Uri("https://www.somesite.me");
    var requestResponse = await client.SendAsync(message).ConfigureAwait(false);
    var content = await requestResponse.Content.ReadAsStringAsync()
        .ConfigureAwait(false);
    emptyCollection.Add(new FillMeUp()
    {
        Param1 = param1
    });
}

Used with WhenAll:

using (HttpClient client = new HttpClient())
{
    client.DefaultRequestHeaders.Add("Accept", "application/json");

    List<FullCollection> fullCollection = GetMyFullCollection();
    List<FillMeUp> emptyCollection = new List<FillMeUp>();
    List<Task> workers = new List<Task>();
    for (int i = 0; i < fullCollection.Count; i++)
    {
        workers.Add(SomeWorker(fullCollection[i].SomeParam, client,
            emptyCollection));
    }

    await Task.WhenAll(workers).ConfigureAwait(false);

    // Do something below with already completed tasks
}

Or, all of the above written in a Parallel.ForEach():

using (HttpClient client = new HttpClient())
{
    client.DefaultRequestHeaders.Add("Accept", "application/json");

    List<FullCollection> fullCollection = GetMyFullCollection();
    List<FillMeUp> emptyCollection = new List<FillMeUp>();
    Parallel.ForEach<FullCollection>(fullCollection, (fullObject) =>
    {
        HttpRequestMessage message = new HttpRequestMessage();
        message.Method = HttpMethod.Get;
        message.Headers.Add("someParam", fullObject.SomeParam);
        message.RequestUri = new Uri("https://www.somesite.me");
        var requestResponse = client.SendAsync(message)
            .GetAwaiter().GetResult();
        var content = requestResponse.Content.ReadAsStringAsync()
            .GetAwaiter().GetResult();
        emptyCollection.Add(new FillMeUp()
        {
            Param1 = fullObject.SomeParam
        });
    });
}

I'm aware that Lists are not thread safe. It's just something to demonstrate the nature of my question.

Both SendAsync (on HttpClient) and ReadAsStringAsync (on HttpContent) are asynchronous, so they have to be blocked on (here via GetAwaiter().GetResult()) in order to work with Parallel.ForEach.

Is that preferred over the Task.WhenAll route? I've tried various performance tests, and I can't seem to find a difference.


Solution

  • I don't think the main consideration here is performance. (It always is :-) but read on: using the correct construct in the correct case will generally give you the best performance as well.)

    Think of Parallel.ForEach as a special foreach that parallelizes individual (synchronous) work items. While you could shove already-asynchronous operations into it (by blocking), that seems contrived and is a misuse: you lose the async/await benefits of each task, because every blocked call ties up a thread-pool thread while it waits for I/O. The only "benefit" you get out of it is that, from the standpoint of your code flow, its behavior is synchronous: it does not return until all the work it started has completed.

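    Since your individual tasks are already async, the only thing you need from Parallel.ForEach is that latter feature (not continuing until everything has completed), and that is exactly what Task.WhenAll gives you without blocking any threads.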
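
    As a minimal sketch of the Task.WhenAll route (not a definitive implementation), each worker below returns its result instead of adding to a shared List<T>, which sidesteps the thread-safety issue mentioned in the question. The names WhenAllSketch, SomeWorkerAsync and RunAllAsync are made up for illustration, and FillMeUp is assumed to be a simple class with a Param1 property, as in the question.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Net.Http;
using System.Threading.Tasks;

// Assumed shape of the type from the question.
public sealed class FillMeUp
{
    public string Param1 { get; set; } = "";
}

// Hypothetical helper class, named for this example only.
public static class WhenAllSketch
{
    // One asynchronous unit of work. It returns its own result rather than
    // mutating a shared collection, so no locking is needed.
    public static async Task<FillMeUp> SomeWorkerAsync(string param1, HttpClient client)
    {
        using (var message = new HttpRequestMessage(HttpMethod.Get, "https://www.somesite.me"))
        {
            message.Headers.Add("someParam", param1);

            // No thread is blocked while the request is in flight.
            using (var response = await client.SendAsync(message).ConfigureAwait(false))
            {
                // The body is read but not used further, mirroring the question's code.
                string content = await response.Content.ReadAsStringAsync().ConfigureAwait(false);
                return new FillMeUp { Param1 = param1 };
            }
        }
    }

    // Starts every worker, then asynchronously waits for all of them.
    public static async Task<FillMeUp[]> RunAllAsync(IEnumerable<string> parameters, HttpClient client)
    {
        // ToList() enumerates the query once, so every task is started here.
        List<Task<FillMeUp>> workers = parameters
            .Select(p => SomeWorkerAsync(p, client))
            .ToList();

        // Completes when all workers have completed; the result array is in
        // the same order as the tasks that were passed in.
        return await Task.WhenAll(workers).ConfigureAwait(false);
    }
}
```

    With the question's variables, this could be called as `FillMeUp[] filled = await WhenAllSketch.RunAllAsync(fullCollection.Select(f => f.SomeParam), client);`. If any worker throws, awaiting the Task.WhenAll result rethrows one of the exceptions, so wrap the call in try/catch if individual failures are expected.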