c# async-await parallel.foreach

Write an async method with a Parallel.ForEach loop that calls another async method to pull records


I am working on optimizing code performance and need suggestions on the best approach to implementing async with Parallel.ForEach and/or Task.WhenAll.

The code is split into three main areas.

Code Definition

MethodA picks up the Customer list.

MethodB Part A loops through the customers and picks up records from the database via an Azure Function. This is a 1:* relation, so one customer can have multiple records.

MethodB Part B goes through the customer records list picked up in MethodB Part A and checks whether any files are attached. If there are one or more files, it processes them and sends the 'Customer Reference' back to MethodA, which stores the record in a dictionary. Then it send

Method A

public async Task<List<InboundCustomerFiles>> MethodA(){

  List<Customer> customers = await GetAllCustomers();
  var inboundCustomersFiles = new List<InboundCustomerFiles>();

   // Parallel.ForEach does not wait for the async lambda to finish.
   Parallel.ForEach(customers, async customer =>
   {
     var processedCustomer = await MethodB(customer);
     inboundCustomersFiles.AddRange(processedCustomer);

   });

   return inboundCustomersFiles;
}

Method B

  public static async Task<List<InboundCustomerFiles>> MethodB(Customer customer){
     var inboundCustomerFiles = new List<InboundCustomerFiles>();
     // customer.Id and record.Id are assumed property names
     var customerRecords = await GetCustomerRecord(customer.Id);

     foreach(var record in customerRecords){
        var files = await getCustomerRecordFile(record.Id);
        //...... remaining code
     }
    return inboundCustomerFiles;
  }

Method 3

public static async Task<List<Records>> GetCustomerRecord(int customerId){
     //API call here that further pull `record` from database
     return new List<Records>();
}

The process in MethodB customerRecord takes time. How do I ensure that it processes the data and returns it to the correct customer thread in MethodA? I have tried using it in MethodB, but that slowed things down. I also know that Parallel.ForEach does not wait, so I tried adding async to the lambda expression, but I am not sure whether that is correct or whether it works.


Solution

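  • Well, for one thing, you can pretend that Parallel.ForEach awaits your async functions, but it doesn't. Parallel.ForEach takes an Action<T>, so an async lambda is compiled as async void and the loop returns without waiting for it. Instead, you want to write something like this: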

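       // Each task returns its own list; merge them after all have completed,
       // because List<T>.AddRange is not safe to call from several threads at once.
       var results = await Task.WhenAll(customers.Select(customer => MethodB(customer)));
       inboundCustomersFiles.AddRange(results.SelectMany(files => files));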
    

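    Task.WhenAll plays the role Parallel.ForEach was meant to play here, but it is awaitable: the task it returns completes only after every task passed to it has completed. Unlike Parallel.ForEach, it does not partition work across threads; it only waits for tasks that have already been started. Hence, when your await Task.WhenAll returns, all the inner tasks have finished as well.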
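
    To make the difference concrete, here is a minimal sketch rather than the asker's real code. FetchFilesAsync is a hypothetical stand-in for MethodB that simulates an I/O call with Task.Delay, and the customer ids and file names are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class WhenAllDemo
{
    // Hypothetical stand-in for MethodB: simulates an I/O-bound call
    // and returns two made-up file names for the given customer id.
    public static async Task<List<string>> FetchFilesAsync(int customerId)
    {
        await Task.Delay(200); // pretend to wait on a remote service
        return new List<string> { $"c{customerId}-a", $"c{customerId}-b" };
    }

    // Parallel.ForEach takes an Action<T>, so the async lambda becomes async void.
    // The loop returns once every lambda has reached its first await,
    // which is normally long before any of the simulated calls has finished.
    public static int CountWithParallelForEach(IEnumerable<int> customerIds)
    {
        int completed = 0;
        Parallel.ForEach(customerIds, async id =>
        {
            var files = await FetchFilesAsync(id);
            Interlocked.Add(ref completed, files.Count);
        });
        return Volatile.Read(ref completed); // read right after the loop returns
    }

    // Start one task per customer, await them all, then merge the results
    // on a single thread so no shared list is mutated concurrently.
    public static async Task<List<string>> CollectWithWhenAllAsync(IEnumerable<int> customerIds)
    {
        List<string>[] perCustomer =
            await Task.WhenAll(customerIds.Select(id => FetchFilesAsync(id)));
        return perCustomer.SelectMany(files => files).ToList();
    }

    public static async Task Main()
    {
        var ids = new[] { 1, 2, 3 };

        // Expected to print 0 in practice: the loop did not wait for the calls.
        Console.WriteLine(CountWithParallelForEach(ids));

        var all = await CollectWithWhenAllAsync(ids);
        Console.WriteLine(all.Count); // 6: two files for each of three customers
    }
}
```

    Returning a list from each task and merging afterwards avoids needing a lock or a concurrent collection. The array returned by Task.WhenAll is in the same order as the input tasks, so each result can still be matched to its customer.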

    the process in methodB customerRecord takes time

    That is very ambiguous. If you mean it takes server and/or I/O time, then that's fine; that's what async is for. If you mean it takes your CPU time (i.e. it processes data locally for a long time), then you should spin up a task on a thread pool and await its completion. Note that this is not necessarily the default thread pool! Especially if you're writing an ASP.NET (Core) application, you want a dedicated thread pool just for this stuff.
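
    As a rough sketch of that split, assuming a made-up record loader and a made-up computation: the I/O-bound step is simply awaited, while the CPU-bound step is wrapped in Task.Run. Note that Task.Run queues work to the default .NET thread pool; a dedicated pool as suggested above would need a custom TaskScheduler, which is not shown here. All names in the example are hypothetical.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

public static class CpuVersusIoDemo
{
    // Hypothetical I/O-bound step: no thread is blocked while waiting.
    public static async Task<int[]> LoadRecordAsync(int recordId)
    {
        await Task.Delay(50); // stands in for a database or HTTP call
        return Enumerable.Range(1, recordId).ToArray();
    }

    // Hypothetical CPU-bound step: plain synchronous computation.
    public static long SumOfSquares(int[] values)
    {
        long total = 0;
        foreach (var v in values)
        {
            total += (long)v * v;
        }
        return total;
    }

    public static async Task<long> ProcessRecordAsync(int recordId)
    {
        // I/O-bound: await the call directly.
        int[] values = await LoadRecordAsync(recordId);

        // CPU-bound: run on a thread-pool thread and await the result.
        // Task.Run uses the default pool (an assumption of this sketch).
        return await Task.Run(() => SumOfSquares(values));
    }

    public static async Task Main()
    {
        Console.WriteLine(await ProcessRecordAsync(3)); // 1 + 4 + 9 = 14
    }
}
```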