I am working on code performance optimization and need suggestions on the best approach to implementing async with Parallel.ForEach and/or WhenAll.
The code is split into three main areas.
Code Definition
MethodA: picks up the Customer list.
MethodB, Part A: loops through the customers and fetches their records from the database via an Azure Function. This is a 1:* relation, so one customer can have multiple records.
MethodB, Part B: goes through the customer records list fetched in MethodB Part A and checks whether any files are attached. If there are one or more files, it processes them and sends the `Customer Reference` back to MethodA, which stores the record in a dictionary. Then it send
public async Task<List<Customers>> MethodA()
{
    List<Customer> customers = await GetAllCustomers();
    var inboundCustomerFiles = new List<InboundCustomerFiles>();
    Parallel.ForEach(customers, async customer =>
    {
        var processedCustomer = await MethodB(customer);
        inboundCustomerFiles.AddRange(processedCustomer);
    });
}

public static async Task<List<InboundCustomerFiles>> MethodB(Customer customer)
{
    var customerRecord = await GetCustomerRecord(customerId);
    foreach (var record in customerRecord)
    {
        var files = await getCustomerRecordFile(customerRecordId);
        //...... remaining code
    }
    return inboundCustomerFiles;
}

public static async Task<List<InboundCustomerFiles>> GetCustomerRecord(int customerId)
{
    // API call here that further pulls `record` from the database
    return new List<Records>();
}
The process in MethodB (customerRecord) takes time. How do I ensure that it processes the data and returns to the correct customer thread in MethodA? I have tried to use it in MethodB, but that slowed things down. I also know that Parallel.ForEach does not wait, so I tried adding async to the lambda expression, but I am not sure whether that is correct or whether it works.
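Well, for one thing, you can pretend that Parallel.ForEach awaits your async functions, but it doesn't. The async lambda is converted to an async void delegate, so the loop treats each iteration as finished as soon as the lambda reaches its first incomplete await. Instead you want to write something like this: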
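// Start one task per customer; each task returns its own list of files.
var results = await Task.WhenAll(customers.Select(customer => MethodB(customer)));
// Merge only after every task has finished, because List<T> is not safe
// to mutate from several continuations at once.
inboundCustomerFiles.AddRange(results.SelectMany(files => files));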
Task.WhenAll plays a similar role to Parallel.ForEach here, in that the per-customer calls run concurrently, but it is awaitable, and its own task completes only after every task you pass to it has completed. Hence, when your await Task.WhenAll finishes, all the inner tasks have finished as well.
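To make the difference concrete, here is a minimal, self-contained sketch that runs both approaches side by side. FetchFilesAsync and the string file names are hypothetical stand-ins for MethodB and InboundCustomerFiles, and Task.Delay merely simulates the Azure Function call, so treat it as an illustration rather than the actual implementation.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class WhenAllDemo
{
    // Hypothetical stand-in for MethodB: simulates an I/O-bound call.
    private static async Task<List<string>> FetchFilesAsync(int customerId)
    {
        await Task.Delay(100); // pretend this is the remote call
        return new List<string> { $"customer-{customerId}-file" };
    }

    public static async Task Main()
    {
        var customerIds = new[] { 1, 2, 3 };

        // The async lambda becomes an async void delegate, so Parallel.ForEach
        // returns once each lambda hits its first await, not when the work ends.
        var finished = 0;
        Parallel.ForEach(customerIds, async id =>
        {
            await FetchFilesAsync(id);
            Interlocked.Increment(ref finished);
        });
        Console.WriteLine($"Right after Parallel.ForEach: {finished} finished");

        // Task.WhenAll completes only when every task has completed, and it
        // hands back the results in the same order as the input sequence.
        List<string>[] perCustomer =
            await Task.WhenAll(customerIds.Select(id => FetchFilesAsync(id)));
        var allFiles = perCustomer.SelectMany(files => files).ToList();
        Console.WriteLine($"After Task.WhenAll: {allFiles.Count} files");
    }
}
```

Because each task returns its own list and the merge happens after the await, no shared collection is touched while the tasks are still running.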
"the process in methodB customerRecord takes time"
That is very ambiguous. If you mean it takes server and/or IO time, then that's fine; that's what async is for. If you mean it takes your CPU time (i.e. it processes data locally for a long time), then you should spin up a task on a thread pool and await its completion. Note that this is not necessarily the default thread pool! Especially if you're writing an ASP.NET (Core) application, you want a dedicated thread pool just for this stuff.
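For the CPU-bound case, a rough sketch could look like the following. Checksum is a hypothetical placeholder for whatever local processing you do, and for brevity the sketch uses Task.Run, which queues the work on the default thread pool; if you want the work kept off that pool, as suggested above, you would swap in your own scheduler at that point, which is not shown here.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

public static class CpuBoundDemo
{
    // Hypothetical CPU-heavy step standing in for local processing of a record.
    private static long Checksum(int seed)
    {
        long sum = 0;
        for (var i = 0; i < 5_000_000; i++)
        {
            sum += (i ^ seed) % 7;
        }
        return sum;
    }

    public static async Task Main()
    {
        var seeds = new[] { 1, 2, 3, 4 };

        // Task.Run queues each computation on the default thread pool and
        // returns a task, so the caller can await it without blocking.
        long[] sums = await Task.WhenAll(
            seeds.Select(seed => Task.Run(() => Checksum(seed))));

        Console.WriteLine($"Computed {sums.Length} checksums");
    }
}
```

Wrapping I/O-bound calls in Task.Run this way gains nothing; it is only worth doing when the work genuinely occupies the CPU.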