I have a long running job I need to run once for each item in a collection.
I'd like to do these jobs concurrently, although the entire program can wait for all jobs to complete. (I may change that later, but for now, I'm keeping the problem simple)
Following some help already given, I've got the following pattern:
public static void DoWork()
{
//Get a collection of items
var items = GetMyItems();
}
private async void DoStuffWithItems(ICollection<MyItem> items)
{
var tasks = items.Select (i => DoStuffWithItem(i));
await Task.WhenAll(tasks);
}
private Task DoStuffWithItem(MyItem item)
{
//LongRunningTask
return Task.Run(async () =>
{
var returnObject = await LongRunningAsyncMethod(item);
});
}
If I understand it right though, THIS does each task one at a time still - so pointless.
Someone suggested I used Parallel.ForEach in combination with async await - the Parallel-ForEach pattern is easy enough:
public static void DoWork()
{
//Get a collection of items
var items = GetMyItems();
Parallel.ForEach(items, (item) =>
{
var returnObject = LongRunningMethod(item);
}
}
Now if I understand THIS correctly, I'll kick off a separate thread for each item to do its thing, but while it is doing this, the main thread will wait
But what if I want the rest of the program to carry on while DoWork is waiting for the Parallel.ForEach to complete?
public static void DoWork()
{
//Get a collection of items
var items = GetMyItems();
DoINeedThisMethod(items);
}
private async void DoINeedThisMethod(ICollection<MyItem> items)
{
await Task.Run(() => {
DoStuffWithItems(items);
});
}
private async void DoStuffWithItems(ICollection<MyItem> items)
{
var newobjs = new List<MyItem>();
Parallel.ForEach(items, (item) =>
{
var bc = LongRunningAsyncMethod(item);
});
}
It doesn't feel quite right - and I have an async void in there which isn't good practice. But am I along the right lines here? If I understand correctly, in DoINeedThisMethod, I'm spinning up a new thread (via Task) to perform the Parallel.ForEach call, but the use of "wait" in there means that the main thread will now continue while the Parallel.ForEach completes.
Actually, your first assumption
If I understand it right though, THIS does each task one at a time still
is wrong as can easily be proven:
private static async Task Main(string[] args)
{
async Task DoStuffWithItem(int i)
{
// long running task
Console.WriteLine($"processing item {i} started");
await Task.Delay(500);
Console.WriteLine($"processing item {i} finished");
}
var items = new List<int> { 1, 2, 3 };
var tasks = items.Select(i => DoStuffWithItem(i)).ToList();
await Task.WhenAll(tasks);
Console.WriteLine("\nFinished");
}
yields the following output:
processing item 1 started
processing item 2 started
processing item 3 started
processing item 2 finished
processing item 1 finished
processing item 3 finished
Finished
So the tasks are executed concurrently...
Note: as explained in a comment below, the 'ToList()' causes the Linq Query to be executed immediately (which starts the tasks immediately). Without the 'ToList()', the tasks will only be started upon executing 'await Task.WhenAll(tasks)'