Can I set up an async operation to happen automatically after another async operation has completed?


I have a bit of a weird case. I have a list that looks something like:

var myStuff = [A:1, A:2, A:3, B:1, B:2, C:1, C:2, C:3, C:4];

For all letters, I want to start Tasks for each version except for the last version. I want to run the last version when the previous versions have finished. To explain better:

  • Run a task for A:1
  • Run a task for A:2
  • When both those tasks finish, run a task for A:3

Repeat for B and C, ideally having A:1, B:1, C:1, A:2, B:2 etc, all running at the same time.

I figure I can use a LINQ GroupBy to get the list into 3 groups, sort by version, and then start tasks for all but the last. I'm not sure how to do this in a way where the last version (A:3, B:2, C:4) waits for the previous tasks while ALSO making sure A:1, B:1, C:1, A:2, B:2, C:2 etc. all run at the same time.

Is this possible?


Solution

  • I'll start by talking about some basics. Normally, when working with async, you want to await your Tasks:

    await SomeMethodAsync(myData);
    

    The first wrinkle here is that, rather than just one item, you have a whole array. Even assuming you know how many items you have, you would NOT want to do something like this:

    await SomeMethodAsync(myData[0]);
    await SomeMethodAsync(myData[1]);
    await SomeMethodAsync(myData[2]);
    

    You would also NOT want to do something like this:

    foreach(var item in myData)
    {
        await SomeMethodAsync(item);
    }
    

    (Sadly, I suspect code like this is all too common.)

    Either of those effectively sets you back to running things in sequence. Instead, remembering that async methods normally return a Task, you might do something like this:

    await Task.WhenAll(myData.Select(async item => await SomeMethodAsync(item)));
    

    which can be simplified like so:

    await Task.WhenAll(myData.Select(SomeMethodAsync));
    

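    (Here we still hand a Task for each item to Task.WhenAll, but we omit the extra await on each call. The lambda's async/await wrapper only awaits a Task that SomeMethodAsync already returns, so dropping it should not change what runs concurrently; multiple runs still come back in different orders. As far as I know, the main differences are in how exceptions surface: an exception thrown synchronously by SomeMethodAsync would escape during the Select enumeration instead of being captured in a Task, and the lambda no longer appears in stack traces.)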
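
    To make the difference between the loop and Task.WhenAll concrete, here is a minimal sketch that times both. It assumes .NET 6 or later (top-level statements), and the SomeMethodAsync here is a hypothetical stand-in that only waits 200 ms; the real method from the question is unknown.

    ```csharp
    using System;
    using System.Diagnostics;
    using System.Linq;
    using System.Threading.Tasks;

    // Hypothetical stand-in for SomeMethodAsync: simulates 200 ms of I/O-bound work.
    static Task SomeMethodAsync(int item) => Task.Delay(200);

    var myData = new[] { 1, 2, 3, 4, 5 };

    // Awaiting inside the loop: each call starts only after the previous one finishes.
    var sequential = Stopwatch.StartNew();
    foreach (var item in myData)
    {
        await SomeMethodAsync(item);
    }
    sequential.Stop();

    // Task.WhenAll: every call is started first, then all are awaited together.
    var concurrent = Stopwatch.StartNew();
    await Task.WhenAll(myData.Select(SomeMethodAsync));
    concurrent.Stop();

    Console.WriteLine($"sequential: {sequential.ElapsedMilliseconds} ms");
    Console.WriteLine($"concurrent: {concurrent.ElapsedMilliseconds} ms");
    ```

    With five items, the loop should take roughly five delays back to back, while the WhenAll version should take roughly one. Exact numbers will vary from run to run.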

    This is an improvement, but we still have a ways to go. One thing still missing is that you need to exclude the last item in the sequence so it doesn't run until the others finish. Thankfully, C# has some nice range and index support to make this easier:

    await Task.WhenAll(myData[..^1].Select(async item => await SomeMethodAsync(item)));
    await SomeMethodAsync(myData[^1]);
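
    For anyone who hasn't used the range and index operators, this small sketch shows what the two expressions above evaluate to on an array. The versions array is just sample data in the shape of the question, and top-level statements (.NET 6 or later) are assumed.

    ```csharp
    using System;

    var versions = new[] { "A:1", "A:2", "A:3" };

    // [..^1] copies every element except the last into a new array.
    string[] allButLast = versions[..^1];

    // [^1] reads the last element.
    string last = versions[^1];

    Console.WriteLine(string.Join(", ", allButLast)); // A:1, A:2
    Console.WriteLine(last);                          // A:3
    ```

    A group with only one item gives an empty allButLast, which is fine here because Task.WhenAll over an empty sequence completes immediately. An empty array would throw on both expressions, so that case needs its own guard if it can occur.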
    

    Getting very close now, but this is just one group within the data. Let's extend it again to support multiple groups.

    Unfortunately, I don't know how the A:1, A:2, B:1, etc. items are structured in the original code, since what we see in the question does not compile. For this answer, I'll just assume they're strings. That is enough to build a compilable version of the myStuff variable from the question for the example:

    var myStuff = new string[] {"A:1", "A:2", "A:3", "B:1", "B:2", "C:1", "C:2", "C:3", "C:4"};
    var groups = myStuff.GroupBy(s => s.Split(':')[0]);
    await Task.WhenAll( 
        groups.Select(async d => {
            // need an array for index/range support: IGrouping has none,
            // and List<T> only supports ranges on newer runtimes
            var data = d.ToArray();
            await Task.WhenAll(data[..^1].Select(async item => await SomeMethodAsync(item)));
            await SomeMethodAsync(data[^1]);
        })
    );
    

    And now I think we've covered all the details, including letting the separate groups run concurrently with each other, and not just the items within each group.

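    Note, this relies on .GroupBy() preserving the original order in which items are encountered. For LINQ to Objects (Enumerable.GroupBy), the documentation states that elements in a grouping are yielded in the order they appear in the source, so this is safe for an in-memory array. Other LINQ providers, such as database-backed ones, may not give the same guarantee, so add an explicit sort by version if the data comes from one of those.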

    See it work below. Note that each run will give a different order for the results but within each group A:3, B:2, and C:4 are always last (it may still be possible to have, say, all of group A finish before, say, C:3, but this is still allowed based on the problem statement):

    https://dotnetfiddle.net/b9S0W7

    In addition to seeing the results out of order, note the run time usually takes a little longer than 3 seconds. Each group runs in two stages (the earlier versions together, then the last version), and each stage involves a random delay of up to 3 seconds, averaging 1.5 seconds, so about 3 seconds per group. The three groups run concurrently, so the total time for a run is that of the slowest group. A little over 3 seconds is right in line with expectations for concurrent code. If items ran concurrently within each group, but the groups ran one after the other, we'd expect run times closer to 9 seconds. If the groups ran concurrently but each ran its own items in sequence, we'd expect about 6 seconds (4 items * 1.5 seconds for the largest group). If everything ran in sequence, we'd expect about 13.5 seconds (9 items * 1.5 seconds). So results of a little over 3 seconds (6 at most) show things are working the way we want.
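
    The ordering claim can also be checked locally with a self-contained sketch. This is not the code behind the fiddle link: it assumes .NET 6 or later (top-level statements, Random.Shared), and SomeMethodAsync is a hypothetical stand-in that waits a random time of up to 300 ms and then records the item.

    ```csharp
    using System;
    using System.Collections.Concurrent;
    using System.Linq;
    using System.Threading.Tasks;

    var completed = new ConcurrentQueue<string>();

    // Hypothetical stand-in for SomeMethodAsync: waits a random time of up to
    // 300 ms, then records the item so the completion order can be inspected.
    async Task SomeMethodAsync(string item)
    {
        await Task.Delay(Random.Shared.Next(300));
        completed.Enqueue(item);
    }

    var myStuff = new[] { "A:1", "A:2", "A:3", "B:1", "B:2", "C:1", "C:2", "C:3", "C:4" };

    await Task.WhenAll(
        myStuff.GroupBy(s => s.Split(':')[0]).Select(async g =>
        {
            var data = g.ToArray();
            await Task.WhenAll(data[..^1].Select(SomeMethodAsync)); // earlier versions, concurrently
            await SomeMethodAsync(data[^1]);                        // last version, afterwards
        }));

    // Overall completion order differs from run to run.
    var order = completed.ToArray();
    Console.WriteLine(string.Join(", ", order));

    // For each letter, the last item to complete should be the last version.
    var lastPerGroup = order.GroupBy(s => s[0]).Select(g => g.Last()).OrderBy(s => s);
    Console.WriteLine(string.Join(", ", lastPerGroup)); // A:3, B:2, C:4
    ```

    The first line printed changes between runs, while the second stays the same, because each group's final item is only started after the rest of that group has been recorded.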


    Needing to get the list for each group in the above code isn't great, but efficiently excluding the last item from a set of unknown size is tricky (Last() and Count() aren't great in that situation, because they force extra enumerations). You can do it by making an iterator to buffer an item as you go, returning the buffered item instead of the current item, and then using the final buffered item at the end. However, the maintenance and complexity cost of that code probably outweighs the win in this case.

    But based on comments, the extra enumerations may not be a problem and this could work even better by removing the list allocation for each group:

    var myStuff = new string[] {"A:1", "A:2", "A:3", "B:1", "B:2", "C:1", "C:2", "C:3", "C:4"};
    var groups = myStuff.GroupBy(s => s.Split(':')[0]);
    await Task.WhenAll( 
        groups.Select(async d => {
            var count = d.Count();
            await Task.WhenAll(d.Take(count-1).Select(async item => await SomeMethodAsync(item)));
            await SomeMethodAsync(d.Last());
        })
    );
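
    For completeness, the buffering iterator mentioned earlier could look something like the sketch below. AllButLast and its onLast callback are names made up for this illustration, not an existing API, and top-level statements (.NET 6 or later) are assumed. It walks the source once, always holding one item back, and hands the held item to the callback when the source runs out.

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.Linq;

    // Hypothetical helper: yields every element except the last in a single pass.
    // The held-back final element is passed to onLast once the source is exhausted.
    static IEnumerable<T> AllButLast<T>(IEnumerable<T> source, Action<T> onLast)
    {
        using var e = source.GetEnumerator();
        if (!e.MoveNext()) yield break;   // empty source: nothing to yield, no last item

        var buffered = e.Current;
        while (e.MoveNext())
        {
            yield return buffered;        // safe to emit: a later element exists
            buffered = e.Current;
        }
        onLast(buffered);
    }

    var group = new[] { "C:1", "C:2", "C:3", "C:4" };
    string last = "";
    var earlier = AllButLast(group, item => last = item).ToList();

    Console.WriteLine(string.Join(", ", earlier)); // C:1, C:2, C:3
    Console.WriteLine(last);                       // C:4
    ```

    Note that last is only set once the sequence has been enumerated to the end, which Task.WhenAll would do when given the projected tasks. If only the "skip the final item" half is needed, Enumerable.SkipLast(1) already does that buffering, though it does not hand back the skipped item.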