Tags: c#, asynchronous, compiler-construction, ndepend

Why does the async keyword generate an enumerator & additional struct when compiled?


If I create a simple class like the following:

public class TestClass
{
    public Task TestMethod(int someParameter)
    {
        return Task.FromResult(someParameter);
    }

    public async Task TestMethod(bool someParameter)
    {
        await Task.FromResult(someParameter);
    }
}

and examine it in NDepend, it shows that the async overload of TestMethod (the one taking a bool) has a struct generated for it, containing an enumerator, the enumerator state machine, and some additional members.


Why does the compiler generate a struct called TestClass+<TestMethod>d__0 with an enumerator for the async method?

It seems to generate more IL than what the actual method produces. In this example, the compiler generates 35 lines of IL for my class, while it generates 81 lines of IL for the struct. It's also increasing the complexity of the compiled code and causing NDepend to flag it for several rule violations.


Solution

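  • This is because the async and await keywords are just syntactic sugar for a compiler transformation closely related to coroutines.

    There are no special IL instructions to support asynchronous methods. Instead, the compiler rewrites each async method into a state machine: a generated type (the struct you see in NDepend) that implements IAsyncStateMachine. Its MoveNext method holds the original method body and resumes it after each await, which is why it looks like an enumerator.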
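
    To make that concrete for the method in the question, here is a hand-written approximation of what the compiler might emit for the async TestMethod(bool) overload. It is a sketch, not actual compiler output: the name TestMethodStateMachine and the field names are made up for illustration (the real struct is the TestClass+<TestMethod>d__0 from the question), and the real generated code differs in detail.

    ```csharp
    using System;
    using System.Runtime.CompilerServices;
    using System.Threading.Tasks;

    public class TestClass
    {
        // Stand-in for: public async Task TestMethod(bool someParameter)
        public Task TestMethod(bool someParameter)
        {
            var machine = new TestMethodStateMachine
            {
                Builder = AsyncTaskMethodBuilder.Create(),
                SomeParameter = someParameter,
                State = -1 // -1 means "not suspended at any await"
            };
            machine.Builder.Start(ref machine); // runs MoveNext for the first time
            return machine.Builder.Task;
        }

        // Hypothetical name; the compiler would call this <TestMethod>d__0.
        private struct TestMethodStateMachine : IAsyncStateMachine
        {
            public int State;
            public AsyncTaskMethodBuilder Builder;
            public bool SomeParameter; // the parameter is hoisted into a field
            private TaskAwaiter<bool> awaiter;

            public void MoveNext()
            {
                try
                {
                    TaskAwaiter<bool> local;
                    if (State == 0)
                    {
                        // Resuming after the await: pick up the stored awaiter.
                        local = awaiter;
                        awaiter = default(TaskAwaiter<bool>);
                        State = -1;
                    }
                    else
                    {
                        local = Task.FromResult(SomeParameter).GetAwaiter();
                        if (!local.IsCompleted)
                        {
                            // Suspend: remember where we are and ask to be
                            // called back through MoveNext when the task is done.
                            State = 0;
                            awaiter = local;
                            Builder.AwaitUnsafeOnCompleted(ref local, ref this);
                            return;
                        }
                    }
                    local.GetResult(); // rethrows if the awaited task faulted
                }
                catch (Exception ex)
                {
                    State = -2; // -2 means "finished"
                    Builder.SetException(ex);
                    return;
                }
                State = -2;
                Builder.SetResult();
            }

            public void SetStateMachine(IAsyncStateMachine stateMachine)
            {
                Builder.SetStateMachine(stateMachine);
            }
        }
    }
    ```

    Because Task.FromResult returns an already completed task, MoveNext takes the fast path here and finishes in a single synchronous call. The extra fields and branches exist for the case where the awaited task is still running, and they account for most of the additional IL.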

    I will try to make this example as short as possible:

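    using System;
    using System.Text;
    using System.Threading.Tasks;
    using Microsoft.VisualStudio.TestTools.UnitTesting;
    
    [TestClass]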
    public class AsyncTest
    {
        [TestMethod]
        public async Task RunTest_1()
        {
            var result = await GetStringAsync();
            Console.WriteLine(result);
        }
    
        private async Task AppendLineAsync(StringBuilder builder, string text)
        {
            await Task.Delay(1000);
            builder.AppendLine(text);
        }
    
        public async Task<string> GetStringAsync()
        {
            // Code before first await
            var builder = new StringBuilder();
            var secondLine = "Second Line";
    
            // First await
            await AppendLineAsync(builder, "First Line");
    
            // Inner synchronous code
            builder.AppendLine(secondLine);
    
            // Second await
            await AppendLineAsync(builder, "Third Line");
    
            // Return
            return builder.ToString();
        }
    }
    

    This is ordinary async code of the kind you're probably used to: GetStringAsync first creates a StringBuilder synchronously, then awaits two asynchronous calls, and finally returns the result. How would this be implemented if there were no await keyword?

    Add the following code to the AsyncTest class:

    [TestMethod]
    public async Task RunTest_2()
    {
        var result = await GetStringAsyncWithoutAwait();
        Console.WriteLine(result);
    }
    
    public Task<string> GetStringAsyncWithoutAwait()
    {
        // Code before first await
        var builder = new StringBuilder();
        var secondLine = "Second Line";
    
        return new StateMachine(this, builder, secondLine).CreateTask();
    }
    
    private class StateMachine
    {
        private readonly AsyncTest instance;
        private readonly StringBuilder builder;
        private readonly string secondLine;
        private readonly TaskCompletionSource<string> completionSource;
    
        private int state = 0;
    
        public StateMachine(AsyncTest instance, StringBuilder builder, string secondLine)
        {
            this.instance = instance;
            this.builder = builder;
            this.secondLine = secondLine;
            this.completionSource = new TaskCompletionSource<string>();
        }
    
        public Task<string> CreateTask()
        {
            DoWork();
            return this.completionSource.Task;
        }
    
        private void DoWork()
        {
            switch (this.state)
            {
                case 0:
                    goto state_0;
                case 1:
                    goto state_1;
                case 2:
                    goto state_2;
            }
    
            state_0:
                this.state = 1;
    
                // First await
                var firstAwaiter = this.instance.AppendLineAsync(builder, "First Line")
                                            .GetAwaiter();
                firstAwaiter.OnCompleted(DoWork);
                return;
    
            state_1:
                this.state = 2;
    
                // Inner synchronous code
                this.builder.AppendLine(this.secondLine);
    
                // Second await
                var secondAwaiter = this.instance.AppendLineAsync(builder, "Third Line")
                                                .GetAwaiter();
                secondAwaiter.OnCompleted(DoWork);
                return;
    
            state_2:
                // Return
                var result = this.builder.ToString();
                this.completionSource.SetResult(result);
        }
    }
    

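    In this hand-written version, the code before the first await stays where it was. Everything else moves into a state machine that uses goto statements to run the original code piece by piece: every time an awaited task completes, its continuation calls DoWork again and the state machine advances to the next step. (The real compiler moves the whole method body, including the part before the first await, into the state machine.)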

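    This example is deliberately oversimplified to show what happens behind the scenes. For instance, it never calls GetResult on the awaiters, so exceptions from the awaited tasks are silently dropped. Add error handling and some foreach loops to your async method, and the state machine gets much more complex.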
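
    As a rough idea of what error handling adds, here is a minimal sketch of the same state machine with exception propagation. It is a standalone variant rather than a drop-in replacement: the class name FaultAwareStateMachine and the appendLineAsync delegate (standing in for AppendLineAsync) are assumptions made so the block compiles on its own.

    ```csharp
    using System;
    using System.Runtime.CompilerServices;
    using System.Text;
    using System.Threading.Tasks;

    // Hypothetical variant of the StateMachine class that forwards exceptions.
    public class FaultAwareStateMachine
    {
        private readonly Func<StringBuilder, string, Task> appendLineAsync;
        private readonly StringBuilder builder = new StringBuilder();
        private readonly TaskCompletionSource<string> completionSource =
            new TaskCompletionSource<string>();

        private int state = 0;
        private TaskAwaiter awaiter; // kept in a field so the next step can inspect it

        public FaultAwareStateMachine(Func<StringBuilder, string, Task> appendLineAsync)
        {
            this.appendLineAsync = appendLineAsync;
        }

        public Task<string> CreateTask()
        {
            DoWork();
            return completionSource.Task;
        }

        private void DoWork()
        {
            try
            {
                switch (state)
                {
                    case 0:
                        state = 1;
                        awaiter = appendLineAsync(builder, "First Line").GetAwaiter();
                        awaiter.OnCompleted(DoWork);
                        return;

                    case 1:
                        // Rethrows here if the first awaited task faulted.
                        awaiter.GetResult();
                        builder.AppendLine("Second Line");

                        state = 2;
                        awaiter = appendLineAsync(builder, "Third Line").GetAwaiter();
                        awaiter.OnCompleted(DoWork);
                        return;

                    case 2:
                        awaiter.GetResult();
                        completionSource.SetResult(builder.ToString());
                        return;
                }
            }
            catch (Exception ex)
            {
                // Any failure ends the state machine and faults the returned task.
                completionSource.SetException(ex);
            }
        }
    }
    ```

    The awaiter has to be stored in a field because the step that starts an awaited call and the step that observes its outcome run in different invocations of DoWork. Every local variable that lives across an await gets hoisted in the same way, which is one reason the generated type grows so quickly.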

    By the way, another C# construct works the same way: the yield keyword. It also makes the compiler generate a state machine, and the generated code looks quite similar to what await produces.
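
    To illustrate the similarity, here is a small iterator next to a hand-written enumerator that behaves the same way. The names YieldDemo and LinesEnumerator are invented for this sketch, and the class the compiler actually generates differs in detail.

    ```csharp
    using System;
    using System.Collections;
    using System.Collections.Generic;

    public static class YieldDemo
    {
        // What you write: the compiler turns this into a hidden state machine.
        public static IEnumerable<string> Lines()
        {
            yield return "First Line";
            yield return "Second Line";
        }

        // Hand-written approximation of the enumerator behind Lines().
        public sealed class LinesEnumerator : IEnumerator<string>
        {
            private int state = 0;

            public string Current { get; private set; }
            object IEnumerator.Current { get { return Current; } }

            public bool MoveNext()
            {
                switch (state)
                {
                    case 0:
                        Current = "First Line";
                        state = 1;
                        return true;
                    case 1:
                        Current = "Second Line";
                        state = 2;
                        return true;
                    default:
                        return false; // sequence exhausted
                }
            }

            public void Reset() { throw new NotSupportedException(); }
            public void Dispose() { }
        }
    }
    ```

    In both cases MoveNext runs the code up to the next suspension point and records where to continue. The difference is who calls it: a foreach loop drives an iterator, while the completion of the awaited task drives an async method.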

    For further reading, see this CodeProject article, which takes a deeper look at the generated state machine.