.net, multithreading, asynchronous, async-ctp

Converting non-trivial code to new .NET async pattern - how to handle yield loops


I am writing a library to work with Azure Table Storage. The basic pattern is that a given HTTP request returns a number of results in the content stream and a pointer to the next set of results in the headers. As the results are read from the stream, they are yielded. I am using the System.Net.Http library (previously Microsoft.Net.Http), whose latest version removed the synchronous HttpClient.Send and the other synchronous methods in favor of Task-based ones. I've used Tasks before, but not for something this complex, and I am having a hard time getting started.

The calls that have been converted to the async pattern are HttpClient.Send and response.Content.ContentReadStream. I've trimmed the code so that only the important parts are shown.

var queryUri = _GetTableQueryUri(tableServiceUri, tableName, query, null, null, timeout);
while(true) {
    var continuationParitionKey = "";
    var continuationRowKey = "";
    using (var request = GetRequest(queryUri, null, action.Method, azureAccountName, azureAccountKey))
    {
        using (var client = new HttpClient())
        {
            using (var response = client.Send(request, HttpCompletionOption.ResponseHeadersRead))
            {
                continuationParitionKey = // stuff from headers
                continuationRowKey = // stuff from headers

                using (var reader = XmlReader.Create(response.Content.ContentReadStream))
                {
                    while (reader.Read())
                    {
                        if (reader.NodeType == XmlNodeType.Element && reader.Name == "entry" && reader.NamespaceURI == "http://www.w3.org/2005/Atom")
                        {
                            yield return XElement.ReadFrom(reader) as XElement;
                        }
                    }
                    reader.Close();
                }
            }
        }
    }
    if (continuationParitionKey == null && continuationRowKey == null)
        break;

    queryUri = _GetTableQueryUri(tableServiceUri, tableName, query, continuationParitionKey, continuationRowKey, timeout);
}

An example of one that I have successfully converted is below.

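client.SendAsync(request, HttpCompletionOption.ResponseHeadersRead).ContinueWith(task =>
    {
        using (var response = task.Result)
        {
            if (response.StatusCode == HttpStatusCode.Created && action == HttpMethod.Post)
            {
                // .Result blocks this continuation until the content stream is available.
                return XElement.Load(response.Content.ReadAsStreamAsync().Result);
            }
        }
        // Every code path in the lambda must return a value for it to compile.
        return null;
    });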

Does anyone have any suggestions on how to convert the loop/yield to the new pattern?

Thanks! Erick


Solution

  • As you've discovered, async doesn't work well with yield right now. Even though the compiler applies similar code transformations to both, their goals are quite different.

    There are two solutions. One is to provide a buffer and use a producer/consumer approach; System.Threading.Tasks.Dataflow.dll is useful for managing buffers in complex code.
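
    A minimal sketch of the buffered producer/consumer idea follows. It swaps in BlockingCollection<T> from the base class library rather than Dataflow, purely to stay self-contained. The names BufferedPaging, Page<T>, ReadAll and fetchPageAsync are hypothetical and not part of the original code; fetchPageAsync stands in for "send the request, parse the entries, and read the continuation from the headers".

    ```csharp
    using System;
    using System.Collections.Concurrent;
    using System.Collections.Generic;
    using System.Threading.Tasks;

    public static class BufferedPaging
    {
        // Hypothetical page shape: the items parsed from one response plus the
        // continuation token from its headers (null when there are no more pages).
        public sealed class Page<T>
        {
            public IReadOnlyList<T> Items { get; set; }
            public string Continuation { get; set; }
        }

        // The producer fetches pages asynchronously and fills a bounded buffer;
        // the consumer is an ordinary iterator that drains the buffer.
        public static IEnumerable<T> ReadAll<T>(
            Func<string, Task<Page<T>>> fetchPageAsync, int capacity = 100)
        {
            var buffer = new BlockingCollection<T>(capacity);

            Task producer = Task.Run(async () =>
            {
                try
                {
                    string continuation = null; // null requests the first page
                    do
                    {
                        Page<T> page = await fetchPageAsync(continuation);
                        foreach (T item in page.Items)
                            buffer.Add(item); // blocks while the buffer is full
                        continuation = page.Continuation;
                    } while (continuation != null);
                }
                finally
                {
                    // Always signal completion so the consumer loop can end,
                    // even if a fetch failed.
                    buffer.CompleteAdding();
                }
            });

            foreach (T item in buffer.GetConsumingEnumerable())
                yield return item;

            // Rethrow any exception raised by the producer.
            producer.GetAwaiter().GetResult();
        }
    }
    ```

    The caller still sees a plain IEnumerable<T>, so existing foreach code keeps working. One known gap in this sketch: if the consumer stops enumerating early, the producer can stay blocked on a full buffer, so real code would probably also need a CancellationToken.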

    The other solution is to write an "async enumerator." This is conceptually closer to what your code should be doing, but this solution is much more complex than the producer/consumer solution.

    The "async enumerator" type is discussed a bit in this video on Rx, and you can download it from an experimental Rx package (note that even though this is done by the Rx team, it actually does not use Rx).
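
    To make the "async enumerator" idea concrete, here is a hand-rolled sketch. It is not the type from the Rx package; IAsyncItemEnumerator<T>, ResultPage<T>, PagedAsyncEnumerator<T> and the fetchPageAsync delegate are all hypothetical names chosen for illustration. The shape mirrors IEnumerator<T>, except that moving to the next item returns a Task so the caller can await it.

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.Threading.Tasks;

    // Hypothetical minimal async enumerator: like IEnumerator<T>, but awaitable.
    public interface IAsyncItemEnumerator<T>
    {
        T Current { get; }
        Task<bool> MoveNextAsync();
    }

    // Hypothetical page shape: parsed items plus the continuation token
    // (null when there are no more pages).
    public sealed class ResultPage<T>
    {
        public IReadOnlyList<T> Items { get; set; }
        public string Continuation { get; set; }
    }

    public sealed class PagedAsyncEnumerator<T> : IAsyncItemEnumerator<T>
    {
        private readonly Func<string, Task<ResultPage<T>>> _fetchPageAsync;
        private IReadOnlyList<T> _items = new T[0];
        private int _index = -1;
        private string _continuation;
        private bool _started;

        public PagedAsyncEnumerator(Func<string, Task<ResultPage<T>>> fetchPageAsync)
        {
            _fetchPageAsync = fetchPageAsync;
        }

        public T Current { get; private set; }

        public async Task<bool> MoveNextAsync()
        {
            // Loop rather than branch once: a page may be empty and still
            // point at a following page.
            while (++_index >= _items.Count)
            {
                if (_started && _continuation == null)
                    return false; // last page exhausted

                ResultPage<T> page = await _fetchPageAsync(_continuation);
                _started = true;
                _items = page.Items;
                _continuation = page.Continuation;
                _index = -1;
            }

            Current = _items[_index];
            return true;
        }
    }
    ```

    A caller would write `while (await e.MoveNextAsync()) { Use(e.Current); }`, where Use is a placeholder for whatever processes each item. The state machine that yield used to generate has to be written by hand here, which is where the extra complexity comes from. Later versions of C# (8.0 and up) cover this case directly with IAsyncEnumerable<T> and await foreach.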