TPL Dataflow TransformBlock Execution Sequence Seems Out Of Order / Async


I'm following the MSDN walkthrough "Walkthrough: Creating a Dataflow Pipeline". I created a single TransformBlock and ran it by calling Post on it.

  // Process "The Adventurous Life of a Versatile Artist: Houdini" 
  //         by Harry Houdini.
  downloadString.Post("http://www.gutenberg.org/cache/epub/45370/pg45370.txt");

After that, I call the Complete method, followed by a Console.WriteLine("Press a key to exit:") line.

Here's the complete code. You can also find it at this stage in this commit on my GitHub repo.

using System;
using System.Net.Http;
using System.Threading.Tasks.Dataflow;

namespace Palindromes.ConsoleApp
{
  class Program
  {
    static void Main(string[] args)
    {
      // 
      // Create members of the Pipeline
      //

      // Download the requested resource as a string

      var downloadString = new TransformBlock<string, string>
        ( url =>
          {
            Console.WriteLine($"Downloading from {url}...");
            string result = null;
            using (var client = new HttpClient())
            {
              // Perform a synchronous call by calling .Result
              var response = client.GetAsync(url).Result;

              if (response.IsSuccessStatusCode)
              {
                var responseContent = response.Content;

                // read result synchronously by calling .Result 
                result = responseContent.ReadAsStringAsync().Result;
                if (!string.IsNullOrEmpty(result))
                  Console.WriteLine($"Downloaded {result.Length} characters...");

              }
            }
            return result;
          }
        );

      // Process "The Adventurous Life of a Versatile Artist: Houdini" 
      //         by Harry Houdini.
      downloadString.Post("http://www.gutenberg.org/cache/epub/45370/pg45370.txt");
      downloadString.Complete();

      Console.WriteLine("Press a key to exit:");
      Console.ReadKey();
    }
  }
}

When I execute this console app, I expect to see the output as follows.

Expected Output

Downloading from http://www.gutenberg.org/cache/epub/45370/pg45370.txt...
Downloaded 129393 characters...
Press a key to exit:

But here's the actual output. (I've run it several times, and the Console.WriteLine output shows up in the same sequence each time.)

Actual Output

Press a key to exit:
Downloading from http://www.gutenberg.org/cache/epub/45370/pg45370.txt...
Downloaded 129393 characters...

Why is the Press a key to exit line getting executed before the TransformBlock's Console.WriteLines get called?

Shouldn't the TransformBlock's Console.WriteLines be called first since I am invoking it first, and since this is going to be part of a Pipeline? Also I don't have any async code as far as I can tell, and I don't fully know the inner workings of the TPL Dataflow, so why does this appear to be executing out of order?

Thank you!


Solution

  • Why is the Press a key to exit line getting executed before the TransformBlock's Console.WriteLines get called?

    The call to Console.WriteLine("Press a key to exit:") happens before the TransformBlock has finished running the transform function. Each item posted to the TransformBlock is processed asynchronously with respect to your main thread: Post only queues the item and returns, and the block then runs the delegate on a thread-pool thread.

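    Going forward, if you want to wait for your pipeline to finish, you need to either block on its Completion task or await it in an async method. Keep in mind that a TransformBlock's Completion task only finishes once its output has been consumed, so the block needs a target (here, a null target that discards the result):

    private static async Task MainAsync() {
        // Discard the output; a TransformBlock does not complete
        // while items are still waiting in its output buffer.
        downloadString.LinkTo(DataflowBlock.NullTarget<string>());

        // Process "The Adventurous Life of a Versatile Artist: Houdini" 
        //         by Harry Houdini.
        downloadString.Post("http://www.gutenberg.org/cache/epub/45370/pg45370.txt");
        downloadString.Complete();
    
        await downloadString.Completion;
    }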
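
For the blocking alternative, here is a minimal, self-contained sketch that waits on the Completion task from a synchronous Main. It is an illustration under a few assumptions, not the walkthrough's code: the download is replaced by a short Thread.Sleep so it runs without network access, and the block names (transform, print) and the sample input are made up for this example.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks.Dataflow;

class Program
{
    static void Main()
    {
        // Stand-in for the download step (assumption: a short sleep
        // instead of an HTTP call, so the sketch needs no network).
        var transform = new TransformBlock<string, int>(text =>
        {
            Console.WriteLine($"Processing \"{text}\"...");
            Thread.Sleep(500);
            return text.Length;
        });

        // Consume the transform's output so the pipeline can drain.
        var print = new ActionBlock<int>(length =>
            Console.WriteLine($"Processed {length} characters..."));

        // PropagateCompletion forwards Complete() from transform to print.
        transform.LinkTo(print, new DataflowLinkOptions { PropagateCompletion = true });

        transform.Post("hello dataflow");
        transform.Complete();

        // Block the main thread until the last block has finished.
        print.Completion.Wait();

        Console.WriteLine("Pipeline finished.");
    }
}
```

Because Main waits on the final block's Completion, the "Pipeline finished." line is printed only after both messages from the blocks. Removing the Wait() call brings back the ordering seen in the question, where the main thread races ahead of the blocks.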