c# asynchronous threadpool iocp

Async IO without completion port?


// Case 1: File.Open() (no FileOptions.Asynchronous)
using(FileStream file = File.Open("ConfusedDev", FileMode.Open)) {
    await file.ReadAsync(buffer, 0, 1024);
    Thread.Sleep(Timeout.Infinite); // threadpool shows 1 worker thread in use
}

// Case 2: FileStream constructor with FileOptions.Asynchronous
using(FileStream file = new FileStream("ConfusedDev", FileMode.Open, FileAccess.Read, FileShare.Read, 1024, FileOptions.Asynchronous)) {
    await file.ReadAsync(buffer, 0, 1024);
    Thread.Sleep(Timeout.Infinite); // threadpool shows 1 async IO thread in use
}
  • Is it safe to say that case 1 is equivalent to Task.Run(() => file.Read)? In other words, a thread pool thread is blocked until the read returns, whereas case 2 blocks no thread, as described in the post "There is no thread".
  • When should I use case 1 (which seems to be the default approach shown in the Microsoft docs) over case 2? I am doing some work on the server side, so case 2 would probably leave me more spare threads for incoming requests?
  • Does this only happen with files? I tested against httpClient().GetAsync() and it uses an async IO thread by default, but maybe there are implementations where GetAsync() spins off another thread?

Solution

  • Most of your question seems to be answered simply by reviewing the source code:

    • Is it safe to say that case 1 is equivalent to Task.Run(() => file.Read)? In other words, a thread pool thread is blocked until the read returns, whereas case 2 blocks no thread, as described in the post "There is no thread".

    File.Open() does not pass FileOptions.Asynchronous to the FileStream constructor, so any asynchronous calls on the resulting stream are implemented using blocking I/O on a thread pool thread. Specifically, the call to ReadAsync() ultimately winds up calling FileStream.BeginRead(), and if the instance wasn't created using FileOptions.Asynchronous, it delegates the read to the base class BeginRead(), which eventually executes this anonymous method as a task:

    delegate
    {
        // The ReadWriteTask stores all of the parameters to pass to Read.
        // As we're currently inside of it, we can get the current task
        // and grab the parameters from it.
        var thisTask = Task.InternalCurrent as ReadWriteTask;
        Contract.Assert(thisTask != null, "Inside ReadWriteTask, InternalCurrent should be the ReadWriteTask");
    
        // Do the Read and return the number of bytes read
        var bytesRead = thisTask._stream.Read(thisTask._buffer, thisTask._offset, thisTask._count);
        thisTask.ClearBeginState(); // just to help alleviate some memory pressure
        return bytesRead;
    }
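
One way to check which path a given stream will take is the FileStream.IsAsync property. The following is a minimal sketch, not part of the original answer: the class name IsAsyncDemo, the Probe helper, and the temporary file are all made up for illustration. On Windows, the stream from File.Open() should report false and the stream opened with FileOptions.Asynchronous should report true.

```csharp
using System;
using System.IO;

static class IsAsyncDemo
{
    // Hypothetical helper: opens the same file both ways and reports
    // FileStream.IsAsync for each stream.
    public static (bool viaFileOpen, bool viaAsyncOption) Probe(string path)
    {
        bool viaFileOpen;
        using (FileStream fs = File.Open(path, FileMode.Open))
        {
            // Case 1: no FileOptions.Asynchronous is passed.
            viaFileOpen = fs.IsAsync;
        }

        bool viaAsyncOption;
        using (var fs = new FileStream(path, FileMode.Open, FileAccess.Read,
                                       FileShare.Read, 1024, FileOptions.Asynchronous))
        {
            // Case 2: the handle is opened for asynchronous (overlapped) I/O.
            viaAsyncOption = fs.IsAsync;
        }

        return (viaFileOpen, viaAsyncOption);
    }

    static void Main()
    {
        // Assumption: a throwaway temp file stands in for "ConfusedDev".
        string path = Path.GetTempFileName();
        try
        {
            File.WriteAllBytes(path, new byte[4096]);
            var result = Probe(path);
            Console.WriteLine("File.Open():              IsAsync = " + result.viaFileOpen);
            Console.WriteLine("FileOptions.Asynchronous: IsAsync = " + result.viaAsyncOption);
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```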
    

    While I agree whole-heartedly with the "There is no thread" essay, it's important to avoid taking it too literally. IOCP is more efficient than dedicating threads to individual operations, but it still involves some threads. It's just that a much smaller pool of threads can be used, and any given thread is able to respond to the completion of a larger number of operations.
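
The "threads in use" figures in the question's comments can be reproduced with the ThreadPool counters, which track worker threads and I/O completion threads separately. This is only a sketch of how such numbers might be obtained; the PoolUsage class and InUse method are hypothetical names, and the question does not say how its own numbers were measured.

```csharp
using System;
using System.Threading;

static class PoolUsage
{
    // Hypothetical helper: threads in use = maximum minus currently available,
    // reported separately for worker threads and I/O completion threads.
    public static (int workers, int ioThreads) InUse()
    {
        ThreadPool.GetMaxThreads(out int maxWorkers, out int maxIo);
        ThreadPool.GetAvailableThreads(out int freeWorkers, out int freeIo);
        return (maxWorkers - freeWorkers, maxIo - freeIo);
    }

    static void Main()
    {
        var usage = InUse();
        Console.WriteLine("Worker threads in use:         " + usage.workers);
        Console.WriteLine("I/O completion threads in use: " + usage.ioThreads);
    }
}
```

Calling InUse() from the continuation after each ReadAsync() in the question's two cases is one way to see which pool the continuation is occupying.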

    • When should I use case 1 (which seems to be the default approach shown in the Microsoft docs) over case 2? I am doing some work on the server side, so case 2 would probably leave me more spare threads for incoming requests?

    That question is really too broad, and primarily opinion-based anyway. But it's my opinion that you should always use FileOptions.Asynchronous for any significant file I/O. And if for some reason you decide to forgo that and use File.Open() anyway, then you should not bother to use any of the asynchronous calls.

    File.Open() is convenient and fine for short programs that do very simple synchronous I/O. But you should never use File.Open() if you are going to then call the asynchronous methods on the FileStream object (e.g. ReadAsync(), BeginRead(), etc.). If it's important enough that the code operates asynchronously, then it's important enough to make sure it actually does so using the efficient asynchronous features in Windows.
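
As a rough illustration of that advice, a read loop that opens the file with FileOptions.Asynchronous might look like the sketch below. The AsyncFileReader class, the ReadAllAsync method, and the 4096-byte buffer size are assumptions chosen for the example, not anything prescribed by the framework.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

static class AsyncFileReader
{
    // Hypothetical helper: reads a whole file using a stream that was opened
    // for asynchronous I/O, so ReadAsync() does not fall back to blocking
    // Read() calls on a thread pool thread.
    public static async Task<byte[]> ReadAllAsync(string path)
    {
        using (var file = new FileStream(path, FileMode.Open, FileAccess.Read,
                                         FileShare.Read, 4096, FileOptions.Asynchronous))
        using (var result = new MemoryStream())
        {
            var buffer = new byte[4096];
            int read;
            // ReadAsync may return fewer bytes than requested; 0 means end of file.
            while ((read = await file.ReadAsync(buffer, 0, buffer.Length).ConfigureAwait(false)) > 0)
            {
                result.Write(buffer, 0, read);
            }
            return result.ToArray();
        }
    }

    static void Main()
    {
        // Assumption: a throwaway temp file is used as sample input.
        string path = Path.GetTempFileName();
        try
        {
            File.WriteAllBytes(path, new byte[10000]);
            byte[] data = ReadAllAsync(path).GetAwaiter().GetResult();
            Console.WriteLine("Read " + data.Length + " bytes");
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```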

    • Does this only happen with files? I tested against httpClient().GetAsync() and it uses an async IO thread by default, but maybe there are implementations where GetAsync() spins off another thread?

    The behavior you're asking about is, obviously, specific to the only difference shown in your code examples: using File.Open() (which doesn't pass FileOptions.Asynchronous) and using the FileStream constructor with the FileOptions.Asynchronous option. So it doesn't really make sense to ask whether this happens only to files. The File.Open() method and the FileStream object are by definition applicable only to files.

    That said, if you were to find a different class with a similar option (i.e. one that enables or disables asynchronous I/O), it would surely work the same way (i.e. not use IOCP unless the option is enabled). But something like HttpClient or NetworkStream is built on top of the Socket class, which has no such option. Their asynchronous operations go through that class's implementation of asynchronous I/O, which always uses IOCP.

    So, no…you're not going to find an option in the HttpClient class that disables the use of IOCP for asynchronous operations.

    Of course, you can always wrap the synchronous calls yourself, to use the main thread pool instead of the IOCP thread pool, and that would then behave just like asynchronous calls on a non-asynchronous FileStream object.
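
A minimal sketch of such a wrapper is shown below. BlockingReadOnPool and ReadOnWorkerThreadAsync are hypothetical names; the point is only that the blocking Read() call occupies a worker thread pool thread until it returns, which is roughly what case 1 in the question does.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

static class BlockingReadOnPool
{
    // Hypothetical wrapper: runs the synchronous Stream.Read() on a worker
    // thread pool thread. The caller gets a Task, but a thread stays blocked
    // for the duration of the read.
    public static Task<int> ReadOnWorkerThreadAsync(Stream stream, byte[] buffer, int offset, int count)
    {
        return Task.Run(() => stream.Read(buffer, offset, count));
    }

    static void Main()
    {
        // Assumption: a MemoryStream stands in for any synchronous stream.
        using (var stream = new MemoryStream(new byte[] { 1, 2, 3 }))
        {
            var buffer = new byte[16];
            int read = ReadOnWorkerThreadAsync(stream, buffer, 0, buffer.Length)
                .GetAwaiter().GetResult();
            Console.WriteLine("Read " + read + " bytes");
        }
    }
}
```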