c# async-await deadlock tcpclient contextswitchdeadlock

TcpClient Exception Deadlock


I'm seeing curious behaviour in some code I inherited. The simplified example below demonstrates the issue when built as a plain Console App.

The WHOIS server hits its usage allowance on the 5th call, returns a message, and shuts the socket. With ReadLineAsync this produces a SocketException and some IOExceptions. Sometimes these are caught in the catch block and everything is as it should be; most of the time they are not caught and the program simply hangs. Break All in the VS debugger shows the main thread sitting on one of the calls to Console.WriteLine. This behaviour persists when running the .exe file directly, outside the debugger.

Can anyone see what is happening here, and why?

In practice I can work around the problem by using Peek(), but I'd like to know why the exception is not being caught, and what causes the "deadlock". Presumably it is some kind of threading or context issue. If it's something I'm doing, I'd like to know what it is so I can avoid it elsewhere!

using System;
using System.IO;
using System.Net.Sockets;
using System.Text;
using System.Threading.Tasks;

namespace AsyncIssueConsoleApplication
{
class Program
{
    static void Main(string[] args)
    {
        Console.WriteLine(Task.Run(() => LookupAsync("elasticsearch.org")).Result);
        Console.WriteLine(Task.Run(() => LookupAsync("elasticsearch.org")).Result);
        Console.WriteLine(Task.Run(() => LookupAsync("elasticsearch.org")).Result);
        Console.WriteLine(Task.Run(() => LookupAsync("elasticsearch.org")).Result);
        Console.WriteLine(Task.Run(() => LookupAsync("elasticsearch.org")).Result);
        Console.WriteLine(Task.Run(() => LookupAsync("elasticsearch.org")).Result);
        Console.WriteLine(Task.Run(() => LookupAsync("elasticsearch.org")).Result);
        Console.WriteLine(Task.Run(() => LookupAsync("elasticsearch.org")).Result);
        Console.WriteLine(Task.Run(() => LookupAsync("elasticsearch.org")).Result);
        Console.WriteLine(Task.Run(() => LookupAsync("elasticsearch.org")).Result);
        Console.WriteLine(Task.Run(() => LookupAsync("elasticsearch.org")).Result);
        Console.ReadLine();
    }

    private static async Task<string> LookupAsync(string domain)
    {
        StringBuilder builder = new StringBuilder();
        TcpClient tcp = new TcpClient();
        await tcp.ConnectAsync("whois.pir.org", 43).ConfigureAwait(false);
        string strDomain = "" + domain + "\r\n";
        byte[] bytDomain = Encoding.ASCII.GetBytes(strDomain.ToCharArray());
        try
        {
            using (Stream s = tcp.GetStream())
            {
                await s.WriteAsync(bytDomain, 0, strDomain.Length).ConfigureAwait(false);
                using (StreamReader sr = new StreamReader(s, Encoding.ASCII))
                {
                    try
                    {
                        //This is fine
                        /*while (sr.Peek() >= 0)
                        {
                            builder.AppendLine(await sr.ReadLineAsync());
                        }*/

                        //This isn't - produces SocketException which usually isn't caught below
                        string strLine = await sr.ReadLineAsync().ConfigureAwait(false);
                        while (null != strLine)
                        {
                            builder.AppendLine(strLine);
                            strLine = await sr.ReadLineAsync().ConfigureAwait(false);
                        }
                    }
                    catch (Exception e)
                    {
                        //Sometimes the SocketException/IOException is caught, sometimes not
                        return builder.ToString();
                    }
                }
            }

        }
        catch (Exception e)
        {
            return builder.ToString();
        }
        return builder.ToString();
    }
}
}

The suggested duplicate Q&A may be related, but as far as I can see it doesn't answer this question, certainly not fully. For example, it doesn't say what I would need to do about the SynchronizationContext, and I am already using ConfigureAwait(false).

When the code is deadlocked as described above, the stack trace is:

mscorlib.dll!System.Threading.Monitor.Wait(object obj, int millisecondsTimeout, bool exitContext)   Unknown
mscorlib.dll!System.Threading.Monitor.Wait(object obj, int millisecondsTimeout) Unknown
mscorlib.dll!System.Threading.ManualResetEventSlim.Wait(int millisecondsTimeout = -1, System.Threading.CancellationToken cancellationToken) Unknown
mscorlib.dll!System.Threading.Tasks.Task.SpinThenBlockingWait(int millisecondsTimeout, System.Threading.CancellationToken cancellationToken)    Unknown
mscorlib.dll!System.Threading.Tasks.Task.InternalWait(int millisecondsTimeout = -1, System.Threading.CancellationToken cancellationToken)   Unknown
mscorlib.dll!System.Threading.Tasks.Task<string>.GetResultCore(bool waitCompletionNotification = true)  Unknown
mscorlib.dll!System.Threading.Tasks.Task<System.__Canon>.Result.get()   Unknown
AsyncIssueConsoleApplication.exe!AsyncIssueConsoleApplication.Program.Main(string[] args = {string[0]}) Line 18 C#

The IOException is: {"Unable to read data from the transport connection: An existing connection was forcibly closed by the remote host."}

The SocketException is: {"An established connection was aborted by the software in your host machine"}


Solution

  • I can reproduce it now. In order to see where it is hanging I switched to synchronous IO. It is a common debugging issue with async IO that you cannot see what IOs are currently pending. This is a key reason why you might not want to use async IO in the first place.

    It's hanging because the remote server does not close the connection. ReadLine only returns null, which is what ends the read loop, once the remote side has closed the connection; until then the last call simply waits for more data.

    This could be a bug in the server's rate limiting code. It could also be expected behavior of the protocol. Maybe you are meant to send the next request now? Or maybe you are supposed to detect the rate limiting from the previous line and shut down the connection yourself.

    This is not a threading issue. There is no concurrency going on at all. All LookupAsync instances are running sequentially.

    I also tried properly closing the TcpClients in case the remote server behaves differently in the face of multiple connections. There was no effect. You should dispose your resources in any case, though. This is a severe leak.
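
The two practical points above (dispose the TcpClient, and don't rely on the server closing the connection) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the original code's fix: the idle timeout is one possible way to "shut down yourself" and is not something the answer prescribes, the names WhoisSketch and idleTimeout are made up for the example, and Main talks to a local loopback listener standing in for a server that replies once and then keeps the connection open, rather than to a real WHOIS host.

```csharp
using System;
using System.IO;
using System.Net;
using System.Net.Sockets;
using System.Text;
using System.Threading.Tasks;

static class WhoisSketch
{
    // Hypothetical helper: sends one query, then reads lines until the peer
    // closes the connection or no line arrives within idleTimeout (assumption).
    public static async Task<string> LookupAsync(
        string host, int port, string domain, TimeSpan idleTimeout)
    {
        var builder = new StringBuilder();
        using (var tcp = new TcpClient())   // disposed on every exit path
        {
            await tcp.ConnectAsync(host, port).ConfigureAwait(false);
            using (NetworkStream stream = tcp.GetStream())
            using (var reader = new StreamReader(stream, Encoding.ASCII))
            {
                byte[] query = Encoding.ASCII.GetBytes(domain + "\r\n");
                await stream.WriteAsync(query, 0, query.Length).ConfigureAwait(false);

                while (true)
                {
                    Task<string> read = reader.ReadLineAsync();
                    Task first = await Task.WhenAny(read, Task.Delay(idleTimeout))
                                           .ConfigureAwait(false);
                    if (first != read)
                    {
                        // The server went quiet without closing. Give up; disposing
                        // the socket below ends the pending read. Observe its
                        // eventual exception so it is not left unobserved.
                        var ignored = read.ContinueWith(
                            t => { var e = t.Exception; },
                            TaskContinuationOptions.OnlyOnFaulted);
                        break;
                    }
                    string line = await read.ConfigureAwait(false);
                    if (line == null) break;    // peer closed: normal end of reply
                    builder.AppendLine(line);
                }
            }
        }
        return builder.ToString();
    }

    static void Main()
    {
        // Local stand-in for a server that replies once and then stays silent
        // with the connection open.
        var listener = new TcpListener(IPAddress.Loopback, 0);
        listener.Start();
        int port = ((IPEndPoint)listener.LocalEndpoint).Port;

        Task server = Task.Run(async () =>
        {
            using (TcpClient peer = await listener.AcceptTcpClientAsync())
            {
                byte[] reply = Encoding.ASCII.GetBytes("LIMIT EXCEEDED\r\n");
                await peer.GetStream().WriteAsync(reply, 0, reply.Length);
                await Task.Delay(TimeSpan.FromSeconds(2)); // keep the socket open
            }
        });

        string result = LookupAsync("127.0.0.1", port, "example.org",
                                    TimeSpan.FromMilliseconds(500))
                            .GetAwaiter().GetResult();
        Console.Write(result);
        server.Wait();
        listener.Stop();
    }
}
```

With a plain `while (line != null)` loop the second read against this stand-in server would wait until the server finally closes; with the idle timeout the lookup returns whatever was received so far. If the real server's rate-limit message is known, checking for that line and breaking out immediately would avoid the wait entirely.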