c# http sockets tcpclient

How to determine if an HTTP response is complete


I am building a simple proxy that logs certain requests passing through it. At this point in the project the proxy does not need to interfere with the traffic, so I am trying to parse as little of the raw request/response as possible during the process (the request and response are pushed to a queue to be logged outside of the proxy).

My sample works fine, except that I cannot reliably tell when the "response" is complete, so connections are left open longer than needed. The relevant code is below:

var request = getRequest(url);
byte[] buffer;
int bytesRead = 1;
var dataSent = false;
var timeoutTicks = DateTime.Now.AddMinutes(1).Ticks;

Console.WriteLine("   Sending data to address: {0}", url);
Console.WriteLine("   Waiting for response from host...");
using (var outboundStream = request.GetStream()) {
   while (request.Connected && (DateTime.Now.Ticks < timeoutTicks)) {
      while (outboundStream.DataAvailable) {
         dataSent = true;
         buffer = new byte[OUTPUT_BUFFER_SIZE];
         bytesRead = outboundStream.Read(buffer, 0, OUTPUT_BUFFER_SIZE);

         if (bytesRead > 0) { _clientSocket.Send(buffer, bytesRead, SocketFlags.None); }

         Console.WriteLine("   pushed {0} bytes to requesting host...", bytesRead);
      }

      if (request.Connected) { Thread.Sleep(0); }
   }
}

Console.WriteLine("   Finished with response from host...");
Console.WriteLine("   Disconnecting socket");
_clientSocket.Shutdown(SocketShutdown.Both);

My question is whether there is an easy way to tell that the response is complete without parsing headers. Given that this response could be anything (encoded, encrypted, gzipped, etc.), I don't want to have to decode the actual response to get the length and determine whether I can disconnect my socket.


Solution

  • As David pointed out, connections should remain open for a period of time. You should not close connections unless the client side does that (or if the keep alive interval expires).

    Changing to HTTP/1.0 will not work since you are a server and it's the client that will specify HTTP/1.1 in the request. Sure, you can send an error message with HTTP/1.0 as the version and hope that the client changes to 1.0, but that seems inefficient.

    HTTP messages look like this:

    REQUEST LINE (STATUS LINE in a response)
    HEADERS
    (empty line)
    BODY
    

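    The usual way to know when a message is done, short of waiting for the connection to close, is to look at the Content-Length header. Search for "Content-Length:" (case-insensitively) in the header part of the buffer, extract everything up to the line feed, and trim the value before converting it to an int. The body ends that many bytes after the empty line. Note that a response sent with Transfer-Encoding: chunked carries no Content-Length; it ends with a zero-length chunk instead.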

    The other alternative is to use the parser in my webserver to get all headers. It should be quite easy to use just the parser and nothing more from the library.

    Update: There is a better parser here: HttpParser.cs
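
The Content-Length lookup described above could be sketched roughly as follows. This is only a minimal illustration, not the parser from the linked library: the class and method names (HttpFraming, FindBodyStart, GetContentLength, IsComplete) are made up for this example, and it assumes the whole message received so far sits in one byte buffer. It deliberately reports "not complete" when there is no usable Content-Length (for example a chunked response), so the caller would need a separate rule for those cases.

```csharp
using System;
using System.Text;

// Hypothetical helper, named for this example only.
public static class HttpFraming
{
    // Returns the index of the first body byte, or -1 if the empty line
    // (CR LF CR LF) that ends the headers has not been received yet.
    public static int FindBodyStart(byte[] data, int length)
    {
        for (int i = 0; i + 3 < length; i++)
        {
            if (data[i] == 13 && data[i + 1] == 10 &&
                data[i + 2] == 13 && data[i + 3] == 10)
            {
                return i + 4;
            }
        }
        return -1;
    }

    // Returns the Content-Length value, or -1 if the header is missing
    // or its value is not a non-negative integer.
    public static long GetContentLength(string headerBlock)
    {
        string[] lines = headerBlock.Split(new[] { "\r\n" }, StringSplitOptions.None);
        foreach (string line in lines)
        {
            int colon = line.IndexOf(':');
            if (colon <= 0) continue; // start line or blank line

            string name = line.Substring(0, colon).Trim();
            if (!name.Equals("Content-Length", StringComparison.OrdinalIgnoreCase)) continue;

            // Trim the value before converting, as suggested in the answer.
            long value;
            if (long.TryParse(line.Substring(colon + 1).Trim(), out value) && value >= 0)
            {
                return value;
            }
            return -1;
        }
        return -1;
    }

    // True when the buffer holds the full headers plus at least
    // Content-Length bytes of body. False when framing is unknown.
    public static bool IsComplete(byte[] data, int length)
    {
        int bodyStart = FindBodyStart(data, length);
        if (bodyStart < 0) return false;

        string headers = Encoding.ASCII.GetString(data, 0, bodyStart);
        long contentLength = GetContentLength(headers);
        if (contentLength < 0) return false; // e.g. chunked: handle elsewhere

        return length - bodyStart >= contentLength;
    }
}
```

In a proxy loop like the one in the question, the bytes read so far would be appended to a buffer and IsComplete checked after each Read; only the header block is decoded as text, so the body can stay encoded or compressed.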