
Receiving data via TCP: MemoryStream contains more data than expected


I host a server which receives data from a remote TCP client (which I also control). Here is the method that handles incoming data:

private static async Task ReceiveDataFromRemoteSocket(
    Socket socket,
    int numBytesExpectedToReceive)
{
    int numBytesLeftToReceive = numBytesExpectedToReceive;

    using (MemoryStream memoryStream = new MemoryStream(numBytesExpectedToReceive))
    {
        byte[] dataBuffer = new byte[1024];

        ArraySegment<byte> dataBufferSegment = new ArraySegment<byte>(dataBuffer);          
        int totalBytesReceived = 0;

        while (numBytesLeftToReceive > 0)
        {
            Array.Clear(dataBuffer, 0, dataBuffer.Length);

            int numBytesReceived = await socket.ReceiveAsync(dataBufferSegment, SocketFlags.Partial);
            Console.WriteLine($"Received {numBytesReceived} bytes of data at {DateTime.UtcNow.ToShortTimeString()}.");

            totalBytesReceived += numBytesReceived;

            memoryStream.Write(
                dataBuffer,
                0,
                numBytesLeftToReceive < dataBuffer.Length ? numBytesLeftToReceive : dataBuffer.Length);
            numBytesLeftToReceive -= numBytesReceived;
        }
        Console.WriteLine($"Total number of bytes received, according to tally: {totalBytesReceived}.");
        Console.WriteLine($"Memory stream: Contains {memoryStream.Length} bytes' worth of data.");
    }
}

numBytesExpectedToReceive is information retrieved from the header.

Here is the output on my console:


Accepted connection request from XX.XX.XXX.XXX:56767 at 4/30/2019 10:39:11 AM.
Expecting to receive 41898 bytes' worth of data from XX.XX.XXX.XXX:56767.
Received 1024 bytes of data at 10:39 AM.
Received 1024 bytes of data at 10:39 AM.
Received 1024 bytes of data at 10:39 AM.
Received 1024 bytes of data at 10:39 AM.
Received 1024 bytes of data at 10:39 AM.
Received 1024 bytes of data at 10:39 AM.
Received 1024 bytes of data at 10:39 AM.
Received 416 bytes of data at 10:39 AM.
Received 1024 bytes of data at 10:39 AM.
Received 1024 bytes of data at 10:39 AM.
Received 1024 bytes of data at 10:39 AM.
Received 1024 bytes of data at 10:39 AM.
Received 1024 bytes of data at 10:39 AM.
Received 1024 bytes of data at 10:39 AM.
Received 1024 bytes of data at 10:39 AM.
Received 1024 bytes of data at 10:39 AM.
Received 1024 bytes of data at 10:39 AM.
Received 1024 bytes of data at 10:39 AM.
Received 1024 bytes of data at 10:39 AM.
Received 1024 bytes of data at 10:39 AM.
Received 1024 bytes of data at 10:39 AM.
Received 1024 bytes of data at 10:39 AM.
Received 1024 bytes of data at 10:39 AM.
Received 1024 bytes of data at 10:39 AM.
Received 1024 bytes of data at 10:39 AM.
Received 1024 bytes of data at 10:39 AM.
Received 1024 bytes of data at 10:39 AM.
Received 1024 bytes of data at 10:39 AM.
Received 1024 bytes of data at 10:39 AM.
Received 96 bytes of data at 10:39 AM.
Received 1024 bytes of data at 10:39 AM.
Received 1024 bytes of data at 10:39 AM.
Received 1024 bytes of data at 10:39 AM.
Received 1024 bytes of data at 10:39 AM.
Received 1024 bytes of data at 10:39 AM.
Received 1024 bytes of data at 10:39 AM.
Received 1024 bytes of data at 10:39 AM.
Received 1024 bytes of data at 10:39 AM.
Received 1024 bytes of data at 10:39 AM.
Received 1024 bytes of data at 10:39 AM.
Received 1024 bytes of data at 10:39 AM.
Received 1024 bytes of data at 10:39 AM.
Received 512 bytes of data at 10:39 AM.
Total number of bytes received, according to tally: 41984.
Memory stream: Contains 43434 bytes' worth of data.

As you can see, the memory stream contains 43434 bytes of data, even though I expect it to contain only 41984 bytes.

This causes a lot of issues. For example, if I construct a ZipArchive with new ZipArchive(memoryStream), I get an InvalidDataException, even though I know that my remote TCP client has sent a valid zip file.

  1. Why does the memory stream contain more bytes than actually received via TCP?
  2. How can I remove this "junk data" (for lack of a better term), so that I can successfully reconstruct the data that was sent to me, e.g. by passing the memory stream into the ZipArchive constructor?

Solution

  • The problem is where you write the data:

    memoryStream.Write(
        dataBuffer,
        0,
        numBytesLeftToReceive < dataBuffer.Length ? numBytesLeftToReceive : dataBuffer.Length);
    

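    You completely ignore the amount you actually received. Instead, you only check whether more data remains to be received than the buffer size, and if so you write the whole buffer.

    You can see in your output that sometimes you don’t receive a full buffer (416, 96 and 512 bytes), yet you still write the whole buffer. Your loop ran 43 times: the first 42 iterations each wrote 1024 bytes and the last wrote the remaining 426, which gives 42 × 1024 + 426 = 43434 bytes, exactly the stream length you report.

    Always write based on the amount you received. Don’t do any comparisons based on the length of the data: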

    memoryStream.Write(
        dataBuffer,
        0,
        numBytesReceived);
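
    A fuller version of the receive loop might look like the sketch below. It is only a sketch under stated assumptions, not the original poster's code: the class name SocketReceiveSketch and the helper name ReceiveExactlyAsync are hypothetical, SocketFlags.None is swapped in for SocketFlags.Partial, and it adds two guards that the fix above does not cover. The log in the question shows the last receive returning 512 bytes when only 41898 − 41472 = 426 were still expected, so the sketch caps each request at the number of bytes still owed. It also stops if ReceiveAsync returns 0, which signals that the peer closed the connection.

```csharp
using System;
using System.IO;
using System.Net.Sockets;
using System.Threading.Tasks;

public static class SocketReceiveSketch
{
    // Hypothetical helper (not part of the original post): reads exactly
    // numBytesExpected bytes from the socket and returns them as an array.
    public static async Task<byte[]> ReceiveExactlyAsync(Socket socket, int numBytesExpected)
    {
        byte[] dataBuffer = new byte[1024];
        int numBytesLeft = numBytesExpected;

        using (MemoryStream memoryStream = new MemoryStream(numBytesExpected))
        {
            while (numBytesLeft > 0)
            {
                // Never ask for more than is still owed, so bytes that belong
                // to whatever follows this message stay in the socket.
                int numBytesToRequest = Math.Min(numBytesLeft, dataBuffer.Length);
                ArraySegment<byte> segment =
                    new ArraySegment<byte>(dataBuffer, 0, numBytesToRequest);

                // Assumption: SocketFlags.None instead of the original SocketFlags.Partial.
                int numBytesReceived = await socket.ReceiveAsync(segment, SocketFlags.None);

                if (numBytesReceived == 0)
                {
                    // A return value of 0 means the remote side closed the connection.
                    throw new EndOfStreamException(
                        $"Connection closed with {numBytesLeft} bytes still outstanding.");
                }

                // Write only what was actually received in this iteration.
                memoryStream.Write(dataBuffer, 0, numBytesReceived);
                numBytesLeft -= numBytesReceived;
            }

            return memoryStream.ToArray();
        }
    }
}
```

    With the returned array, something like new ZipArchive(new MemoryStream(bytes)) would then see exactly the number of bytes announced in the header, with no padding from stale buffer contents.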