Tags: c#, httpwebrequest, binaryreader

BinaryReader reading different length of data depending on BufferSize


I am using an HttpWebRequest to request some online data from dmo.gov.uk. I read the response with a BinaryReader and write it to a MemoryStream. I have packaged the code into a simple test method:

public static byte[] Test(int bufferSize)
{
    var request = (HttpWebRequest)WebRequest.Create("http://www.dmo.gov.uk/xmlData.aspx?rptCode=D3B.2");
    request.Method = "GET";
    request.Credentials = CredentialCache.DefaultCredentials;

    var buffer = new byte[bufferSize];
    using (var httpResponse = (HttpWebResponse)request.GetResponse())
    {
        using (var ms = new MemoryStream())
        {
            using (var reader = new BinaryReader(httpResponse.GetResponseStream()))
            {
                int bytesRead;
                while ((bytesRead = reader.Read(buffer, 0, bufferSize)) > 0)
                {
                    ms.Write(buffer, 0, bytesRead);
                }
            }
            return ms.GetBuffer();
        }
    }
}

My real-life code usually uses a buffer size of 2048 bytes. Today I noticed that the resulting file has a large number of null bytes (\0) at the end, which bloats the file size. As a test I increased the buffer size to nearly the file size I expected (I was expecting ~80 KB, so I set the buffer size to 79000), and now I get the right file size. This confuses me: I expected to get the same file size regardless of the buffer size used to read the data.

The following test:

Console.WriteLine(Test(2048).Length);
Console.WriteLine(Test(79000).Length);
Console.ReadLine();

Yields the following output:

131072
81341

The second figure, obtained with the large buffer size, is the exact file size I was expecting. (The file changes daily, so expect that size to differ after today's date.) In the first result, every byte after the expected file size is \0.

What's going on here?


Solution

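  • You should change ms.GetBuffer(); to ms.ToArray();.

    GetBuffer returns the MemoryStream's underlying internal array, including any allocated capacity that has not been written to yet, so everything past ms.Length is \0. ToArray returns a copy containing only the bytes actually written. The buffer size used for reading does not change the data; it only changes how the stream's capacity grows. The figure 131072 (2^17) is consistent with the capacity doubling as the 2048-byte chunks were written, and with the larger buffer the capacity most likely just happened to end up equal to the data length.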
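
The difference can be reproduced without the network. The following is only a minimal sketch: the class name BufferDemo and the helper ReadAllBytes are hypothetical names introduced for illustration, and an in-memory array of 81341 zero bytes stands in for the HTTP response from the question.

```csharp
using System;
using System.IO;

public static class BufferDemo
{
    // Hypothetical helper: copies a stream chunk by chunk and returns
    // only the bytes that were actually written to the MemoryStream.
    public static byte[] ReadAllBytes(Stream source, int bufferSize)
    {
        var buffer = new byte[bufferSize];
        using (var ms = new MemoryStream())
        {
            int bytesRead;
            while ((bytesRead = source.Read(buffer, 0, buffer.Length)) > 0)
            {
                ms.Write(buffer, 0, bytesRead);
            }
            return ms.ToArray(); // exactly ms.Length bytes, no unused capacity
        }
    }

    public static void Main()
    {
        // Stand-in payload (assumption): same length as the file in the question.
        var payload = new byte[81341];

        using (var ms = new MemoryStream())
        {
            // Write the payload in 2048-byte chunks, as the original loop did.
            for (int offset = 0; offset < payload.Length; offset += 2048)
            {
                int count = Math.Min(2048, payload.Length - offset);
                ms.Write(payload, offset, count);
            }

            Console.WriteLine("Length:    " + ms.Length);             // bytes written
            Console.WriteLine("GetBuffer: " + ms.GetBuffer().Length); // allocated capacity
            Console.WriteLine("ToArray:   " + ms.ToArray().Length);   // bytes written
        }

        // The helper returns the same length whatever the chunk size.
        Console.WriteLine(ReadAllBytes(new MemoryStream(payload), 2048).Length);
        Console.WriteLine(ReadAllBytes(new MemoryStream(payload), 79000).Length);
    }
}
```

Length and ToArray().Length should both report 81341, while GetBuffer().Length reports the allocated capacity, which can be larger. If the extra copy made by ToArray is a concern, an alternative is to keep using GetBuffer but only read the first ms.Length bytes of it.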