While converting some older code to use async in C#, I started seeing differences between the values returned by the Read() and ReadAsync() methods of DeflateStream.
I thought that the transition from synchronous code like
bytesRead = deflateStream.Read(buffer, 0, uncompressedSize);
to its equivalent asynchronous version of
bytesRead = await deflateStream.ReadAsync(buffer, 0, uncompressedSize);
should always return the same value.
See the updated code added at the bottom of the question. It reads from the streams the correct way, which makes the initial question irrelevant.
I found that after a number of iterations this didn't hold true, and in my specific case it was causing random errors in the converted application.
Am I missing something here?
Below is a simple repro case (in a console app), where the Assert will break for me in the ReadAsync method on iteration #412, giving output that looks like this:
....
ReadAsync #410 - 2055 bytes read
ReadAsync #411 - 2055 bytes read
ReadAsync #412 - 453 bytes read
---- DEBUG ASSERTION FAILED ----
My question is: why is the DeflateStream.ReadAsync method returning 453 bytes at this point?
Note: this only happens with certain input strings. The massive StringBuilder stuff in CreateProblemDataString was the best way I could think of to construct the string for this post.
using System.IO;
using System.IO.Compression;
using System.Text;
using System.Threading.Tasks;

class Program
{
    static byte[] DataAsByteArray;
    static int uncompressedSize;

    static void Main(string[] args)
    {
        string problemDataString = CreateProblemDataString();
        DataAsByteArray = Encoding.ASCII.GetBytes(problemDataString);
        uncompressedSize = DataAsByteArray.Length;
        MemoryStream memoryStream = new MemoryStream();
        // write the same record 1000 times into one compressed stream
        using (DeflateStream deflateStream = new DeflateStream(memoryStream, CompressionMode.Compress, true))
        {
            for (int i = 0; i < 1000; i++)
            {
                deflateStream.Write(DataAsByteArray, 0, uncompressedSize);
            }
        }
        // now read it back synchronously
        Read(memoryStream);
        // now read it back asynchronously
        Task retval = ReadAsync(memoryStream);
        retval.Wait();
    }

    static void Read(MemoryStream memoryStream)
    {
        memoryStream.Position = 0;
        using (DeflateStream deflateStream = new DeflateStream(memoryStream, CompressionMode.Decompress, true))
        {
            byte[] buffer = new byte[uncompressedSize];
            int bytesRead = -1;
            int i = 0;
            while (bytesRead > 0 || bytesRead == -1)
            {
                bytesRead = deflateStream.Read(buffer, 0, uncompressedSize);
                System.Diagnostics.Debug.WriteLine("Read #{0} - {1} bytes read", i, bytesRead);
                // assumes every read fills the whole buffer (the assumption under test)
                System.Diagnostics.Debug.Assert(bytesRead == 0 || bytesRead == uncompressedSize);
                i++;
            }
        }
    }

    static async Task ReadAsync(MemoryStream memoryStream)
    {
        memoryStream.Position = 0;
        using (DeflateStream deflateStream = new DeflateStream(memoryStream, CompressionMode.Decompress, true))
        {
            byte[] buffer = new byte[uncompressedSize];
            int bytesRead = -1;
            int i = 0;
            while (bytesRead > 0 || bytesRead == -1)
            {
                bytesRead = await deflateStream.ReadAsync(buffer, 0, uncompressedSize);
                System.Diagnostics.Debug.WriteLine("ReadAsync #{0} - {1} bytes read", i, bytesRead);
                // assumes every read fills the whole buffer (the assumption under test)
                System.Diagnostics.Debug.Assert(bytesRead == 0 || bytesRead == uncompressedSize);
                i++;
            }
        }
    }

    /// <summary>
    /// This is one of the strings of data that was causing issues.
    /// </summary>
    /// <returns></returns>
    static string CreateProblemDataString()
    {
        StringBuilder sb = new StringBuilder();
        sb.Append("0601051081 ");
        sb.Append(" ");
        sb.Append(" 225021 0300420");
        sb.Append("34056064070072076361102 13115016017");
        sb.Append("5 192 230237260250 2722");
        sb.Append("73280296 326329332 34535535");
        sb.Append("7 3 ");
        sb.Append(" 4");
        sb.Append(" ");
        sb.Append(" 50");
        sb.Append("6020009 030034045 063071076 360102 13");
        sb.Append("1152176160170 208206 23023726025825027227328");
        sb.Append("2283285 320321333335341355357 622005009 0");
        sb.Append("34053 060070 361096 130151176174178172208");
        sb.Append("210198 235237257258256275276280290293 3293");
        sb.Append("30334 344348350 ");
        sb.Append(" ");
        sb.Append(" ");
        sb.Append(" ");
        sb.Append(" 225020012014 046042044034061");
        sb.Append("075078 361098 131152176160170 208195210 230");
        sb.Append("231260257258271272283306 331332336 3443483");
        sb.Append("54 29 ");
        sb.Append(" ");
        sb.Append(" 2");
        sb.Append("5 29 06 0");
        sb.Append("1 178 17");
        sb.Append("4 205 2");
        sb.Append("05 195 2");
        sb.Append("31 231 23");
        sb.Append("7 01 01 0");
        sb.Append("2 260 26");
        sb.Append("2 274 2");
        sb.Append("72 274 01 01 0");
        sb.Append("3 1 5 3 6 43 52 ");
        return sb.ToString();
    }
}
UPDATED CODE TO READ STREAMS INTO BUFFER CORRECTLY
Output now looks like this:
...
ReadAsync #410 - 2055 bytes read
ReadAsync #411 - 2055 bytes read
ReadAsync PARTIAL #412 - 453 bytes read, offset for next read = 453
ReadAsync #412 - 1602 bytes read
ReadAsync #413 - 2055 bytes read
...
static void Read(MemoryStream memoryStream)
{
    memoryStream.Position = 0;
    using (DeflateStream deflateStream = new DeflateStream(memoryStream, CompressionMode.Decompress, true))
    {
        byte[] buffer = new byte[uncompressedSize]; // buffer to hold known fixed size record.
        int bytesRead; // number of bytes read from Read operation
        int offset = 0; // offset for writing into buffer
        int i = -1; // counter to track iteration #
        while ((bytesRead = deflateStream.Read(buffer, offset, uncompressedSize - offset)) > 0)
        {
            offset += bytesRead; // offset in buffer for results of next reading
            System.Diagnostics.Debug.Assert(offset <= uncompressedSize, "should never happen - because would mean more bytes read than requested.");
            if (offset == uncompressedSize) // buffer full, complete fixed size record in buffer.
            {
                offset = 0; // buffer is now filled, next read to start at beginning of buffer again.
                i++; // increment counter that tracks iteration #
                System.Diagnostics.Debug.WriteLine("Read #{0} - {1} bytes read", i, bytesRead);
            }
            else // buffer still not full
            {
                System.Diagnostics.Debug.WriteLine("Read PARTIAL #{0} - {1} bytes read, offset for next read = {2}", i+1, bytesRead, offset);
            }
        }
    }
}

static async Task ReadAsync(MemoryStream memoryStream)
{
    memoryStream.Position = 0;
    using (DeflateStream deflateStream = new DeflateStream(memoryStream, CompressionMode.Decompress, true))
    {
        byte[] buffer = new byte[uncompressedSize]; // buffer to hold known fixed size record.
        int bytesRead; // number of bytes read from Read operation
        int offset = 0; // offset for writing into buffer
        int i = -1; // counter to track iteration #
        while ((bytesRead = await deflateStream.ReadAsync(buffer, offset, uncompressedSize - offset)) > 0)
        {
            offset += bytesRead; // offset in buffer for results of next reading
            System.Diagnostics.Debug.Assert(offset <= uncompressedSize, "should never happen - because would mean more bytes read than requested.");
            if (offset == uncompressedSize) // buffer full, complete fixed size record in buffer.
            {
                offset = 0; // buffer is now filled, next read to start at beginning of buffer again.
                i++; // increment counter that tracks iteration #
                System.Diagnostics.Debug.WriteLine("ReadAsync #{0} - {1} bytes read", i, bytesRead);
            }
            else // buffer still not full
            {
                System.Diagnostics.Debug.WriteLine("ReadAsync PARTIAL #{0} - {1} bytes read, offset for next read = {2}", i+1, bytesRead, offset);
            }
        }
    }
}
Damien's comments are exactly correct. But, your mistake is a common enough one and IMHO the question deserves an actual answer, if for no other reason than to help others who make the same mistake more easily find the answer to the question.
So, to be clear:
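As is true for all of the stream-oriented I/O methods in .NET where one provides a byte[] buffer and the number of bytes read is returned by the method, the only assumptions you can make about the number of bytes are that it will not be larger than the count you asked for, and that (for a non-zero count) it will be greater than zero as long as there is data remaining to be read, with zero returned only once the end of the stream has been reached.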
When reading using any of these methods, you cannot even count on the same method always returning the same number of bytes (depending on context…obviously in some cases, this is in fact deterministic, but you should still not rely on that), and there is no guarantee of any sort that different methods, even those which are reading from the same source, will always return the same number of bytes as some other method.
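One way to check that calling code does not depend on read sizes is to run it against a stream that deliberately returns short reads. The following is only a sketch for testing purposes: `ShortReadStream` and its `maxChunk` parameter are hypothetical names that do not appear in the original post or in .NET.

```csharp
using System;
using System.IO;

// Hypothetical test helper (not part of .NET or the original post):
// wraps another stream and never returns more than maxChunk bytes per Read call.
sealed class ShortReadStream : Stream
{
    private readonly Stream inner;
    private readonly int maxChunk;

    public ShortReadStream(Stream inner, int maxChunk)
    {
        if (inner == null) throw new ArgumentNullException(nameof(inner));
        if (maxChunk <= 0) throw new ArgumentOutOfRangeException(nameof(maxChunk));
        this.inner = inner;
        this.maxChunk = maxChunk;
    }

    // Cap the requested count so callers see partial reads even when more data is available.
    public override int Read(byte[] buffer, int offset, int count)
    {
        return inner.Read(buffer, offset, Math.Min(count, maxChunk));
    }

    // Read-only, forward-only stream: everything else is unsupported.
    public override bool CanRead => true;
    public override bool CanSeek => false;
    public override bool CanWrite => false;
    public override long Length => throw new NotSupportedException();
    public override long Position
    {
        get => throw new NotSupportedException();
        set => throw new NotSupportedException();
    }
    public override void Flush() { }
    public override long Seek(long offset, SeekOrigin origin) => throw new NotSupportedException();
    public override void SetLength(long value) => throw new NotSupportedException();
    public override void Write(byte[] buffer, int offset, int count) => throw new NotSupportedException();
}
```

Code that assumes a full buffer on every read will typically fail quickly when given `new ShortReadStream(source, 3)`, while a loop that honors the return value keeps working.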
It is up to the caller to read the bytes as a stream, taking into account the return value specifying the number of bytes read for each call, and reassembling those bytes in whatever manner is appropriate for that particular stream of bytes.
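The reassembly described above can be wrapped in a small helper for fixed-size records such as the ones in the question. This is a minimal sketch, not code from the original post: `StreamHelpers` and `FillBufferAsync` are hypothetical names chosen for illustration.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

// Hypothetical helper, not from the original post.
static class StreamHelpers
{
    // Keeps reading until 'count' bytes are in 'buffer' or the stream ends.
    // Returns the total number of bytes read, which is less than 'count'
    // only if the end of the stream was reached first.
    public static async Task<int> FillBufferAsync(Stream stream, byte[] buffer, int count)
    {
        int offset = 0;
        while (offset < count)
        {
            int bytesRead = await stream.ReadAsync(buffer, offset, count - offset);
            if (bytesRead == 0)
            {
                break; // end of stream
            }
            offset += bytesRead; // next read continues where this one stopped
        }
        return offset;
    }
}
```

A caller could then loop with `while (await StreamHelpers.FillBufferAsync(deflateStream, buffer, uncompressedSize) == uncompressedSize)` and treat a non-zero short result as a truncated final record. On .NET 7 and later, the built-in `Stream.ReadExactly` and `Stream.ReadExactlyAsync` methods should cover the same need; they throw `EndOfStreamException` instead of returning a short count.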
Note that when dealing with Stream objects, you can use the Stream.CopyTo() method. Of course, it only copies to another Stream object. But in many cases, the destination object can be used without treating it as a Stream. E.g. you just want to write the data as a file, or you want to copy it to a MemoryStream and then use the MemoryStream.ToArray() method to turn that into an array of bytes (which you can then access without any concern about how many bytes have been read in a given read operation…by the time you get to the array, all of them have been read :) ).
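As a rough illustration of the CopyTo() approach, the sketch below decompresses an entire stream into a byte array. `DeflateHelpers` and `DecompressAll` are hypothetical names, and the sketch assumes the whole uncompressed payload fits comfortably in memory.

```csharp
using System.IO;
using System.IO.Compression;

// Hypothetical helper, not from the original post.
static class DeflateHelpers
{
    // Decompresses everything from 'compressed' (starting at its current position)
    // and returns it as a single byte array.
    public static byte[] DecompressAll(Stream compressed)
    {
        using (var deflateStream = new DeflateStream(compressed, CompressionMode.Decompress, leaveOpen: true))
        using (var output = new MemoryStream())
        {
            // CopyTo loops internally until the source reports end of stream,
            // so partial reads are handled for us.
            deflateStream.CopyTo(output);
            return output.ToArray();
        }
    }
}
```

With the code from the question, this would be called as `byte[] all = DeflateHelpers.DecompressAll(memoryStream);` after resetting `memoryStream.Position` to 0, and each fixed-size record could then be sliced out of `all` by index. In async code, `CopyToAsync` can be awaited in place of `CopyTo`.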