First of all, I understand that I can solve this issue in different ways. I guess the issue exists only because I am using different methods incorrectly. But I want to find out what exactly happened in my example.
I was using a StreamReader to read a file. To get the bytes from it, I decided to use BaseStream.Read:
    int length = (int)reader.BaseStream.Length;
    byte[] file = new byte[length];
    while (!reader.EndOfStream)
    {
        int readBytes = reader.BaseStream.Read(file, 0,
            (length - offset) > bufferSize ? bufferSize : (length - offset));
        for (int i = 0; i < readBytes; i++)
        {
            ...
        }
        offset += readBytes;
    }
BaseStream.Read fails to return the last 1024 bytes when the StreamReader.EndOfStream property is used before reading. Later I found information that EndOfStream tries to read 1 byte, but in fact it reads 1024 bytes for performance reasons. Apparently this 1 KB then becomes impossible to reach.
EDIT: If I remove the reader.EndOfStream property from the code, reader.BaseStream.Read works correctly. That was the main point of the question.
Again, I understand that this code example is absolutely inefficient. I'm just trying to understand how streams work in this example: does the issue exist only because of bad code, or does StreamReader.BaseStream itself have some issues? Thanks in advance.
It is not that StreamReader.BaseStream has some issues; the problem is in your code, which works directly with the Stream wrapped inside the StreamReader while also using the reader itself.
From MSDN about StreamReader.DiscardBufferedData:
You need to call this method only when the position of the internal buffer and the BaseStream do not match. These positions can become mismatched when you read data into the buffer and then seek a new position in the underlying stream.
That means, in your case, that when the Stream has already reached its end position, the position of the StreamReader internal buffer still holds the value from before you read the underlying stream directly, so reader.EndOfStream is still false. That is why you cannot finish the loop.
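The mismatch between the reader's buffer and the underlying stream can be observed directly. The following is a minimal sketch, not part of the original code: the class name EndOfStreamDemo, the helper ProbePositions and the temporary test file are assumptions made for illustration. It only records BaseStream.Position before and after EndOfStream is touched; the exact number of bytes consumed depends on the reader's buffer size (1024 in the case described here).

```csharp
using System;
using System.IO;

static class EndOfStreamDemo
{
    // Hypothetical helper: returns BaseStream.Position before and after
    // the EndOfStream property is evaluated on a fresh reader.
    public static long[] ProbePositions(string path)
    {
        using (StreamReader reader = new StreamReader(path))
        {
            long before = reader.BaseStream.Position; // nothing read yet
            bool atEnd = reader.EndOfStream;          // forces the reader to fill its internal buffer
            long after = reader.BaseStream.Position;  // the underlying stream has moved forward
            return new long[] { before, after, atEnd ? 1 : 0 };
        }
    }

    static void Main()
    {
        // Assumption: a throwaway 5000-byte file stands in for the real input.
        string path = Path.GetTempFileName();
        byte[] data = new byte[5000];
        for (int i = 0; i < data.Length; i++) data[i] = (byte)'a';
        File.WriteAllBytes(path, data);

        long[] result = ProbePositions(path);
        Console.WriteLine("Position before EndOfStream: " + result[0]);
        Console.WriteLine("Position after EndOfStream:  " + result[1]);

        File.Delete(path);
    }
}
```

Any bytes between the two positions now live only in the reader's buffer, so a later reader.BaseStream.Read call starts after them.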
Edit:
I think you are missing something. Here is some code to show that the end of the file is actually reached. Run it and you will see that your app repeatedly says: I'm at the end of the file!
    using System;
    using System.IO;
    using System.Text;
    using System.Threading;

    class Program
    {
        static void Main()
        {
            using (StreamReader reader = new StreamReader(@"yourFile"))
            {
                int offset = 0;
                int bufferSize = 102400;
                int length = (int)reader.BaseStream.Length;
                byte[] file = new byte[length];
                while (!reader.EndOfStream)
                {
                    // Added these two lines: show where the underlying stream is
                    // after EndOfStream has been checked, then wait for Enter.
                    Console.WriteLine(reader.BaseStream.Position);
                    Console.ReadLine();
                    int readBytes = reader.BaseStream.Read(file, 0,
                        (length - offset) > bufferSize ? bufferSize : (length - offset));
                    string str = Encoding.UTF8.GetString(file, 0, readBytes);
                    offset += readBytes;
                    // The underlying stream is at its end, yet the loop keeps running.
                    if (reader.BaseStream.Position == length)
                    {
                        Console.WriteLine("I'm at the end of the file! Current Tickcount: " + Environment.TickCount);
                        Thread.Sleep(100);
                    }
                }
            }
        }
    }
Edit 2
But still, offset and length should be equal; in my case length - offset = 1024 (for files bigger than 1 KB). Maybe I'm doing something wrong, but if I use files smaller than 1 KB, readBytes always equals 0.
That is because on your first call to while (!reader.EndOfStream), the reader has to read from the file (in this case 1024 bytes, into its internal buffer) to determine whether the file has ended (see the two lines of code I added above). After that read, the file position has advanced by 1024 bytes, which is why length - offset = 1024. If your file is smaller than 1 KB, this first call already moves the position to the end of the file. This is where you lose data.
On the second call it does not read again, because you never sent a read request to the reader itself. It considers its buffer unchanged, so it does not need to read the file again to check for the end of the file. That is why the second call does not lose data.
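For comparison, here is a minimal sketch of reading the bytes without involving EndOfStream at all, so nothing is pulled into a reader buffer behind your back. The class name RawReadSketch, the helper ReadAllChunks and the temporary test file are assumptions made for illustration, not part of the original code. Unlike the loop in the question, it writes each chunk at offset instead of index 0, and it uses the number of bytes read as the stop condition.

```csharp
using System;
using System.IO;

static class RawReadSketch
{
    // Hypothetical helper: reads the whole file through a plain Stream
    // in chunks of at most bufferSize bytes.
    public static byte[] ReadAllChunks(string path, int bufferSize)
    {
        using (Stream stream = File.OpenRead(path))
        {
            int length = (int)stream.Length;
            byte[] file = new byte[length];
            int offset = 0;
            while (offset < length)
            {
                int toRead = Math.Min(bufferSize, length - offset);
                int readBytes = stream.Read(file, offset, toRead);
                if (readBytes == 0)
                {
                    break; // the stream ended earlier than Length suggested
                }
                offset += readBytes;
            }
            return file;
        }
    }

    static void Main()
    {
        // Assumption: a throwaway 3000-byte file stands in for the real input.
        string path = Path.GetTempFileName();
        byte[] data = new byte[3000];
        for (int i = 0; i < data.Length; i++) data[i] = (byte)(i % 251);
        File.WriteAllBytes(path, data);

        byte[] result = ReadAllChunks(path, 1024);
        Console.WriteLine("Read " + result.Length + " bytes");

        File.Delete(path);
    }
}
```

If the StreamReader is still needed for text afterwards, the MSDN note quoted above applies: after moving the underlying stream yourself, call DiscardBufferedData so the reader's buffer and the BaseStream position match again.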