As the title says, I found a problem. A little backstory first:
We have file.txt, which looks like this:
aaaabb
ccccddd
eeeefffffff
There are many ways to read this text line-by-line, one of which is this:
StreamReader sr = new StreamReader("file.txt");
while (!sr.EndOfStream)
{
    string s = sr.ReadLine();
}
sr.Close();
Works. s gets each line.
Now I need the first 4 letters as bytes and the rest as string. After looking up things and experimenting a little, I found that the easiest way is this:
FileStream fs = new FileStream("file.txt", FileMode.Open);
StreamReader sr = new StreamReader(fs);
byte[] arr = new byte[4];
fs.Read(arr, 0, 4);
string s = sr.ReadLine();
sr.Close();
fs.Close();
Works. arr contains the first 4 letters as bytes and the rest of the line is saved in s. This is only a single line. If we add the while:
FileStream fs = new FileStream("file.txt", FileMode.Open);
StreamReader sr = new StreamReader(fs);
while (!sr.EndOfStream)
{
    byte[] arr = new byte[4];
    fs.Read(arr, 0, 4);
    string s = sr.ReadLine();
}
sr.Close();
fs.Close();
Now there's a problem. arr doesn't get anything, and s reads the whole line, including the first 4 letters. Even stranger, if I use while(true) (and, I assume, any condition other than the one in the example), then it works as intended: 4 characters as bytes and the rest as a string, and this is the same for every line.
My question is: what am I missing? Why is this happening, and how do I solve it? Or is it possible that this is a bug?
The problem here is simple buffering. When you attach your StreamReader to the FileStream, it ends up consuming a block from the file, thus advancing the current Position of the FileStream. With your example file and the default buffer size, once the StreamReader attaches itself, it basically consumes the entire file into a buffer, leaving the FileStream at its EOF. When you then attempt to read 4 bytes from the FileStream directly via your fs reference, there's nothing left to consume. The following ReadLine works on your sr reference, as that's reading from the buffered file content.
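To see the buffering for yourself, a minimal sketch along these lines prints the FileStream position before and after the first EndOfStream check. It assumes the three-line sample file from the question with LF line endings; the class name BufferingDemo is only illustrative.

```csharp
using System;
using System.IO;

class BufferingDemo
{
    static void Main()
    {
        // Recreate the sample file from the question (assumption: LF line endings).
        File.WriteAllText("file.txt", "aaaabb\nccccddd\neeeefffffff\n");

        using (FileStream fs = new FileStream("file.txt", FileMode.Open))
        using (StreamReader sr = new StreamReader(fs))
        {
            Console.WriteLine("Position before: " + fs.Position);

            // Checking EndOfStream makes the reader fill its internal buffer,
            // which advances the underlying FileStream.
            Console.WriteLine("EndOfStream: " + sr.EndOfStream);
            Console.WriteLine("Position after: " + fs.Position + " of " + fs.Length);

            // The file is smaller than the reader's buffer, so the FileStream
            // should already be at EOF and this read should return 0 bytes.
            byte[] arr = new byte[4];
            int bytesRead = fs.Read(arr, 0, 4);
            Console.WriteLine("Bytes read directly from fs: " + bytesRead);

            // ReadLine is served from the reader's buffer, so it still sees
            // the whole first line, including its first 4 characters.
            Console.WriteLine("ReadLine: " + sr.ReadLine());
        }
    }
}
```

With a file larger than the reader's buffer, the position after the check would land somewhere in the middle of the file rather than at EOF, but mixing direct fs reads with sr reads would still give unpredictable results.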
Here's a step-by-step breakdown of what's happening:
1. fs opens the file and sits at Position 0.
2. sr wraps fs, and the call to EndOfStream ends up consuming (in this case) 27 bytes into its internal buffer. At this point, the fs Position sits at EOF.
3. You attempt to read 4 bytes from fs directly, but it's at EOF with no more bytes.
4. sr.ReadLine reads from the buffer it built up in step #2 and all works well.
To fix your specific error case, you could change your byte array to a char array and use sr.Read instead, i.e.:
char[] arr = new char[4];
sr.Read(arr, 0, 4);
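Dropped into the loop from the question, that suggestion might look like the sketch below. It is not the only way to do this, and it rests on a few assumptions that are not in the original: the file is plain ASCII, every line has at least 4 characters, and you still want the prefix as a byte array (hence the Encoding.ASCII.GetBytes conversion). The class name ReadPrefixAndRest is hypothetical.

```csharp
using System;
using System.IO;
using System.Text;

class ReadPrefixAndRest
{
    static void Main()
    {
        // Recreate the sample file from the question (assumption: LF line endings).
        File.WriteAllText("file.txt", "aaaabb\nccccddd\neeeefffffff\n");

        // All reads go through the StreamReader, so its buffer and the
        // logical read position never get out of sync.
        using (StreamReader sr = new StreamReader("file.txt", Encoding.ASCII))
        {
            char[] prefix = new char[4];
            while (!sr.EndOfStream)
            {
                // Read the first 4 characters of the line from the reader.
                int charsRead = sr.Read(prefix, 0, 4);

                // Assumption: ASCII content, so each character maps to one byte.
                byte[] arr = Encoding.ASCII.GetBytes(prefix, 0, charsRead);

                // The rest of the line, without the 4-character prefix.
                string s = sr.ReadLine();

                Console.WriteLine(BitConverter.ToString(arr) + " | " + s);
            }
        }
    }
}
```

If a line can be shorter than 4 characters, the Read call would run past the line break into the next line, so in that case it is safer to call ReadLine first and split the resulting string.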