How Can I Detect That a Binary File Has Been Completely Consumed?


If I do this:

ofstream output("foo.txt");

output << 13;
output.close();

ifstream input("foo.txt");
int dummy;

input >> dummy;

cout << input.good() << endl;

I'll get the result: "0"

However if I do this:

ofstream output("foo.txt", ios_base::binary);
auto dummy = 13;

output.write(reinterpret_cast<const char*>(&dummy), sizeof(dummy));
output.close();

ifstream input("foo.txt", ios_base::binary);

input.read(reinterpret_cast<char*>(&dummy), sizeof(dummy));
cout << input.good() << endl;

I'll get the result: "1"

This is frustrating to me. Do I have to resort to inspecting the ifstream's buffer to determine whether it has been entirely consumed?


Solution

  • You do not need to resort to inspecting the buffer. You can determine whether the whole file has been consumed with:

    cout << (input.peek() != char_traits<char>::eof()) << endl;

    This prints 0 once the file has been entirely consumed. It uses peek, which:

    Reads the next character from the input stream without extracting it

    In the two examples, good is:

    • Returning false after the formatted extraction, because the int extraction operator has to keep reading until it finds a character that is not a digit. Here it reaches the end of the file instead, and attempting to read past the last character sets the stream's eofbit, so good returns false
    • Returning true after calling read, because read extracts exactly sizeof(int) bytes and never attempts to read past the end, leaving the stream's eofbit unset, so good returns true
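The two cases above can be reproduced without touching the file system. The following is only a sketch: it assumes std::istringstream behaves like ifstream as far as eofbit is concerned, and the helper names (good_after_extraction, good_after_read, check_question_results) are made up for illustration.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: formatted extraction, as in `input >> dummy`.
bool good_after_extraction(std::istream& input) {
    int dummy = 0;
    input >> dummy;  // has to look past the last digit to know the number ended
    return input.good();
}

// Hypothetical helper: unformatted read of exactly sizeof(int) bytes.
bool good_after_read(std::istream& input) {
    int dummy = 0;
    input.read(reinterpret_cast<char*>(&dummy), sizeof(dummy));
    return input.good();
}

// Reproduces the "0" and "1" results from the question.
void check_question_results() {
    std::istringstream text("13");
    assert(!good_after_extraction(text));  // eofbit set while scanning for more digits
    assert(text.eof());

    std::istringstream binary(std::string(sizeof(int), '\0'));
    assert(good_after_read(binary));       // nothing tried to read past the end
    assert(!binary.eof());
}
```

In both cases every byte has been consumed; only the stream's knowledge of that fact differs.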

    peek can be used after either of these and will correctly return char_traits<char>::eof() in both cases. Effectively this is inspecting the buffer for you, but with one vital distinction for binary files: if you were to inspect a binary file yourself, you'd find that it may contain bytes that look like EOF once they are held in a char. (EOF is typically defined as -1, which truncates to 0xFF in a char; with a 4-byte int, the binary representation of -1 consists of four such bytes.) If you only inspect the buffer's next char, you can't tell whether that's actually the end of the file or not.

    peek doesn't just return a char though; it returns an int_type. If peek returns 0x000000FF, you're looking at a byte with the value 0xFF, not the end of the file. If peek returns char_traits<char>::eof() (typically -1, which is 0xFFFFFFFF as a 32-bit int), you're looking at the end of the file.
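As a rough illustration of that distinction, the sketch below wraps the peek test in a helper and uses it to drain a stream of raw ints. The names at_end, read_all_ints and check_ff_is_not_eof are hypothetical, and std::istringstream stands in for a binary ifstream. Note that peek also returns eof() when the stream is already in a failed state, so at_end really means "nothing more can be read".

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: true when no further character can be read.
bool at_end(std::istream& input) {
    return input.peek() == std::char_traits<char>::eof();
}

// Hypothetical helper: reads raw ints until the stream is exhausted.
// Bytes with the value 0xFF inside the data do not stop the loop.
std::vector<int> read_all_ints(std::istream& input) {
    std::vector<int> values;
    int value = 0;
    while (!at_end(input) &&
           input.read(reinterpret_cast<char*>(&value), sizeof(value))) {
        values.push_back(value);
    }
    return values;
}

// A single 0xFF byte is reported as 0xFF, not as eof().
void check_ff_is_not_eof() {
    std::istringstream input(std::string(1, '\xFF'));
    assert(input.peek() == 0xFF);  // the byte, widened to int_type
    assert(!at_end(input));
    input.get();                   // consume the byte
    assert(at_end(input));         // now peek() yields eof()
}
```

With a file, the same loop could be written as read_all_ints(input) on an ifstream opened with ios_base::binary; a stream holding the ints -1 and 13 would yield both values even though the first consists entirely of 0xFF bytes.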