c++ string ifstream

Is it better to read an entire file into a std::string or to manipulate the file with std::ifstream?


I am currently developing scientific C++ simulation programs that read data, compute many values from them, and finally store the results in a file. I would like to know whether reading all the data at once at the beginning of the program is faster than repeatedly accessing the file via std::ifstream while the program runs.

The data I am using are not very big (several MB), but I do not even know what "big" is for a heap allocation...

I guess it depends on the data and so on (and after some tests, it does indeed depend), but I was wondering what it depends on and whether there is a general principle we should follow.

Long story short, the question is: is keeping a file open and using file operations faster than a potentially big heap allocation followed by string operations?


Solution

  • Is reading all the data at once at the beginning of the program faster than repeatedly accessing the file via std::ifstream during the program? Yes, it probably is. Keep in mind that working memory (RAM) is fast but expensive, while storage (a hard drive, for example) trades speed for low cost.

    What is "big" for a heap allocation? The operating system gives each process the illusion that all of the machine's working memory is available to it. That is not actually true: if processes request more memory than physically exists, the OS "swaps" part of it out to storage, which is much slower. As a rule of thumb, consider a heap allocation big when its size is comparable to the total working memory of the machine; by that measure, a few MB is small on most current machines.

    Is keeping a file open and using file operations faster than a potentially big heap allocation followed by string operations? No, it is not faster, but it has another advantage: it is memory-efficient. If you load only the data you currently need into memory, you leave more memory for the other processes on the machine and for the rest of your own program (its other threads, for instance). This property matters a great deal if you want your software to scale.
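
To make the two options concrete, here is a minimal sketch of both. It assumes the input is plain text holding whitespace-separated numbers, which the question does not state. The function names (read_whole_file, sum_from_string, sum_from_stream) are made up for illustration, and summing the values only stands in for the real computation.

```cpp
#include <fstream>
#include <sstream>
#include <stdexcept>
#include <string>

// Option 1: load the whole file into a std::string in one go.
// Memory use grows with the file size, but the file is read only once.
std::string read_whole_file(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    if (!in) {
        throw std::runtime_error("cannot open " + path);
    }
    std::ostringstream buffer;
    buffer << in.rdbuf();  // copy the entire stream buffer into memory
    return buffer.str();
}

// Parse numbers from the in-memory copy (illustrative workload: a sum).
double sum_from_string(const std::string& text) {
    std::istringstream in(text);
    double value = 0.0;
    double total = 0.0;
    while (in >> value) {
        total += value;
    }
    return total;
}

// Option 2: keep the file open and parse directly from the std::ifstream.
// Only the stream's internal buffer is held in memory at any time.
double sum_from_stream(const std::string& path) {
    std::ifstream in(path);
    if (!in) {
        throw std::runtime_error("cannot open " + path);
    }
    double value = 0.0;
    double total = 0.0;
    while (in >> value) {
        total += value;
    }
    return total;
}
```

Calling sum_from_string(read_whole_file(path)) and sum_from_stream(path) on the same file should give the same result, so timing the two calls on your own data (with std::chrono, for instance) is a direct way to see which one is faster in your case.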