I have a load(..) method that loads a file's content into a std::wstring. It usually processes quite big files (up to a few MB) and I use it extensively, so I'm looking for optimization possibilities (without breaking the simplicity of "load the file content into a string" and without adding dependencies on other libraries).
Should I use move semantics here (I'm not very familiar with them)? Or is the way I've written it already close to time-optimal because of the return value optimization the compiler will perform?
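inline static std::wstring load(std::wstring filePath) {
    std::wifstream file(filePath.c_str());
    if (file) {
        std::wstring fileString;
        // Seek to the end first so tellg() reports the file size rather than 0.
        file.seekg(0, std::ios::end);
        fileString.reserve((size_t)file.tellg());
        file.seekg(0);
        // Test the result of get() so the end-of-file marker is never appended.
        wchar_t ch;
        while (file.get(ch)) {
            fileString += ch;
        }
        file.close();
        return fileString;
    }
    file.close();
    ERROR_HANDLE(L"File could not be opened:\n" + filePath);
    return L"";
}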
The code as implemented allocates new memory for the string holding the file's content on every call. For strings of the size quoted, the allocation overhead itself is likely to be negligible. However, there is a fair chance that the freshly allocated memory is not in any cache yet, so accessing it may be relatively expensive: data that is already cached has to be evicted to make room. Assuming the code calling load() essentially just processes the content of the string, reusing the same string across calls may have a performance advantage. A corresponding implementation could look like this:
inline bool load(std::string const& path, std::wstring& content) {
    std::wifstream in(path.c_str());
    if (in) {
        content.assign(std::istreambuf_iterator<wchar_t>(in),
                       std::istreambuf_iterator<wchar_t>());
        return true;
    }
    else {
        return false;
    }
}
There shouldn't be any need to reserve() memory, as content should quickly settle on a capacity sufficient for the files being processed.
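To illustrate the reuse idea, here is a minimal sketch of a caller that passes one buffer to load() for a whole batch of files. The helper count_characters() and its parameters are hypothetical names introduced only for this example. Whether later calls avoid allocating entirely depends on the standard library implementation, so treat it as a starting point for measurement rather than a guarantee.

```cpp
#include <cstddef>
#include <cstdio>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Same shape as the load() overload above: fills a caller-owned string.
inline bool load(std::string const& path, std::wstring& content) {
    std::wifstream in(path.c_str());
    if (!in) {
        return false;
    }
    content.assign(std::istreambuf_iterator<wchar_t>(in),
                   std::istreambuf_iterator<wchar_t>());
    return true;
}

// Hypothetical helper: loads each file into the same buffer and returns the
// total number of characters read. Files that cannot be opened are skipped.
inline std::size_t count_characters(std::vector<std::string> const& paths,
                                    std::wstring& buffer) {
    std::size_t total = 0;
    for (std::string const& path : paths) {
        if (load(path, buffer)) {
            total += buffer.size();  // stand-in for the real processing step
        }
    }
    return total;
}
```

The caller owns buffer and keeps it alive between calls, so the string object and, typically, its storage are reused instead of being created afresh for every file.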
Sticking with the original interface, i.e., returning a string, is cheap anyway: when returning a temporary object or a named local variable, the move constructor is used rather than the copy constructor. Ideally the move is avoided too, by writing the function so that it qualifies for copy elision, for example by returning the same named object on every path. Copy elision normally happens even with compilers that don't implement rvalue references and for classes without move constructors. I'd implement your function like this:
inline std::wstring load(const std::string& path) {
    std::wifstream in(path.c_str());
    std::wstring result;
    if (in) {
        // possibly result.reserve() capacity
        result.assign(std::istreambuf_iterator<wchar_t>(in),
                      std::istreambuf_iterator<wchar_t>());
    }
    return result;
}
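If you want to check what your compiler does with a function shaped like this, a small tracer type can count copies and moves. The sketch below is an illustration only: Tracer and make() are hypothetical names, and make() merely mirrors the "one named local, returned on every path" structure of load(). Named return value optimization is permitted but not required, so the move count may be 0 or more depending on compiler and flags; the point is that no copy takes place.

```cpp
#include <string>
#include <utility>

// Hypothetical tracer type: counts how often it is copy- or move-constructed.
struct Tracer {
    static int copies;
    static int moves;
    std::wstring data;

    Tracer() = default;
    Tracer(Tracer const& other) : data(other.data) { ++copies; }
    Tracer(Tracer&& other) noexcept : data(std::move(other.data)) { ++moves; }
};
int Tracer::copies = 0;
int Tracer::moves = 0;

// Mirrors the structure of load(): a single named local that is returned
// on every path, which makes the function a candidate for copy elision.
inline Tracer make(bool fill) {
    Tracer result;
    if (fill) {
        result.data.assign(1000, L'x');
    }
    return result;
}
```

After `Tracer t = make(true);`, Tracer::copies stays at 0. With elision applied, Tracer::moves stays at 0 as well; otherwise the returned object is moved, which for a string is still cheap.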