I want to write a function equivalent to the Linux tail -n command in C++. While, I parsed over the data of that file line-by-line thereby incrementing the line count, if the file size gets really big(~gigabytes), this method will take a lot of time! Is there a better approach or a data structure to implement this function?
Here are my 2 methods:
int File::countlines()
{
int lineCount = 0;
string str;
if (file)
{
while (getline(file, str))
{
lineCount += 1;
}
}
return lineCount;
}
void File::printlines()
{
int lineCount = 0;
string line;
if (file)
{
lineCount = countlines();
file.clear();
file.seekg(ios::beg);
if (lineCount <= 10)
{
while (getline(file, line))
{
cout << line << endl;
}
}
else
{
int position = lineCount - 10;
while (position--)
{
getline(file, line);
}
while (getline(file, line))
{
cout << line << endl;
}
}
}
}
This method is time consuming if the file size increases, so I want to either replace it with another data structure, or write a more efficient code.
One of the things that is slowing down your program is reading the file twice, so you could keep the last n EOL positions (n=10 in your program) and the most convenient data structure is a circular buffer but this isn't provided by the standard library as far as I know (boost has one). It can be implemented by an std::vector with size n, with an index where a modulo of n is done after incrementing.
With that circular buffer, you can jump immediately to the lowest offset (next one if buffer is full) in the file and print the needed lines.