I need to find the last timestamp in a very large log file with an unknown number of lines before I reach a line with a timestamp. I read the file backwards one line at a time, which is usually very quick except for one case. Sometimes, I will run into a very large block (thousands of lines) with a known repeating pattern (one entry shown below) and no timestamps:
goal_tolerance[0]:
name: joint_b
position: -1
velocity: -1
acceleration: -1
Since this is the only case where I have this kind of problem, I can just throw a piece of code into the loop that checks for it before searching the log line by line.
The number after goal_tolerance
is a counter, going up 1 each time the pattern repeats, so what I would like to do is use that number to calculate the beginning of the pattern. What I have now looks something like this:
if ' goal_tolerance' in line:
gtolnum = line[17:-3]
print gtolnum
startFrom = currentPosition - ((long(gtolnum) + 1) * 95)
break
However, this does not take into account the number of characters in the counter, so I end up running through the search loop several more times than necessary. Is there a fast way to include those characters in the calculation?
EDIT: I do not read the entire file to get to that point, since it is large and I have several hundred timestamps to search for in several hundred log files. My search function seeks to a position in the text file, then finds the beginning of a line near that point and reads it. The calculation is determining a file position I can use with .seek() based on the number of bytes or characters in the pattern.
I did some maths in the meantime and came up with a mathematical solution:
...
n = long(gtolnum)
q = len(gtolnum) # I'll refer to this as the number's "level"
x = n + 1 - 10**(q - 1) # Number of entries in the current level
c = x * (q - 1) # Additional digits in the current level
i = 2
p = 0
while i < q:
p += 9 * (q - i) * (10**(q - i)) # Additional digits in i levels previous
i += 1
startFrom = currentPosition - ((n + 1) * 95 + p + c)
...
Seems like there should be a much simpler solution, but I'm not seeing it. Perhaps a log function could help?