I have some log files I would like to sift through. The content is exactly what you would expect in a log file: many single lines of comma-separated text. The files are about 4 GB each. File.each_line or foreach takes about 20 minutes for one of them.
Since a simple foreach seems... simple (and slow), I was thinking that two separate threads might be able to work on the same file if I could only tell them where to start. But based on my (limited) knowledge, I can't decide if this is even possible.
Is there a way to start reading the file at an arbitrary line?
For lines, it might be a bit difficult, but you can seek within a file to a certain byte.
IO#seek and IO#pos= both let you jump to a given byte offset within the file (IO#pos without the = only reports the current offset).
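To turn a byte offset into a line boundary, one approach is to seek to just before the offset and throw away whatever remains of the line you landed in. Here is a minimal sketch of that idea; `each_line_in_range` and its parameters are names I made up for illustration, and it assumes lines end in "\n".

```ruby
# Hypothetical helper: yields every line that *starts* at a byte offset
# in the half-open range [start_byte, end_byte).
def each_line_in_range(path, start_byte, end_byte)
  File.open(path, "rb") do |f|
    if start_byte > 0
      # Step back one byte and read through the next newline. If that byte
      # is itself a newline, nothing useful is lost; otherwise we landed
      # mid-line and the remainder belongs to the previous chunk.
      f.seek(start_byte - 1)
      f.gets
    end

    # Keep reading while the next line starts inside our range.
    while f.pos < end_byte && (line = f.gets)
      yield line
    end
  end
end
```

With this, you could split the file at `File.size(path) / 2` and give each half to its own worker; since a line is owned by the chunk in which it starts, no line should be read twice or skipped. Whether two threads actually finish faster is a separate question: on MRI the global VM lock keeps Ruby code from running in parallel, so if the time goes into parsing rather than disk reads, separate processes may be the better fit.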