I am doing very large data processing (16GB) and I would like to try and speed up the operations through storing the entire file in RAM in order to deal with disk latency.
I looked into the existing libraries but couldn't find anything that would give me the flexibility of interface.
Ideally, I would like to use something with integrated read_line() method so that the behavior is similar to the standard file reading interface.
Homer512 answered my question somewhat, mmap is a nice option that results in caching.
Alternatively, Numpy has some interfaces for parsing byte streams from a file directly into memory and it also comes with some string-related operations.