Tags: python, python-3.x, caching, in-memory

Python in-memory files for caching large files


I am processing a very large file (16 GB) and would like to speed up the operations by storing the entire file in RAM, so that disk latency is no longer a factor.

I looked into the existing libraries but couldn't find anything that offers the interface flexibility I need.

Ideally, I would like something with a built-in readline() method, so that the behavior matches the standard file-reading interface.
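
For reference, the standard library already offers one interface close to what is described here: io.BytesIO wraps an in-memory byte buffer and exposes the usual file API, including readline(). The sketch below is only an illustration of that interface; "data.txt" is a placeholder path, and the whole file is read into RAM up front.

```python
import io

# Load the entire file into an in-memory buffer (placeholder path).
with open("data.txt", "rb") as f:
    buf = io.BytesIO(f.read())

# BytesIO behaves like a regular file object opened in binary mode.
first_line = buf.readline()
print(first_line)
```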


Solution

  • Homer512 answered my question for the most part: mmap is a good option, since the operating system's page cache keeps the mapped file in memory (see the mmap sketch after this list).

    Alternatively, NumPy has interfaces for parsing byte streams from a file directly into memory, and it also ships some string-related operations (see the NumPy sketch below).
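
A minimal sketch of the mmap approach, assuming a line-oriented text file at the placeholder path "data.txt". mmap objects expose a readline() method, so the access pattern stays close to the standard file interface while the OS page cache keeps hot pages in RAM.

```python
import mmap

with open("data.txt", "rb") as f:
    # length=0 maps the whole file; ACCESS_READ avoids accidental writes.
    with mmap.mmap(f.fileno(), length=0, access=mmap.ACCESS_READ) as mm:
        line_count = 0
        while True:
            line = mm.readline()  # behaves like file.readline(); returns b"" at EOF
            if not line:
                break
            line_count += 1

print(line_count)
```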
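And a hedged sketch of the NumPy route: np.fromfile reads the raw bytes straight into a contiguous in-memory array, and the np.char helpers provide vectorised string operations. Again, "data.txt" is a placeholder path, and splitting on newlines this way is just one illustration, not the answer's exact recipe.

```python
import numpy as np

# Pull the raw bytes of the file into a single in-memory array (placeholder path).
raw = np.fromfile("data.txt", dtype=np.uint8)

# Split the buffer into lines and apply a vectorised string operation.
lines = np.array(raw.tobytes().split(b"\n"))
upper = np.char.upper(lines)  # example of NumPy's string-related helpers

print(len(lines), upper[:3])
```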