Tags: python, file, memory, buffer, micropython

How do I use a file like a memory buffer in Python?


I don't know the correct terminology; maybe it's called a page file, but I'm not sure. I need a way to use an on-disk file as a buffer, like a bytearray. It should support operations like a = buffer[100:200] and buffer[33] = 127 without the calling code having to know that it is reading from and writing to a file in the background.

Basically I need the opposite of BytesIO, which uses memory with a file interface. I need a way to use a file with a memory buffer interface. Ideally it doesn't write to the file every time the data is changed (but it's OK if it does).

I need this because I use packages that expect data to be in a buffer object, but I only have 4 MB of memory available, so it's impossible to load the files into memory. I need an object that acts like a bytearray, for example, but reads and writes data directly to a file rather than memory.

In my use case I need a MicroPython module, but a standard Python module might work as well. Are there any modules that would do what I need?


Solution

  • Can something like this work for you?

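    class Memfile:
        """Expose a binary file opened in 'r+b' mode through bytearray-style indexing."""

        def __init__(self, file):
            self.file = file

        def __len__(self):
            # Seeking to the end of the file returns its size in bytes.
            return self.file.seek(0, 2)

        def _bounds(self, key):
            # Resolve a slice to (start, stop). Steps and negative indices are not supported.
            assert key.step in (None, 1)
            start = 0 if key.start is None else key.start
            stop = len(self) if key.stop is None else key.stop
            return start, stop

        def __getitem__(self, key):
            if isinstance(key, slice):
                start, stop = self._bounds(key)
                self.file.seek(start)
                return self.file.read(stop - start)
            self.file.seek(key)
            return self.file.read(1)[0]  # an int, as with bytearray

        def __setitem__(self, key, val):
            if isinstance(key, slice):
                start, stop = self._bounds(key)
                # Only same-length replacement is supported; the file is never resized.
                assert stop - start == len(val)
                self.file.seek(start)
                self.file.write(val)
            else:
                self.file.seek(key)
                self.file.write(bytes([val]))  # val is an int in range(256)

        def close(self):
            self.file.close()


    if __name__ == "__main__":
        # Assumes the file 'data' already exists and holds at least 10 bytes.
        mf = Memfile(open("data", "r+b"))
        mf[0:10] = b'\x00'*10
        print(mf[0:10]) # b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
        mf[0:2] = b'\xff\xff'
        print(mf[0:10]) # b'\xff\xff\x00\x00\x00\x00\x00\x00\x00\x00'
        print(mf[2]) # 0
        print(mf[1]) # 255
        mf[0:4] = b'\xde\xad\xbe\xef'
        print(mf[0:4]) # b'\xde\xad\xbe\xef'
        mf[3] = 127
        print(mf[3]) # 127
        mf.close()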
    

    Note that if this solution fits your needs, you will need to test it thoroughly.
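
Since the question also allows a standard Python module, it may be worth comparing against the standard library's mmap, which maps a file into the address space and supports the same indexing and slice assignment as a bytearray. The sketch below is a minimal illustration under some assumptions: it targets CPython (as far as I know mmap is not part of MicroPython's standard builds), the helper name open_mapped and the 16-byte temporary file are made up for the demo, and the file must already exist and be non-empty.

```python
import mmap
import os
import tempfile


def open_mapped(path):
    """Hypothetical helper: return (file, mmap) for an existing, non-empty file."""
    f = open(path, "r+b")
    # A length of 0 maps the whole file; mapping an empty file raises an error.
    return f, mmap.mmap(f.fileno(), 0)


# Demo on a throwaway 16-byte file (path and size are illustrative only).
path = os.path.join(tempfile.mkdtemp(), "data")
with open(path, "wb") as f:
    f.write(bytes(16))

f, buf = open_mapped(path)
buf[0:4] = b"\xde\xad\xbe\xef"  # slice assignment must keep the same length
buf[5] = 127                     # single items are ints, as with bytearray
first = bytes(buf[0:4])
item = buf[5]
buf.flush()                      # ask the OS to write pending changes to disk
buf.close()
f.close()

# Read the file back through the normal file API to confirm the writes landed.
with open(path, "rb") as f:
    on_disk = f.read()
```

The operating system pages data in and out on demand, so changes are not necessarily written on every assignment, which matches the "ideally" part of the question. Like the class above, a mapping cannot grow or shrink the file through slice assignment.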