I am new to working with numpy.core.memmap objects and am having trouble figuring out how to edit an existing .npy file read into Python using numpy.memmap(). For example, following the example from Scipy.org, I can create an object and write to it, but once it is created, I cannot modify the contents.
import numpy as np
from tempfile import mkdtemp
import os.path as path
data = np.arange(12, dtype='float32')
data.resize((3,4))
filename = path.join(mkdtemp(), 'newfile.dat')
fp = np.memmap(filename, dtype='float32', mode='w+', shape=(3,4))
fp[:] = data[:]  # write data to the fp array
del fp  # remove the fp object; pending changes are flushed to disk
fpc = np.memmap(filename, dtype='float32', mode='c', shape=(3,4))  # copy-on-write: writeable in memory only
fpc[0,:] = 0
del fpc  # close the object
This simply deletes the object from memory, but the file at filename is not modified. I have also tried numpy.memmap.flush(fpc), but this doesn't seem to work either.
I understand from reading other posts that one can simply copy the edited .npy file to another location, but this seems like it could become problematic in terms of disk space. Is it correct that you cannot modify an existing .npy file?
NumPy interprets "copy on write" as "write changes to RAM, but don't save them to disk" (docs). This is a fairly standard interpretation when referring to data that could be shared between threads or processes. It sounds like you confused copy on write with snapshots (which sometimes use similar terminology, but refer to disk writes rather than RAM).
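To make the copy-on-write behaviour concrete, here is a minimal sketch. The scratch file name cow_demo.dat is made up for the illustration; everything else uses only np.memmap as in the question. It edits the array in mode "c" and then reopens the file read-only to show that the data on disk is untouched.

```python
import os
import tempfile

import numpy as np

# Hypothetical scratch file, used only for this illustration.
filename = os.path.join(tempfile.mkdtemp(), "cow_demo.dat")

# Create the file and fill it with the values 0..11.
fp = np.memmap(filename, dtype="float32", mode="w+", shape=(3, 4))
fp[:] = np.arange(12, dtype="float32").reshape(3, 4)
fp.flush()
del fp

# Copy-on-write: the assignment changes only the in-memory copy.
fpc = np.memmap(filename, dtype="float32", mode="c", shape=(3, 4))
fpc[0, :] = 0
in_memory_row = fpc[0].tolist()  # zeros, as seen through the mapping
del fpc

# Reopen read-only and copy into a plain array: the first row on disk
# still holds its original values.
on_disk = np.array(np.memmap(filename, dtype="float32", mode="r", shape=(3, 4)))
print(on_disk[0])
```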
If you change mode="c" to mode="r+" (or eliminate the mode keyword, as "r+" is the default anyway), this should solve your problem.
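As a rough sketch of that fix, assuming the same 3x4 float32 layout as in the question (the file name rplus_demo.dat is hypothetical), opening the existing file with mode="r+" lets the edit reach the disk:

```python
import os
import tempfile

import numpy as np

# Hypothetical scratch file, used only for this illustration.
filename = os.path.join(tempfile.mkdtemp(), "rplus_demo.dat")

# Create the file and fill it with the values 0..11.
fp = np.memmap(filename, dtype="float32", mode="w+", shape=(3, 4))
fp[:] = np.arange(12, dtype="float32").reshape(3, 4)
del fp

# Reopen the existing file for reading and writing.
fpr = np.memmap(filename, dtype="float32", mode="r+", shape=(3, 4))
fpr[0, :] = 0
fpr.flush()  # push the change to disk explicitly
del fpr

# Reopen read-only to confirm that the file itself was modified.
check = np.array(np.memmap(filename, dtype="float32", mode="r", shape=(3, 4)))
print(check)
```

Only the first row is rewritten in place; no second copy of the file is needed.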
Additionally, I would like to point out that in most cases it is simpler and more Pythonic to use np.save and np.load, and simply specify the mmap_mode keyword with the correct mode when loading the file. While this technically limits flexibility, it eliminates the need to specify a few keywords (such as dtype and shape, which the .npy header already records), making things a bit more concise.
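A minimal sketch of that approach, with a hypothetical path newfile.npy: np.save writes a real .npy file, and np.load with mmap_mode="r+" returns a memory-mapped array that can be edited in place.

```python
import os
import tempfile

import numpy as np

# Hypothetical scratch file, used only for this illustration.
npy_path = os.path.join(tempfile.mkdtemp(), "newfile.npy")

# np.save stores dtype and shape in the .npy header.
np.save(npy_path, np.arange(12, dtype="float32").reshape(3, 4))

# Memory-map the existing .npy file for reading and writing.
arr = np.load(npy_path, mmap_mode="r+")
arr[0, :] = 0
arr.flush()  # write the change back to the file
del arr

# A normal load now sees the modified first row.
reloaded = np.load(npy_path)
print(reloaded)
```

Note that a file written by np.save starts with a header, so it is not interchangeable with the raw .dat file that a bare np.memmap call creates.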