I know there already exists a similar question, which has not been answered.
I have a very large numpy array saved in an npz file. I don't want to load it completely (it doesn't fit in my RAM); I just want to load a part of it.
This is how the file was generated:
np.savez_compressed('file_name.npz', xxx)
And this is how I would like to load it:
xxx = np.load('file_name.npz', mmap_mode="r")
Now, to actually access the part of the array I am interested in, I should type
a = xxx['arr_0'][0][0][0]
But even though this piece is quite small, Python first loads the whole array (I know because my RAM fills up) and only then shows this small part. The same happens if I directly write
xxx = np.load('file_name.npz', mmap_mode="r")['arr_0'][0][0][0]
What am I doing wrong?
mmap_mode does not work with an npz file. An npz is a zip archive. That is, it contains npy files, one per key. You can see this by looking at the npz file with an OS archive manager tool.
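If an archive manager is not at hand, the same check can be sketched with Python's standard zipfile module. The file here is a small stand-in written to a temporary directory (an assumption for illustration, not the original large array):

```python
import os
import tempfile
import zipfile

import numpy as np

# Small stand-in for the real file, written to a temporary directory.
tmp_dir = tempfile.mkdtemp()
npz_path = os.path.join(tmp_dir, "file_name.npz")
np.savez_compressed(npz_path, np.arange(24).reshape(2, 3, 4))

# An npz is a zip archive: list the npy members it contains.
with zipfile.ZipFile(npz_path) as archive:
    member_names = archive.namelist()

print(member_names)  # ['arr_0.npy']
```

An array passed positionally to savez_compressed is stored under the key arr_0, hence the member name arr_0.npy.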
I'm a little surprised that your load call doesn't raise an error, but looking at the code I see that it dispatches to the NpzFile loader without even looking at the mmap_mode parameter.
To use mmap, you'll have to extract arr_0.npy (again using the OS tool), and use load on it.
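The extraction step can also be done from Python. The following is a minimal sketch, assuming there is enough disk space for the uncompressed arr_0.npy; it uses a small stand-in array and a temporary directory, both of which are illustrative choices rather than part of the original setup:

```python
import os
import tempfile
import zipfile

import numpy as np

# Small stand-in for the real file, written to a temporary directory.
tmp_dir = tempfile.mkdtemp()
npz_path = os.path.join(tmp_dir, "file_name.npz")
np.savez_compressed(npz_path, np.arange(24, dtype=np.int64).reshape(2, 3, 4))

# Extract the (decompressed) npy member to disk.
with zipfile.ZipFile(npz_path) as archive:
    npy_path = archive.extract("arr_0.npy", path=tmp_dir)

# Memory-map the plain npy file; data is read only for the parts accessed.
arr = np.load(npy_path, mmap_mode="r")
a = arr[0][0][0]

print(type(arr).__name__, arr.shape, a)
```

Here arr is a numpy.memmap, so indexing it touches only the requested part of the file. The extracted file is uncompressed and may be much larger on disk than the npz.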