I have five large NumPy arrays that I want to merge into one. np.concatenate doesn't help because it raises MemoryError: Unable to allocate ..., so I decided to use np.memmap. The shapes of my arrays are as follows:
#print(arrayA.shape) (29097, 280, 212, 3)
#print(arrayB.shape) (16058, 280, 212, 3)
#print(arrayC.shape) (15412, 280, 212, 3)
#print(arrayD.shape) (21634, 280, 212, 3)
#print(arrayF.shape) (9477, 280, 212, 3)
My code:
import glob
import numpy as np
npfiles= glob.glob("D:/mycode/*.npy")
npfiles.sort()
#print(npfiles)
# create a memory-mapped array
pred = np.memmap('memm4', dtype='uint8', mode='w+', shape=(91678,280,212,3))
print(pred.shape)
for i, npfile in enumerate(npfiles):
    pred[i,:,:,:] = np.load(npfile)
np.save('D:/mycode/pred.npy',pred)
but it gives me this error: "could not broadcast input array from shape (29097,280,212,3) into shape (280,212,3)". Could someone help me? Thanks.
Currently you are trying to put a 4-dimensional array into a 3-dimensional slot. The variable i contains the index of the file, from 0 to 4, so pred[i,:,:,:] selects a single row of pred and has only three dimensions, with shape (280, 212, 3). Instead, you need to indicate the range of rows in pred where each array is going to be stored:
last_index = 0
for npfile in npfiles:
    temporary_array = np.load(npfile)
    # write this file's rows into the next free block of pred
    pred[last_index:last_index + len(temporary_array),:,:,:] = temporary_array
    last_index += len(temporary_array)
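The same offset-based loop can be tried end to end on small data. The sketch below is only an illustration under stated assumptions: the file names, the tiny shapes, and the temporary directory are hypothetical stand-ins for the real files. It also swaps in two NumPy features that the code above does not use: np.load(..., mmap_mode="r"), so each input is mapped from disk instead of being read fully into RAM, and np.lib.format.open_memmap, which creates the output directly as a .npy file so no separate np.save step is needed.

```python
import os
import tempfile

import numpy as np

# Hypothetical stand-ins for the five large .npy files: tiny arrays with
# different lengths along axis 0 and the same trailing shape.
tmpdir = tempfile.mkdtemp()
trailing_shape = (4, 3, 3)
npfiles = []
for k, n in enumerate([5, 2, 3]):
    path = os.path.join(tmpdir, f"part{k}.npy")
    np.save(path, np.full((n,) + trailing_shape, k, dtype=np.uint8))
    npfiles.append(path)

# First pass: map each file read-only to get its length without loading the data.
total = sum(np.load(p, mmap_mode="r").shape[0] for p in npfiles)

# Create the output as a memory-mapped .npy file of the final size.
out_path = os.path.join(tmpdir, "pred.npy")
pred = np.lib.format.open_memmap(
    out_path, mode="w+", dtype=np.uint8, shape=(total,) + trailing_shape
)

# Second pass: copy each file into its own block of rows.
last_index = 0
for p in npfiles:
    part = np.load(p, mmap_mode="r")
    pred[last_index:last_index + len(part)] = part
    last_index += len(part)

pred.flush()  # make sure everything is written to disk
del pred

# The result is a regular .npy file that can itself be memory-mapped.
merged = np.load(out_path, mmap_mode="r")
print(merged.shape)
```

Computing the total length from the files themselves avoids hard-coding a number such as 91678 in the shape, which has to equal the sum of the first dimensions of all inputs.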
You might also want to try something like HDF5, which lets you store large arrays on disk easily.
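As a rough sketch of what the HDF5 route could look like, assuming the third-party h5py package is installed (the answer does not name a specific library): the dataset name "pred", the file name, and the small arrays are all hypothetical, chosen only so the example runs quickly.

```python
import os
import tempfile

import h5py
import numpy as np

# Hypothetical small arrays standing in for the five large ones.
parts = [np.full((n, 4, 3, 3), k, dtype=np.uint8) for k, n in enumerate([5, 2, 3])]
total = sum(len(p) for p in parts)

h5_path = os.path.join(tempfile.mkdtemp(), "pred.h5")

# Create one on-disk dataset of the final size and fill it block by block.
with h5py.File(h5_path, "w") as f:
    dset = f.create_dataset("pred", shape=(total, 4, 3, 3), dtype="uint8")
    last_index = 0
    for part in parts:
        dset[last_index:last_index + len(part)] = part
        last_index += len(part)

# Reading back: slicing the dataset loads only the requested rows.
with h5py.File(h5_path, "r") as f:
    stored_shape = f["pred"].shape
    first_rows = f["pred"][:2]

print(stored_shape, first_rows.shape)
```

The indexing logic is the same as with the memmap; the difference is that the data lives in an HDF5 dataset that is sliced on demand when reading.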