
ValueError: could not broadcast input array from shape (29097,280,212,3) into shape (280,212,3)


I have 5 large NumPy arrays that I want to merge into one array. np.concatenate doesn't help because it fails with MemoryError: Unable to allocate ..., so I decided to use np.memmap. The shapes of my arrays are as follows:

#print(arrayA.shape) (29097, 280, 212, 3)
#print(arrayB.shape) (16058, 280, 212, 3)
#print(arrayC.shape) (15412, 280, 212, 3)
#print(arrayD.shape) (21634, 280, 212, 3)
#print(arrayF.shape) (9477, 280, 212, 3)

My code:

    import glob
    import numpy as np
    npfiles= glob.glob("D:/mycode/*.npy")
    npfiles.sort()
    #print(npfiles)
    # create a memory-mapped array 
    pred = np.memmap('memm4', dtype='uint8', mode='w+', shape=(91678,280,212,3))
    print(pred.shape)

    for i, npfile in enumerate(npfiles):
        pred[i,:,:,:] = np.load(npfile)
    np.save('D:/mycode/pred.npy',pred)

But it gives me this error: "could not broadcast input array from shape (29097,280,212,3) into shape (280,212,3)". Could someone help me? Thanks.


Solution

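  • Currently you are trying to put a 4-dimensional array into a 3-dimensional slot. The variable i holds the index of the file (0 to 4), so pred[i,:,:,:] selects a single entry of shape (280,212,3), while np.load(npfile) returns a whole array, of shape (29097,280,212,3) for the first file. You need to indicate the range of rows along the first axis where each array is going to be stored: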

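    last_index = 0  # next free row along the first axis of pred
    for npfile in npfiles:
        temporary_array = np.load(npfile)
        n_rows = len(temporary_array)
        # Write the whole 4-D block into a 4-D slice of pred with the same shape
        pred[last_index:last_index + n_rows, :, :, :] = temporary_array
        last_index += n_rows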
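If it helps, here is a small self-contained sketch of the same offset-based copy that you can run on toy data first. It makes a few assumptions that are not in the question: the helper name merge_npy_files, the temporary file names, and the tiny array sizes are made up for illustration. It also swaps in numpy.lib.format.open_memmap, which writes a valid .npy header directly, so the final np.save step may not be needed, and it opens the inputs with mmap_mode="r" so that each file is not loaded fully into memory.

```python
import os
import tempfile

import numpy as np
from numpy.lib.format import open_memmap


def merge_npy_files(npfiles, out_path):
    """Concatenate .npy files along axis 0 into out_path, one file at a time."""
    # Memory-map each input read-only; the data stays on disk until it is copied.
    sources = [np.load(f, mmap_mode="r") for f in npfiles]
    total_rows = sum(src.shape[0] for src in sources)
    # open_memmap creates a real .npy file of the final size on disk.
    merged = open_memmap(out_path, mode="w+", dtype=sources[0].dtype,
                         shape=(total_rows,) + sources[0].shape[1:])
    start = 0
    for src in sources:
        stop = start + src.shape[0]
        merged[start:stop] = src  # 4-D block into a 4-D slice of equal shape
        start = stop
    merged.flush()
    return merged


# Tiny stand-ins for the real arrays (hypothetical sizes and file names).
tmpdir = tempfile.mkdtemp()
paths = []
for k, n in enumerate([5, 3, 4]):
    path = os.path.join(tmpdir, f"part{k}.npy")
    np.save(path, np.full((n, 2, 2, 3), k, dtype=np.uint8))
    paths.append(path)

out_file = os.path.join(tmpdir, "merged.npy")
result = merge_npy_files(paths, out_file)
print(result.shape)  # (12, 2, 2, 3)
```

If you write the output into the same folder that glob scans, a second run would pick up the merged file as an input too, so it is probably safer to save it somewhere else or filter it out.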
    

    You might also want to try something like HDF5, which lets you store large arrays easily.
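For the HDF5 route, a minimal sketch could look like the following. It assumes the third-party h5py package is installed; the file name pred.h5, the dataset name "pred", and the toy array sizes are placeholders I made up, not anything from the question. The idea is the same as above: create one dataset with the final shape and fill it slice by slice.

```python
import os
import tempfile

import h5py  # third-party package, assumed to be installed
import numpy as np

tmpdir = tempfile.mkdtemp()
h5_path = os.path.join(tmpdir, "pred.h5")  # hypothetical output file

# Tiny stand-ins for the five large arrays.
parts = [np.full((n, 2, 2, 3), k, dtype=np.uint8) for k, n in enumerate([5, 3, 4])]
total_rows = sum(len(p) for p in parts)

with h5py.File(h5_path, "w") as f:
    # The dataset lives on disk; only the slice being written is held in memory.
    dset = f.create_dataset("pred", shape=(total_rows, 2, 2, 3), dtype="uint8")
    start = 0
    for p in parts:
        dset[start:start + len(p)] = p
        start += len(p)

with h5py.File(h5_path, "r") as f:
    stored_shape = f["pred"].shape
    first_rows = f["pred"][:2]  # reads only the requested rows from disk

print(stored_shape)  # (12, 2, 2, 3)
```

With the real data you would load one .npy file per loop iteration instead of building the parts list up front, so that only one input array is in memory at a time.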