python numpy machine-learning array-broadcasting

Shape error with manually made 3D convolutional neural network


I have the following code for a 2D convolution:

matrix = np.array([[1,2,3],[4,5,6],[7,8,9],[10,11,12]])
kernel = np.array([[1,-1],[1,-1]])

kr, kc = kernel.shape
mr, mc = matrix.shape

output = np.empty((mc-kc+1, mr-kr+1))

for row in range(mc-kc+1):
    for column in range(mr-kr+1):
        output[row][column] = (matrix[row:row+kr,column:column+kc]*kernel).sum()

print(output)

I would like to try this on a real color image. Here's what I tried:

#matrix is now 3D
print(matrix.shape)
Out: (340, 360, 3)

#kernel is now 3D
print(kernel.shape)
Out: (34, 36, 3)

kr, kc, kdim = kernel.shape
mr, mc, mdim = matrix.shape

output = np.empty((mc-kc+1, mr-kr+1, 3))

for row in range(mc-kc+1):
    for column in range(mr-kr+1):
        for dim in range(3):
            output[row][column][dim] = (matrix[row:row+kr,column:column+kc]*kernel).sum()

print(output)
Out: ValueError: operands could not be broadcast together with shapes (33,36,3) (34,36,3)

I think there is something wrong with the depth dimension but I still don't see how to fix it. Help?


Solution

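  • You've swapped rows and columns, both in the output declaration and in the loops over row and column. With your shapes, row runs up to mc - kc = 324, but only mr - kr + 1 = 307 full windows fit along the row axis. At row = 307 the slice matrix[307:341] has only 33 rows left, so it can't be broadcast against the 34-row kernel, which is exactly the (33,36,3) vs (34,36,3) mismatch in the error. Try this version: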

    # note: [rows, columns, channel]
    output = np.empty((mr - kr + 1, mc - kc + 1, 3))
    
    # note: `row` loops over matrix rows, `column` - over columns
    for row in range(mr - kr + 1):
      for column in range(mc - kc + 1):
        for dim in range(3):
          output[row][column][dim] = (matrix[row:row + kr, column:column + kc] * kernel).sum()
    
    print(output)
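
One more thing worth checking: the expression inside the loop above doesn't depend on `dim`, so all three channels of `output` end up holding the same total. If what you actually want is one result per color channel (that's an assumption on my part, the question doesn't say), a minimal sketch could look like the one below. The function name `correlate_valid` and the random test data are made up for illustration, and the shapes are copied from the question.

```python
import numpy as np

def correlate_valid(matrix, kernel):
    """Slide `kernel` over `matrix` (rows, columns, channels), no padding.

    Returns one value per window position and per channel.
    """
    mr, mc, mdim = matrix.shape
    kr, kc, kdim = kernel.shape
    assert mdim == kdim, "matrix and kernel need the same number of channels"

    # rows first, then columns, matching the order of matrix.shape
    output = np.empty((mr - kr + 1, mc - kc + 1, mdim))
    for row in range(mr - kr + 1):
        for column in range(mc - kc + 1):
            window = matrix[row:row + kr, column:column + kc]
            # sum over the two spatial axes only, so channels stay separate
            output[row, column] = (window * kernel).sum(axis=(0, 1))
    return output

# hypothetical stand-ins for the image and kernel, same shapes as in the question
rng = np.random.default_rng(0)
matrix = rng.random((340, 360, 3))
kernel = rng.random((34, 36, 3))

output = correlate_valid(matrix, kernel)
print(output.shape)

# the slice that broke the original loop: only 33 rows are left at row 307
print(matrix[307:307 + 34].shape)
```

Summing `output` over its last axis gives back the single all-channel total that the loop above computes, so you can get either behavior from this version.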