Suppose that we have a dataset with N labeled instances, where each instance is a 2 x M matrix. That is, each instance has two rows, and each row is a vector of size M (M columns).
I would like to build a NN whose first layer performs a convolution with a kernel of two rows and one column. The kernel is applied to each column of the input matrix and produces a single value per column, so the output is a vector of size M in which position P is the convolution of the two rows at column P. The following picture illustrates the idea.
I don't know how to build this convolutional layer. Do I need a 1D or a 2D convolution in this case?
I would like to build a NN with the following architecture:
Can you help me build this architecture?
You want a 2D convolution for this. A 1D convolution expects only one spatial dimension, but your input has two, even though the kernel spans one of them completely and therefore only slides along the other.
A 2D convolution expects a 4D input of shape (batch, height, width, channels). The kernel is 4D as well; for tf.nn.conv2d its shape is (filter_height, filter_width, in_channels, out_channels).
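To make the two 4D layouts concrete for the 2 x M case in the question, here is a minimal NumPy-only sketch. The sizes `N` and `M`, the array `data`, and the kernel weights 1.0 and 2.0 are made-up values for illustration, and the `einsum` call is just a hand-written stand-in for what a 2 x 1 convolution with 'VALID' padding would compute; it is not taken from the answer's code.

```python
import numpy as np

N, M = 3, 4  # hypothetical: 3 instances, rows of length 4
data = np.arange(N * 2 * M, dtype=np.float32).reshape(N, 2, M)

# Input layout (batch, height, width, channels): add a channel axis of size 1.
x = data[..., np.newaxis]  # shape (N, 2, M, 1)

# Kernel layout (filter_height, filter_width, in_channels, out_channels).
kernel = np.array([1.0, 2.0], dtype=np.float32).reshape(2, 1, 1, 1)

# The kernel covers the full height, so each column p yields one value:
# sum over h and c of x[n, h, p, c] * kernel[h, 0, c, o].
out = np.einsum('nhwc,hco->nwo', x, kernel[:, 0])  # shape (N, M, 1)

print(x.shape, kernel.shape, out.shape)
```

Each entry of `out` is a weighted sum of the two values in one column, which is the per-column result described in the question.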
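The code below shows the shapes in practice. Here the three positions run along the height axis and the two values to combine run along the width axis, so the kernel has shape (1, 2, 1, 1):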
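import numpy as np
import tensorflow as tf

# Input layout: (batch, height, width, channels) = (1, 3, 2, 1)
inp = np.array([[[[2.1],[0.8]],[[1.3],[2.4]],[[1.8],[1.3]]]], dtype=np.float32)
# Kernel layout: (filter_height, filter_width, in_channels, out_channels) = (1, 2, 1, 1)
kernel = np.array([[[[1.0]],[[2.0]]]], dtype=np.float32)
print('input shape ->',inp.shape)
print('kernel shape ->',kernel.shape)
# 'VALID' padding: the kernel spans the full width, so each row gives one value
result = tf.nn.conv2d(inp, kernel, strides=(1,1,1,1), padding='VALID')
print('result shape ->',result.shape)
print(result.numpy())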
Output:
input shape -> (1, 3, 2, 1)
kernel shape -> (1, 2, 1, 1)
result shape -> (1, 3, 1, 1)
[[[[3.6999998]]

  [[6.1000004]]

  [[4.3999996]]]]
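For the layout in the question (two rows, M columns), one possible way to express the same idea as a trainable first layer is a Keras `Conv2D` with `kernel_size=(2, 1)`. This is only a sketch under assumptions: `N`, `M`, the random `data`, the single filter, and the `Flatten` step are illustrative choices, not something stated in the question or the answer above.

```python
import numpy as np
import tensorflow as tf

N, M = 4, 5  # hypothetical dataset size and row length
data = np.random.rand(N, 2, M).astype(np.float32)

# Add a trailing channel axis: (N, 2, M) -> (N, 2, M, 1)
x = data[..., np.newaxis]

model = tf.keras.Sequential([
    # A 2x1 kernel slides along the M columns only: (N, 2, M, 1) -> (N, 1, M, 1)
    tf.keras.layers.Conv2D(filters=1, kernel_size=(2, 1), padding='valid'),
    # Drop the singleton axes: (N, 1, M, 1) -> (N, M)
    tf.keras.layers.Flatten(),
])

out = model(x)
print(out.shape)
```

Further layers (for example `Dense`) could be appended after `Flatten` to complete the network. With more than one filter, the flattened output would have M times the number of filters entries per instance rather than M.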