python tensorflow keras neural-network conv-neural-network

Keras: Should I use a 1D or a 2D convolutional layer in this case?

Suppose that we have a dataset with N labeled instances, where each instance is a 2 x M matrix. That is, each instance has two rows, and each row is a vector of size M (M columns).

I would like to build a NN whose first layer performs a convolution with a kernel of two rows and one column. The kernel is applied to each column of the input matrix and produces a single value per column. The output is therefore a vector of size M, where position P is the result of convolving the two rows at column P. The following picture illustrates the idea.

I don't know how to build this convolutional layer. Do I need a 1D or a 2D convolution in this case?

I would like to build a NN with the following architecture:

Convolutional layer with inputs of 2 X M and outputs of M. I would like to apply k kernels (producing k vectors of size M)
Dense layer with 500 neurons and relu activation.
Dropout of 0.2
Dense layer with 2 neurons and softmax activation.

Can you help me in building this architecture?

Solution

You want a 2D convolution for this. A 1D convolution expects only one spatial dimension, but your input has two (rows and columns), even though the kernel spans one of them completely and so only slides along the other.

A 2D convolution expects a 4D input of shape (batch, height, width, channels). The kernel is 4D as well; for tf.nn.conv2d its shape is (filter_height, filter_width, in_channels, out_channels).
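To make concrete what a two-row, one-column kernel computes on a 2 x M instance, here is a minimal NumPy sketch. The sizes M and k and the random weights are hypothetical, chosen only for illustration, and bias and activation are left out. Each of the k kernels is a pair of weights, so applying all of them to every column is just a (k, 2) by (2, M) matrix product.

```python
import numpy as np

M, k = 5, 3                      # hypothetical sizes, not from the question
rng = np.random.default_rng(0)

x = rng.normal(size=(2, M))      # one instance: 2 rows, M columns
w = rng.normal(size=(k, 2))      # k kernels, each with 2 weights (2 rows x 1 column)

# Apply every kernel to every column: out[f, p] = w[f, 0]*x[0, p] + w[f, 1]*x[1, p]
out = w @ x                      # shape (k, M): k vectors of size M

# Same computation written as explicit loops, as a sanity check
ref = np.empty((k, M))
for f in range(k):
    for p in range(M):
        ref[f, p] = w[f, 0] * x[0, p] + w[f, 1] * x[1, p]

print(out.shape)                 # (3, 5)
print(np.allclose(out, ref))     # True
```

A 2D convolutional layer with a (2, 1) kernel and k filters performs this same per-column weighted sum, plus a bias and an optional activation.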

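The code below shows the shapes involved. Note that the input here is laid out as M x 2 (with M = 3), so the kernel is 1 x 2 rather than 2 x 1: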

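import numpy as np
import tensorflow as tf

# Input layout: (batch, height, width, channels) = (1, 3, 2, 1)
inp = np.array([[[[2.1],[0.8]],[[1.3],[2.4]],[[1.8],[1.3]]]], dtype=np.float32)

# Kernel layout: (filter_height, filter_width, in_channels, out_channels) = (1, 2, 1, 1)
kernel = np.array([[[[1.0]],[[2.0]]]], dtype=np.float32)

print('input shape ->',inp.shape)
print('kernel shape ->',kernel.shape)

# 'VALID' padding: the kernel covers the full width, so each row yields one value
result = tf.nn.conv2d(inp, kernel, strides=(1,1,1,1), padding='VALID')

print('result shape ->',result.shape)
print(result.numpy())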

Output:

input shape -> (1, 3, 2, 1)
kernel shape -> (1, 2, 1, 1)
result shape -> (1, 3, 1, 1)

[[[[3.6999998]]

  [[6.1000004]]

  [[4.3999996]]]]