Currently, I am tuning my model by testing the Kernel size.
I have the following code:
x = embedding_layer(input_4)
x = Conv1D(FILTERS, KERNEL, activation='relu')(x)
x = Dropout(DROPOUT)(x)
x = Conv1D(FILTERS, KERNEL, activation='relu')(x)
x = Dropout(DROPOUT)(x)
x = Conv1D(FILTERS, KERNEL, activation='relu')(x)
x = Dropout(DROPOUT)(x)
x = Conv1D(FILTERS, KERNEL, activation='relu')(x)
x = Dropout(DROPOUT)(x)
x = Conv1D(FILTERS, KERNEL, activation='relu')(x)
x = Dropout(DROPOUT)(x)
x = MaxPooling1D(3)(x)
x = Conv1D(FILTERS, KERNEL, activation='relu')(x)
x = Dropout(DROPOUT)(x)
x = Conv1D(FILTERS, KERNEL, activation='relu')(x)
x = Dropout(DROPOUT)(x)
x = Conv1D(FILTERS, KERNEL, activation='relu')(x)
x = Dropout(DROPOUT)(x)
x = Conv1D(FILTERS, KERNEL, activation='relu')(x)
x = Dropout(DROPOUT)(x)
x = Conv1D(FILTERS, KERNEL, activation='relu')(x)
x = Dropout(DROPOUT)(x)
x = MaxPooling1D(3)(x)
When the kernel size is 2 or 3, the network runs fine, but from 4 onwards it runs into an error about the dimensionality. I suspect that it has to do with the stride length. However, the Keras website (https://keras.io/layers/convolutional/) does not say what the default stride length is.
My question now is: what is the default stride length in Keras' Conv1D? And what would be a good stride length for a kernel size of 4 and for a kernel size of 5?
According to the Conv1D documentation, the default stride length is 1. Unless you have a concrete justification for another length, a stride length of 1 is usually appropriate.
The error you get probably arises because, without padding, the output dimension of a 1D convolutional layer is:
output_dim = 1 + floor((input_dim - kernel_size) / stride)
And after stacking several 1D convolutional layers, you might be reaching a layer in which the input dimensionality is smaller than the kernel size. This happens because the default value for the argument padding is 'valid', which means that the input is not padded.
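To see where the length runs out, here is a minimal sketch in plain Python that traces the sequence length through the layer stack from the question (two blocks of five Conv1D layers, each followed by MaxPooling1D(3)). The input length of 60 is a hypothetical value chosen for illustration, since the question does not state it, and the helper names are made up for this example. Dropout is omitted because it does not change the length.

```python
def conv1d_valid_len(n, kernel, stride=1):
    """Output length of a 1D convolution with padding='valid'."""
    return (n - kernel) // stride + 1


def pool1d_valid_len(n, pool):
    """Output length of 1D max pooling whose stride equals the pool size."""
    return (n - pool) // pool + 1


def trace_lengths(input_len, kernel):
    """Lengths after each layer; stops early once a length drops below 1."""
    lengths = [input_len]
    n = input_len
    for _ in range(2):                # two conv blocks, as in the question
        for _ in range(5):            # five Conv1D layers per block
            n = conv1d_valid_len(n, kernel)
            lengths.append(n)
            if n < 1:
                return lengths
        n = pool1d_valid_len(n, 3)    # MaxPooling1D(3)
        lengths.append(n)
        if n < 1:
            return lengths
    return lengths


INPUT_LEN = 60  # hypothetical input length, not taken from the question
for kernel in (2, 3, 4, 5):
    print(kernel, trace_lengths(INPUT_LEN, kernel))
```

With this assumed input length, kernel sizes 2 and 3 reach the end of the stack with a positive length, while kernel sizes 4 and 5 hit a non-positive length partway through the second block, which is the kind of situation in which Keras reports a dimensionality error.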
If instead you want to preserve the input dimensionality at each convolutional layer, setting padding='same' results in padding the input such that the output has the same length as the original input.
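As a rough check of what padding='same' does to the lengths, the sketch below uses the rule that a 'same' convolution outputs ceil(input_dim / stride) steps, so with a stride of 1 the kernel size no longer affects the length. The input length of 60 and the function names are assumptions made for this example; the pooling layers are assumed to keep their default unpadded behaviour.

```python
import math


def conv1d_same_len(n, stride=1):
    """Output length of a 1D convolution with padding='same'."""
    return math.ceil(n / stride)


def pool1d_valid_len(n, pool):
    """Output length of 1D max pooling whose stride equals the pool size."""
    return (n - pool) // pool + 1


def final_length_same(input_len):
    """Length after two blocks of five 'same' convolutions plus pooling by 3."""
    n = input_len
    for _ in range(2):
        for _ in range(5):
            n = conv1d_same_len(n)   # unchanged for stride 1, any kernel size
        n = pool1d_valid_len(n, 3)   # only the pooling layers shrink the input
    return n


print(final_length_same(60))  # hypothetical input length of 60
```

In the Keras model this corresponds to passing padding='same' to each Conv1D call, for example Conv1D(FILTERS, KERNEL, activation='relu', padding='same').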