I've been playing around with text classification in PyTorch and I've encountered a problem with 1-dimensional convolutions.
I've set up an embedding layer with output dimensions (x, y, z), where:
x - denotes the batch size
y - denotes the length of a sentence (fixed with padding, so 40 words)
z - denotes the dimensionality of the pre-trained word embeddings (100 for now)
For simplicity's sake, let's assume I put in a matrix of shape (1, 40, 100).
However, to my knowledge, once I perform torch.nn.Conv1d(*args) with a kernel size of 3, the resulting matrix becomes (batch size = 1, word size = 40, feature map size = 98).
Basically, as I understand it, the kernel convolves along the z axis (the embeddings) instead of the y axis (the words), and in turn it does not capture the spatial relationship between word embeddings.
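Here is a minimal sketch that reproduces what I'm seeing (out_channels = 40 is just chosen to match the shapes above):

import torch
from torch import nn

emb = torch.randn(1, 40, 100)  # (batch, sentence_length, embedding_size)
conv = nn.Conv1d(in_channels=40, out_channels=40, kernel_size=3)
print(conv(emb).shape)  # torch.Size([1, 40, 98]), the kernel slid along the embedding axis (z)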
Is there any way to configure the convolutional layer so it computes feature maps along a different axis?
TL;DR:
The Torch conv1d layer behaves this way on the embedding layer: (image: the kernel sliding along the embedding axis, z)
But I want it to behave like this: (image: the kernel sliding along the sentence axis, y)
Any help would be much appreciated.
conv1d expects its input to have size (batch_size, num_channels, length), and there is no way to change that, so you have two possible ways ahead of you: you can either permute the output of the embedding, or you can use a conv1d instead of your embedding layer (with in_channels = num_words, out_channels = word_embedding_size, and kernel_size = 1), which is slower than an embedding and not a good idea!
import torch
from torch import nn

batch_size, sentence_length, embedding_size = 1, 40, 100
word_embedding = nn.Embedding(10, embedding_size)  # vocabulary of 10 tokens, matching the randint below
convolution = nn.Conv1d(embedding_size, 128, kernel_size=3)  # 128 feature maps as an example

input = torch.randint(0, 10, (batch_size, sentence_length))
embeddings = word_embedding(input)  # (batch_size, sentence_length, embedding_size)
embeddings_permuted = embeddings.permute(0, 2, 1)  # (batch_size, embedding_size, sentence_length)
conv_out = convolution(embeddings_permuted)  # (batch_size, conv_out_channels, changed_sentence_length)
# now you can either use the output as it is or permute it back (based on your upper layers)
# also note that changed_sentence_length is a function of your padding and stride
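With the defaults (stride = 1, no padding), changed_sentence_length = sentence_length - kernel_size + 1, so 40 - 3 + 1 = 38 here.

For completeness, here is a minimal sketch of the second approach (the one I advised against), assuming num_words means the vocabulary size and that you one-hot encode the tokens first; the names vocab_size and embed_conv are just illustrative:

import torch
from torch import nn
import torch.nn.functional as F

vocab_size, embedding_size = 10, 100
tokens = torch.randint(0, vocab_size, (1, 40))  # (batch_size, sentence_length)
# one-hot encode and move the vocabulary axis into the channel position
one_hot = F.one_hot(tokens, vocab_size).float().permute(0, 2, 1)  # (batch, vocab_size, sentence_length)
embed_conv = nn.Conv1d(vocab_size, embedding_size, kernel_size=1)
embeddings = embed_conv(one_hot)  # (batch, embedding_size, sentence_length), already channel-first

The kernel_size = 1 conv is just a per-position linear projection of the one-hot vectors, so its output is already in the layout conv1d expects, but materializing the one-hot tensor makes it slower and more memory-hungry than nn.Embedding.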