I have data with the shape (512, 20, 32) and I want to apply average pooling to get the shape (512, 10, 32).
I have tried the following, with no success:
import torch
import torch.nn as nn

pool = nn.AvgPool1d(kernel_size=2, stride=2)
data = torch.rand(512, 20, 32)
out = pool(data)
print(out.shape)
Output:
torch.Size([512, 20, 16])
How can I run the pooling along the other dimension (the one of size 20)?
In PyTorch, nn.AvgPool1d interprets a 3D input as batch-channel-length. The pooling layer operates on the length dimension (the last one), which is why your size-32 dimension was halved instead of the size-20 one.
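To make the layout concrete, here is a minimal sketch on a tiny hand-made tensor (the values are made up purely for illustration). It shows that the averaging happens over adjacent pairs along the last dimension, while the batch and channel dimensions are left alone:

```python
import torch
import torch.nn as nn

# Illustrative input: batch=1, channels=2, length=4
x = torch.tensor([[[1.0, 3.0, 5.0, 7.0],
                   [2.0, 4.0, 6.0, 8.0]]])

pool = nn.AvgPool1d(kernel_size=2, stride=2)
y = pool(x)

# Each channel (row) is pooled independently along its length:
# row 0: (1+3)/2 = 2, (5+7)/2 = 6
# row 1: (2+4)/2 = 3, (6+8)/2 = 7
print(y.shape)
print(y)
```

The length dimension goes from 4 to 2, and the channel count stays at 2.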
There are two workarounds I can think of:
You can change the order of the dimensions prior to pooling using transpose, then swap them back afterwards:
pool = nn.AvgPool1d(kernel_size=2, stride=2)
data = torch.rand(512, 20, 32)
data_permuted = data.transpose(1, 2)
out_permuted = pool(data_permuted)
out = out_permuted.transpose(1, 2)
print(out.shape)
Output:
torch.Size([512, 10, 32])
Alternatively, adding an additional "empty" channel dimension will turn your 3D input tensor into a 4D one, allowing nn.AvgPool2d to operate on the height dimension. This adding and removing of singleton dimensions can be done using unsqueeze and squeeze:
pool = nn.AvgPool2d(kernel_size=(2, 1), stride=(2, 1)) # pool only along height dimension
data = torch.rand(512, 20, 32)
out = pool(data.unsqueeze(dim=1)).squeeze(dim=1)
print(out.shape)
Output:
torch.Size([512, 10, 32])
I believe the second option is more intuitive and might be more efficient than the first one.
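If you want to convince yourself that the two workarounds really compute the same thing, a quick sanity check along these lines may help. This is only a sketch: the variable names and the reshape-and-mean reference are my own additions for comparison, not a third recommended method.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # fixed seed so the check is reproducible
data = torch.rand(512, 20, 32)

# Workaround 1: move the size-20 dimension last, pool, move it back
pool1d = nn.AvgPool1d(kernel_size=2, stride=2)
out_transpose = pool1d(data.transpose(1, 2)).transpose(1, 2)

# Workaround 2: add a singleton channel dimension and pool over height only
pool2d = nn.AvgPool2d(kernel_size=(2, 1), stride=(2, 1))
out_unsqueeze = pool2d(data.unsqueeze(dim=1)).squeeze(dim=1)

# Reference (illustrative): group adjacent pairs along dim 1 and average them
out_reference = data.reshape(512, 10, 2, 32).mean(dim=2)

same_shape = out_transpose.shape == out_unsqueeze.shape == (512, 10, 32)
agree = (torch.allclose(out_transpose, out_unsqueeze)
         and torch.allclose(out_transpose, out_reference))
print(same_shape, agree)
```

Both flags should come out True, so the choice between the two options comes down to readability and speed rather than the result.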