Tags: machine-learning, pytorch, neural-network, conv-neural-network

How to properly use AdaptiveAvgPool for global pooling?


I am trying to convert a model that uses Flatten/Linear as its final layers to use global pooling with AdaptiveAvgPool1d/Linear instead. The output shape of the Linear layer after the global pooling breaks the training loop. I get the following error:

ValueError: operands could not be broadcast together with shapes (64,4) (64,) 

Model with Flatten-->Linear (works)

conv1d --> relu --> maxpool1d --> Flatten --> Linear:

model = nn.Sequential(
    nn.Conv1d(in_channels=1, out_channels=32, kernel_size=128, stride=16, padding=1),
    nn.ReLU(),
    nn.MaxPool1d(kernel_size=2, stride=2),
    nn.Flatten(),
    nn.LazyLinear(n_classes)
)
==========================================================================================
Layer (type:depth-idx)                   Output Shape              Param #
==========================================================================================
Sequential                               --                        --
├─Conv1d: 1-1                            [64, 32, 505]             4,128
├─ReLU: 1-2                              [64, 32, 505]             --
├─MaxPool1d: 1-3                         [64, 32, 252]             --
├─Flatten: 1-4                           [64, 8064]                --
├─Linear: 1-5                            [64, 4]                   32,260
==========================================================================================

Model with AdaptiveAvgPool1d-->Linear (output dimension wrong)

I want the output of this implementation to match that of the previous one, where the shape coming out of the Linear layer is [64, 4]. Instead, the Linear layer outputs [64, 32, 4]:

model = nn.Sequential(
    nn.Conv1d(in_channels=1, out_channels=32, kernel_size=128, stride=16, padding=1),
    nn.ReLU(),
    nn.MaxPool1d(kernel_size=2, stride=2),
    nn.AdaptiveAvgPool1d(1),
    nn.LazyLinear(n_classes)
)
==========================================================================================
Layer (type:depth-idx)                   Output Shape              Param #
==========================================================================================
Sequential                               --                        --
├─Conv1d: 1-1                            [64, 32, 505]             4,128
├─ReLU: 1-2                              [64, 32, 505]             --
├─MaxPool1d: 1-3                         [64, 32, 252]             --
├─AdaptiveAvgPool1d: 1-4                 [64, 32, 1]               --
├─Linear: 1-5                            [64, 32, 4]               8
==========================================================================================



Solution

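  • You can't replace nn.Flatten with nn.AdaptiveAvgPool1d because they don't do the same thing. nn.AdaptiveAvgPool1d(1) reduces the length dimension to 1 but keeps it, so its output is [64, 32, 1]. nn.Linear acts only on the last dimension, so it sees a single input feature and returns [64, 32, 4] (hence the 8 parameters in the summary). You still need to add nn.Flatten() after nn.AdaptiveAvgPool1d so that the Linear layer receives [64, 32] and outputs [64, 4], the same output shape as the original model.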
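
A minimal sketch of the corrected model, for checking the shapes. The question does not state the input length or n_classes, so the values below are assumptions: n_classes = 4 is inferred from the [64, 4] output, and an input length of 8192 is one value that reproduces the [64, 32, 505] Conv1d output in the summaries. The build_model helper is a hypothetical name used only for this example.

```python
import torch
from torch import nn

N_CLASSES = 4        # assumption: inferred from the [64, 4] output in the question
INPUT_LENGTH = 8192  # assumption: one length that yields 505 steps after Conv1d


def build_model(n_classes: int) -> nn.Sequential:
    """Global-average-pooling variant of the model, with Flatten restored."""
    return nn.Sequential(
        nn.Conv1d(in_channels=1, out_channels=32, kernel_size=128, stride=16, padding=1),
        nn.ReLU(),
        nn.MaxPool1d(kernel_size=2, stride=2),
        nn.AdaptiveAvgPool1d(1),   # [N, 32, L] -> [N, 32, 1]
        nn.Flatten(),              # [N, 32, 1] -> [N, 32]
        nn.LazyLinear(n_classes),  # [N, 32]    -> [N, n_classes]
    )


model = build_model(N_CLASSES)
x = torch.randn(64, 1, INPUT_LENGTH)  # dummy batch: 64 samples, 1 channel
out = model(x)                        # first forward pass initializes LazyLinear
print(out.shape)
```

With these assumed sizes, the Linear layer should end up with 32 input features (one per channel) instead of 8064, so the output is two-dimensional again and lines up with a label vector of shape (64,).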