Consider the following code:
from torch import nn
from torchsummary import summary
from torchvision import models
model = models.efficientnet_b7(pretrained=True)
model.classifier[-1].out_features = 4  # because I have a 4-class problem; initially the output is 1000 classes
model.classifier = nn.Sequential(*model.classifier, nn.Softmax(dim=1)) # add softmax
# freeze features
for child in model.features:
    for param in child.parameters():
        param.requires_grad = False
When I run
model.classifier
I get the below (expected) output
which, as per my calculations, implies that the total number of trainable parameters should be (2560 + 1) * 4 output nodes = 10,244 trainable params.
However, when I attempt to calculate the total number of trainable params by
summary(model, (3,128,128))
I get
and by
sum(p.numel() for p in model.parameters() if p.requires_grad)
I also get 2,561,000. The 2,561,000, in both cases, comes from (2560 + 1) * 1000 classes.
But why does it still consider 1000 classes?
Resetting an attribute of an initialized layer does not necessarily re-initialize it with the newly-set attribute. Setting out_features = 4 only changes the stored attribute; the layer's weight and bias tensors still have shapes (1000, 2560) and (1000,), which is where the 2,561,000 trainable parameters come from. What you need is to replace the layer itself: model.classifier[-1] = nn.Linear(2560, 4).
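A minimal sketch of the fix, keeping the rest of your snippet unchanged (the 2560 input features and the 10,244 count follow from the numbers above; pretrained=True is kept from the question, though newer torchvision versions prefer the weights= argument):

from torch import nn
from torchvision import models

model = models.efficientnet_b7(pretrained=True)

# Setting the attribute alone leaves the old (1000, 2560) weight in place:
model.classifier[-1].out_features = 4
print(model.classifier[-1].weight.shape)  # torch.Size([1000, 2560]) -- still 1000 classes

# Replace the layer instead, so weight and bias are re-created as (4, 2560) and (4,):
model.classifier[-1] = nn.Linear(2560, 4)

# Then add softmax and freeze the features as in the original snippet:
model.classifier = nn.Sequential(*model.classifier, nn.Softmax(dim=1))
for param in model.features.parameters():
    param.requires_grad = False

print(sum(p.numel() for p in model.parameters() if p.requires_grad))  # 10244 = (2560 + 1) * 4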