deep-learning, conv-neural-network, batch-normalization

Why is the normalization layer that comes right after the convolutional layer not shown when viewing the architecture in Netron?


I am testing some changes to a convolutional neural network architecture. First I added a BatchNorm layer right after the conv layer, followed by the activation layer. Then I swapped the activation layer with the BatchNorm layer.

# example of conv -> batch norm -> activation
...
nn.Conv2d(out_, out_, kernel_size=(1, 3), padding=(0, 1), bias=False),
nn.BatchNorm2d(out_),
nn.LeakyReLU(),
...

# example of conv -> activation -> batch norm
...
nn.Conv2d(out_, out_, kernel_size=(1, 3), padding=(0, 1), bias=False),
nn.LeakyReLU(),
nn.BatchNorm2d(out_),
...

I have noticed that in Netron (an app for visualizing NN architectures) there is NO batch norm layer in the architecture with BatchNorm right after the conv, but there is one in the other architecture with BatchNorm after the activation. So my question is: does the normalization layer after the convolution have some special meaning, or is there something wrong with Netron?


Solution

  • The fact that you cannot see Batch Norm when it follows a convolution operation has to do with Batch Norm folding: a convolution followed by Batch Norm can be replaced by a single convolution with adjusted weights and bias, so the exporter merges the two layers into one node (a small folding sketch is included at the end of this answer).

    I assume that for the Netron visualization you first convert the model to ONNX format.

    In PyTorch, the Batch Norm folding optimization is performed by default by torch.onnx.export, which exports in eval mode unless told otherwise. You can disable this behavior by exporting to ONNX in training mode:

    torch.onnx.export(..., training=torch.onnx.TrainingMode.TRAINING, ...)

    Then you should be able to see all Batch Norm layers in your graph.
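
    To illustrate what the folding does, here is a minimal sketch (the channel count, input shape, and variable names are my own illustration, not part of the original answer). It fuses a Conv2d followed by BatchNorm2d into a single Conv2d with adjusted weights and bias, and checks that both give the same output in eval mode:

    # illustrative Batch Norm folding, assuming conv -> batch norm with no conv bias
    import torch
    import torch.nn as nn

    conv = nn.Conv2d(8, 8, kernel_size=(1, 3), padding=(0, 1), bias=False)
    bn = nn.BatchNorm2d(8)
    # use non-trivial BN running statistics so the check is meaningful
    bn.running_mean.uniform_(-1.0, 1.0)
    bn.running_var.uniform_(0.5, 1.5)
    conv.eval(); bn.eval()

    # per-channel scale that BN applies in eval mode
    scale = bn.weight / torch.sqrt(bn.running_var + bn.eps)

    fused = nn.Conv2d(8, 8, kernel_size=(1, 3), padding=(0, 1), bias=True)
    with torch.no_grad():
        fused.weight.copy_(conv.weight * scale.reshape(-1, 1, 1, 1))
        fused.bias.copy_(bn.bias - bn.running_mean * scale)

    x = torch.randn(1, 8, 4, 16)
    print(torch.allclose(bn(conv(x)), fused(x), atol=1e-6))  # True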
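
    And for completeness, a small export sketch under assumed names (the model, dummy input shape, and output file names are hypothetical): the default eval-mode export folds Batch Norm into the preceding convolution, while a training-mode export keeps the layers separate, so they show up as individual nodes in Netron:

    import torch

    dummy = torch.randn(1, 8, 4, 16)  # assumed input shape

    # default export (eval mode): conv + batch norm are folded into one Conv node
    torch.onnx.export(model, dummy, "model_eval.onnx")

    # training-mode export: Batch Norm nodes are preserved in the graph
    torch.onnx.export(model, dummy, "model_train.onnx",
                      training=torch.onnx.TrainingMode.TRAINING)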