Tags: python, machine-learning, tensorflow, conv-neural-network, batch-normalization

Where to apply batch normalization on standard CNNs


I have the following architecture:

Conv1
Relu1
Pooling1
Conv2
Relu2
Pooling2
FullyConnect1
FullyConnect2

My question is, where do I apply batch normalization? And what would be the best function to do this in TensorFlow?


Solution

  • The original batch norm paper prescribes applying batch norm before the ReLU activation, but there is evidence that it is probably better to apply it after the activation. Here's a comment on the Keras GitHub by Francois Chollet:

    ... I can guarantee that recent code written by Christian [Szegedy] applies relu before BN. It is still occasionally a topic of debate, though.

    To your second question: in TensorFlow, you can use the high-level tf.layers.batch_normalization function or the low-level tf.nn.batch_normalization; a sketch of both follows below.
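
    For illustration, here is a minimal TF 1.x sketch of your architecture with batch norm inserted after each ReLU, per the discussion above. The filter counts, unit counts, and the is_training flag are placeholder choices, not taken from your question; to follow the original paper's placement instead, swap each tf.nn.relu / tf.layers.batch_normalization pair.

        import tensorflow as tf

        def cnn(images, is_training):
            # Conv1 -> Relu1 -> BN -> Pooling1
            net = tf.layers.conv2d(images, filters=32, kernel_size=3, padding="same")
            net = tf.nn.relu(net)
            net = tf.layers.batch_normalization(net, training=is_training)
            net = tf.layers.max_pooling2d(net, pool_size=2, strides=2)

            # Conv2 -> Relu2 -> BN -> Pooling2
            net = tf.layers.conv2d(net, filters=64, kernel_size=3, padding="same")
            net = tf.nn.relu(net)
            net = tf.layers.batch_normalization(net, training=is_training)
            net = tf.layers.max_pooling2d(net, pool_size=2, strides=2)

            # FullyConnect1 -> FullyConnect2
            net = tf.layers.flatten(net)
            net = tf.layers.dense(net, units=128, activation=tf.nn.relu)
            return tf.layers.dense(net, units=10)

        # tf.layers.batch_normalization keeps its moving-average updates in
        # tf.GraphKeys.UPDATE_OPS, so they must run alongside the train op:
        # update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
        # with tf.control_dependencies(update_ops):
        #     train_op = optimizer.minimize(loss)

    The low-level tf.nn.batch_normalization, by contrast, expects you to compute the batch statistics and manage the offset/scale variables (and any moving averages for inference) yourself, for example:

        # per-channel statistics over batch, height, width (NHWC layout)
        mean, variance = tf.nn.moments(net, axes=[0, 1, 2])
        beta = tf.Variable(tf.zeros([64]))   # learned offset; 64 = channels here
        gamma = tf.Variable(tf.ones([64]))   # learned scale
        net = tf.nn.batch_normalization(net, mean, variance, beta, gamma, 1e-3)

    Unless you need that level of control, the high-level function is the more convenient choice.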