Introduction
According to the Lasagne docs: "This layer should be inserted between a linear transformation (such as a DenseLayer, or Conv2DLayer) and its nonlinearity. The convenience function batch_norm() modifies an existing layer to insert batch normalization in front of its nonlinearity."
However, Lasagne also provides the utility function lasagne.layers.batch_norm. Due to the implementation on my end, I can't use that function.
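For reference, batch_norm is roughly equivalent to the following (a simplified sketch based on the Lasagne source linked under References, which may help when adapting it manually):

import lasagne
from lasagne.layers import BatchNormLayer, NonlinearityLayer

def batch_norm_sketch(layer, **kwargs):
    # Detach the layer's nonlinearity so normalization happens before it.
    nonlinearity = getattr(layer, 'nonlinearity', None)
    if nonlinearity is not None:
        layer.nonlinearity = lasagne.nonlinearities.identity
    # Drop the layer's bias: BatchNormLayer's beta makes it redundant.
    if hasattr(layer, 'b') and layer.b is not None:
        del layer.params[layer.b]
        layer.b = None
    layer = BatchNormLayer(layer, **kwargs)
    # Re-apply the original nonlinearity on top of the normalized output.
    if nonlinearity is not None:
        layer = NonlinearityLayer(layer, nonlinearity)
    return layer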
My question is: how and where should I add the BatchNormLayer?
class lasagne.layers.BatchNormLayer(incoming, axes='auto', epsilon=1e-4, alpha=0.1, beta=lasagne.init.Constant(0), gamma=lasagne.init.Constant(1), mean=lasagne.init.Constant(0), inv_std=lasagne.init.Constant(1), **kwargs)
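For a 4D convolutional input of shape (batch, channels, rows, cols), the default axes='auto' normalizes over all axes except the second (the channels), i.e.:

# axes='auto' on a (batch, channels, rows, cols) tensor is equivalent to:
network = BatchNormLayer(network, axes=(0, 2, 3))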
Can I add it after a convolution layer? Or should I add it after the max-pooling layer? Do I have to manually remove the bias of the preceding layers?
Approach used
So far I have only used it like this, directly after the input layer (the snippet returned from a function, so it is shown wrapped in an illustrative build_network helper):
import lasagne
import theano
import theano.tensor as T
from lasagne.layers import BatchNormLayer

def build_network(height, width):
    input_var = T.tensor4('inputs')
    target_var = T.fmatrix('targets')
    network = lasagne.layers.InputLayer(shape=(None, 1, height, width),
                                        input_var=input_var)
    # BatchNormLayer inserted directly after the input; every argument
    # below is the default, so BatchNormLayer(network) is equivalent.
    network = BatchNormLayer(network,
                             axes='auto',
                             epsilon=1e-4,
                             alpha=0.1,
                             beta=lasagne.init.Constant(0),
                             gamma=lasagne.init.Constant(1),
                             mean=lasagne.init.Constant(0),
                             inv_std=lasagne.init.Constant(1))
    network = lasagne.layers.Conv2DLayer(
        network, num_filters=60, filter_size=(3, 3), stride=1, pad=2,
        nonlinearity=lasagne.nonlinearities.rectify,
        W=lasagne.init.GlorotUniform())
    network = lasagne.layers.Conv2DLayer(
        network, num_filters=60, filter_size=(3, 3), stride=1, pad=1,
        nonlinearity=lasagne.nonlinearities.rectify,
        W=lasagne.init.GlorotUniform())
    network = lasagne.layers.MaxPool2DLayer(
        incoming=network, pool_size=(2, 2), stride=None, pad=(0, 0),
        ignore_border=True)
    network = lasagne.layers.DenseLayer(
        lasagne.layers.dropout(network, p=0.5),
        num_units=32,
        nonlinearity=lasagne.nonlinearities.rectify)
    network = lasagne.layers.DenseLayer(
        lasagne.layers.dropout(network, p=0.5),
        num_units=1,
        nonlinearity=lasagne.nonlinearities.sigmoid)
    return network, input_var, target_var
References:
https://github.com/Lasagne/Lasagne/blob/master/lasagne/layers/normalization.py#L120-L320
http://lasagne.readthedocs.io/en/latest/modules/layers/normalization.html
Answer
If you are not using batch_norm, do remove the bias manually, as it is redundant. Please test the code below and let us know if it works for what you are trying to accomplish. If it does not work, you can try adapting the batch_norm source code linked above.
import lasagne
import theano
import theano.tensor as T
from lasagne.layers import batch_norm

height, width = 28, 28  # example input dimensions; substitute your own

input_var = T.tensor4('inputs')
target_var = T.fmatrix('targets')
network = lasagne.layers.InputLayer(shape=(None, 1, height, width),
                                    input_var=input_var)
# batch_norm() removes the wrapped layer's now-redundant bias and inserts
# a BatchNormLayer between the linear transformation and its nonlinearity.
network = lasagne.layers.Conv2DLayer(
    network, num_filters=60, filter_size=(3, 3), stride=1, pad=2,
    nonlinearity=lasagne.nonlinearities.rectify,
    W=lasagne.init.GlorotUniform())
network = batch_norm(network)
network = lasagne.layers.Conv2DLayer(
    network, num_filters=60, filter_size=(3, 3), stride=1, pad=1,
    nonlinearity=lasagne.nonlinearities.rectify,
    W=lasagne.init.GlorotUniform())
network = batch_norm(network)
network = lasagne.layers.MaxPool2DLayer(
    incoming=network, pool_size=(2, 2), stride=None, pad=(0, 0),
    ignore_border=True)
network = lasagne.layers.DenseLayer(
    lasagne.layers.dropout(network, p=0.5),
    num_units=32,
    nonlinearity=lasagne.nonlinearities.rectify)
network = batch_norm(network)
network = lasagne.layers.DenseLayer(
    lasagne.layers.dropout(network, p=0.5),
    num_units=1,
    nonlinearity=lasagne.nonlinearities.sigmoid)
network = batch_norm(network)
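If you cannot call batch_norm itself, the manual equivalent of one convolutional block looks like this (a sketch: b=None drops the now-redundant bias, and the nonlinearity moves behind the BatchNormLayer):

from lasagne.layers import BatchNormLayer, NonlinearityLayer

# Convolution without bias and without a nonlinearity ...
network = lasagne.layers.Conv2DLayer(
    network, num_filters=60, filter_size=(3, 3), stride=1, pad=1,
    nonlinearity=None, b=None,
    W=lasagne.init.GlorotUniform())
# ... then batch normalization ...
network = BatchNormLayer(network)
# ... and finally the nonlinearity on top.
network = NonlinearityLayer(network, lasagne.nonlinearities.rectify)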
When getting the params to create the graph for your update method, remember to set trainable to True:

params = lasagne.layers.get_all_params(network, trainable=True)
updates = lasagne.updates.adadelta(loss, params)  # 'loss' is your scalar loss expression; see the sketch below
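For completeness, here is a minimal sketch of how the loss and compiled functions could be wired up (the binary_crossentropy loss is an assumption matching the sigmoid output; the key point is deterministic=True at evaluation time, so dropout is disabled and BatchNormLayer uses its learned statistics rather than batch statistics):

# Training pass: dropout active, BN normalizes with per-batch statistics
prediction = lasagne.layers.get_output(network, deterministic=False)
loss = lasagne.objectives.binary_crossentropy(prediction, target_var).mean()

# Trainable params only (excludes BN's running mean/inv_std statistics)
params = lasagne.layers.get_all_params(network, trainable=True)
updates = lasagne.updates.adadelta(loss, params)
train_fn = theano.function([input_var, target_var], loss, updates=updates)

# Evaluation pass: dropout off, BN uses its stored running averages
test_prediction = lasagne.layers.get_output(network, deterministic=True)
predict_fn = theano.function([input_var], test_prediction)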