I have Lasagne code and I want to create the same network using Caffe. I was able to convert the network architecture, but I need help translating the Lasagne hyperparameters. The hyperparameters in Lasagne look like:
import theano.tensor as T
import lasagne

lr = 1e-2            # learning rate
weight_decay = 1e-5  # L2 regularization strength
prediction = lasagne.layers.get_output(net['out'])
# mean squared error between predictions and targets
loss = T.mean(lasagne.objectives.squared_error(prediction, target_var))
# sum of squared weights over all trainable parameters in the network
weightsl2 = lasagne.regularization.regularize_network_params(net['out'], lasagne.regularization.l2)
loss += weight_decay * weightsl2
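In other words, assuming Lasagne's definitions of squared_error and l2, the objective I am minimizing is:

loss = mean((prediction - target)^2) + weight_decay * sum_w(w^2)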
How do I perform the L2 regularization part in Caffe? Do I have to add a regularization layer after each convolution/inner-product layer? The relevant parts of my solver.prototxt are below:
base_lr: 0.01
lr_policy: "fixed"
weight_decay: 0.00001
regularization_type: "L2"
stepsize: 300
gamma: 0.1
max_iter: 2000
momentum: 0.9
Also posted on http://datascience.stackexchange.com; waiting for answers.
It seems like you already got it right.
The weight_decay meta-parameter combined with regularization_type: "L2" in your solver.prototxt tells Caffe to use L2 regularization with weight_decay = 1e-5. You do not need to add any extra layer: the solver applies the decay to all learnable parameter blobs during the update step.
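For intuition, here is a minimal Python sketch of what the solver effectively does in one SGD update with momentum and L2 decay. The names w, grad, and v are illustrative, not Caffe's API; note also that conventions differ on whether the penalty is lambda*||w||^2 or half of it, so the two frameworks can differ by a constant factor in the effective decay:

import numpy as np

lr = 1e-2
momentum = 0.9
weight_decay = 1e-5

w = np.random.randn(10)     # one parameter blob
grad = np.random.randn(10)  # data-term gradient dLoss/dw from backprop
v = np.zeros_like(w)        # momentum history

# L2 weight decay: the penalty's gradient (weight_decay * w) is added
# to the data gradient before the momentum/learning-rate update.
v = momentum * v - lr * (grad + weight_decay * w)
w = w + v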
One more thing you might want to tweak is how much the regularization affects each parameter. You can set this per parameter blob in the net via
param { decay_mult: 1 }
For example, an "InnerProduct" layer with bias has two parameter blobs:
layer {
  type: "InnerProduct"
  name: "fc1"
  # bottom and top here
  inner_product_param {
    bias_term: true
    # ... other params
  }
  param { decay_mult: 1 }  # first blob (weights): use regularization
  param { decay_mult: 0 }  # second blob (bias): do not regularize
}
By default, decay_mult is set to 1, that is, all weights of the net are regularized equally. You can change this to regularize specific parameter blobs more or less.
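As a sketch of how the per-blob multiplier combines with the global solver setting (the dictionary below is purely illustrative, not Caffe internals), the effective decay for each blob is weight_decay * decay_mult:

weight_decay = 1e-5  # global value from solver.prototxt

# hypothetical parameter blobs of the "fc1" layer above
blobs = {
    "fc1 weights": 1,  # decay_mult: 1 -> regularized
    "fc1 bias":    0,  # decay_mult: 0 -> not regularized
}

for name, decay_mult in blobs.items():
    # effective L2 strength applied to this blob during the update
    print(name, "->", weight_decay * decay_mult)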