Tags: deep-learning, caffe, backpropagation, pycaffe

Weight update of Siamese networks in Caffe


By following this tutorial,

http://caffe.berkeleyvision.org/gathered/examples/siamese.html

I can build a Siamese network in Caffe that shares the weights of each layer between its two branches.

But I was wondering how the Siamese network in Caffe updates its shared weights. To be specific, if we have

input1 -> conv1(shared) -> output1

input2 -> conv1(shared) -> output2 ===> contrastive loss (from output1 and output2),

then does Caffe just sum up the two gradients for conv1 from the first and second networks?
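
For concreteness, here is a rough pycaffe NetSpec sketch of the setup I mean (the DummyData layers and the blob/parameter names such as data1, conv1_w and feat are placeholders I made up; the sharing comes from giving both branches the same parameter blob names, as in the linked example):

import caffe
from caffe import layers as L

n = caffe.NetSpec()

# Stand-in inputs: two image branches plus a binary similarity label.
n.data1 = L.DummyData(shape=dict(dim=[64, 1, 28, 28]))
n.data2 = L.DummyData(shape=dict(dim=[64, 1, 28, 28]))
n.sim = L.DummyData(shape=dict(dim=[64]))

# Using the same parameter blob names in both branches is what shares the
# weights; the lr_mult values have to be identical in both declarations.
conv_params = [dict(name='conv1_w', lr_mult=1), dict(name='conv1_b', lr_mult=2)]
feat_params = [dict(name='feat_w', lr_mult=1), dict(name='feat_b', lr_mult=2)]

n.conv1 = L.Convolution(n.data1, kernel_size=5, num_output=20,
                        weight_filler=dict(type='xavier'), param=conv_params)
n.conv1_p = L.Convolution(n.data2, kernel_size=5, num_output=20,
                          weight_filler=dict(type='xavier'), param=conv_params)

# ContrastiveLoss expects flat feature vectors, so each branch ends in a
# (shared) InnerProduct layer, as in the MNIST siamese example.
n.feat = L.InnerProduct(n.conv1, num_output=2,
                        weight_filler=dict(type='xavier'), param=feat_params)
n.feat_p = L.InnerProduct(n.conv1_p, num_output=2,
                          weight_filler=dict(type='xavier'), param=feat_params)

n.loss = L.ContrastiveLoss(n.feat, n.feat_p, n.sim)

print(n.to_proto())  # the generated prototxt lists conv1_w/conv1_b in both branches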

Thanks for your response in advance.


Solution

  • You are correct: the diffs (gradients) of shared weights (all parameters with the same name) are accumulated. Note that you cannot use different learning-rate multipliers (lr_mult) for shared weights. Other features such as momentum and weight decay should work as expected. A small numerical sketch of the accumulation follows below.
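
To make the accumulation concrete, here is a toy NumPy-only sketch (no Caffe involved; a tiny dense layer stands in for conv1 and the loss is the similar-pair term of a contrastive loss, 0.5 * ||output1 - output2||^2). The gradient applied to the shared weights is the sum of the contributions flowing back through the two branches, which a finite-difference check confirms:

import numpy as np

# Toy illustration of the accumulation (not Caffe code): one weight matrix W
# is used by both branches, and the total dL/dW is the sum of the per-branch
# gradient contributions.
rng = np.random.default_rng(0)
W = rng.standard_normal((3, 5))   # shared weights (tiny dense layer standing in for conv1)
x1 = rng.standard_normal(5)       # input1
x2 = rng.standard_normal(5)       # input2

o1, o2 = W @ x1, W @ x2           # output1, output2 from the shared layer
d = o1 - o2
loss = 0.5 * np.dot(d, d)         # similar-pair term of a contrastive loss

# Backprop: dL/d(output1) = d, dL/d(output2) = -d
grad_branch1 = np.outer(d, x1)    # gradient contribution through branch 1
grad_branch2 = np.outer(-d, x2)   # gradient contribution through branch 2
grad_shared = grad_branch1 + grad_branch2   # what the solver sees for the shared blob

# Finite-difference check that the summed gradient is the true dL/dW.
eps = 1e-6
numeric = np.zeros_like(W)
for i in range(W.shape[0]):
    for j in range(W.shape[1]):
        Wp = W.copy()
        Wp[i, j] += eps
        dp = Wp @ x1 - Wp @ x2
        numeric[i, j] = (0.5 * np.dot(dp, dp) - loss) / eps

print(np.allclose(grad_shared, numeric, atol=1e-4))   # True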