deep-learning, caffe, pycaffe, matcaffe

Is it possible to use pretrained model after adding elementwise layers?


I am using a pre-trained model to which I want to add an Eltwise layer that multiplies the outputs of two layers: one is the output of a convolution layer of shape 1x1x256x256, and the other is the output of a convolution layer of shape 1x32x256x256. My question is: if we add an Eltwise layer to multiply these two outputs and feed the result to the next layer, do we have to train from scratch because the architecture is modified, or is it still possible to use the pre-trained model?

Thanks


Solution

  • Indeed, modifying the architecture can put the learned features at odds with the new topology.

    However, there's no reason not to reuse the learned weights for the layers below the change -- those layers are unaffected by the modification, so they can still benefit from the pretrained initialization.

    As for the rest of the layers, initializing from the trained weights should be no worse than random initialization, so why not use them?

    Don't forget to initialize any new layers with random weights (the default filler in caffe is zero, and this can stall learning).
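
    As a sketch, the prototxt for such a change might look like this (the blob names `conv_a`/`conv_b` and the new layer's parameters are placeholders, not taken from your actual model):

    ```protobuf
    # Element-wise product of the two convolution outputs
    layer {
      name: "eltwise_prod"
      type: "Eltwise"
      bottom: "conv_a"
      bottom: "conv_b"
      top: "prod"
      eltwise_param { operation: PROD }
    }

    # A new layer added after the change: give it an explicit
    # random weight_filler so it does not start at all-zeros
    layer {
      name: "new_conv"
      type: "Convolution"
      bottom: "prod"
      top: "new_conv"
      convolution_param {
        num_output: 32
        kernel_size: 1
        weight_filler { type: "xavier" }
        bias_filler { type: "constant" value: 0 }
      }
    }
    ```

    One caveat: as far as I recall, Caffe's Eltwise layer expects both bottom blobs to have identical shapes, so the 1x1x256x256 blob may need to be tiled to 32 channels (or a Scale layer used instead) before the product. When you load the old `.caffemodel` into the new net, layers whose names match the pretrained model keep their weights, and renamed/new layers fall back to their fillers.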