
How to effectively share parameters between two PyTorch models


I want to enforce symmetry in my model, i.e. f(x,y) = f(y,x), and would like to achieve it by defining two models model1(x,y) and model2(y,x) that share the same architecture and parameters. I could share their parameters by

for name, param1 in model1.named_parameters():
    if name in model2.state_dict() and param1.shape == model2.state_dict()[name].shape:
        model2.state_dict()[name].data = param1.data

Does it make sense to share parameters this way? Do I have to define two optimizers, one for each model? Does

optimizer = optim.Adam(list(model1.parameters()) + list(model2.parameters()))  # Combine parameters

make sense since the models share parameters, assuming one of my losses is

criterion(model1(x,y), model2(y,x))

I would appreciate a minimal working example, assuming my models have dynamic layer configurations.


Solution

  • What is not clear in your question is whether you want symmetry on one model ("enforce symmetry in my model") or between two (implied when you introduce model1 and model2).

    If you want to enforce symmetry on a single model, you can minimize some distance between model(x,y) and model(y,x). In that case you only need one model and one optimizer, and you enforce that property on model rather than on two distinct models (see the sketch after this list).

    Besides, I don't clearly see what you would gain by keeping model1 and model2 distinct: if model1 and model2 are supposed to be "symmetric", then model2 defined as x, y: model1(y, x) would suffice too, no? Maybe there is more to it and you indeed want two models?
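
For the single-model route, here is a minimal sketch. Everything in it is illustrative: SymmetricNet, its hidden_sizes argument (standing in for your dynamic layer configurations), the random data, the toy target and the symmetry weight lambda_sym are my assumptions, not something taken from your code.

import torch
import torch.nn as nn
import torch.optim as optim

class SymmetricNet(nn.Module):
    def __init__(self, hidden_sizes=(32, 32)):
        super().__init__()
        layers, in_features = [], 2  # inputs are the two scalars x and y
        for h in hidden_sizes:       # dynamic layer configuration
            layers += [nn.Linear(in_features, h), nn.ReLU()]
            in_features = h
        layers.append(nn.Linear(in_features, 1))
        self.net = nn.Sequential(*layers)

    def forward(self, x, y):
        # stack x and y into a (batch, 2) tensor and run the MLP
        return self.net(torch.stack([x, y], dim=-1))

model = SymmetricNet(hidden_sizes=(64, 32))   # one model, one optimizer
optimizer = optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.MSELoss()
lambda_sym = 1.0                              # weight of the symmetry penalty (assumption)

x = torch.randn(128)
y = torch.randn(128)
target = (x * y).unsqueeze(-1)                # any symmetric toy target

for step in range(100):
    optimizer.zero_grad()
    out_xy = model(x, y)
    out_yx = model(y, x)
    # data-fitting term + penalty pulling model(x,y) towards model(y,x)
    loss = criterion(out_xy, target) + lambda_sym * criterion(out_xy, out_yx)
    loss.backward()
    optimizer.step()

If you really do want a second callable, model2 = lambda x, y: model(y, x) gives you the symmetric counterpart without duplicating any parameters, and the single optimizer over model.parameters() already covers everything.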