How to apply Optimizer on Variable in Chainer?


Here is an example in PyTorch:

optimizer = optim.Adam([modifier_var], lr=0.0005)

And here is one in TensorFlow:

self.train = self.optimizer.minimize(self.loss, var_list=[self.modifier])

But Chainer's optimizers can only be used on a Link. How can I apply an Optimizer to a Variable in Chainer?


Solution

  • In short, there is no way to directly assign a chainer.Variable (or even a chainer.Parameter) to a chainer.Optimizer; the usual workaround is to register a Link that holds the Parameter (see the sketch just below).
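
    A minimal sketch of that workaround, assuming Chainer v4 (the Link name Modifier and its shape are just for illustration): wrap the array you want to optimize in a chainer.Parameter inside a chainer.Link, set the optimizer up on that Link, and update as usual.

    import numpy as np
    import chainer
    import chainer.functions as F
    from chainer import optimizers

    class Modifier(chainer.Link):
        def __init__(self, shape):
            super(Modifier, self).__init__()
            with self.init_scope():
                # the tensor we actually want to optimize
                self.modifier = chainer.Parameter(np.zeros(shape, dtype=np.float32))

    link = Modifier((3,))
    optimizer = optimizers.Adam(alpha=0.0005)  # Chainer's Adam uses alpha instead of lr
    optimizer.setup(link)                      # optimizers are set up on a Link

    loss = F.sum(link.modifier * link.modifier)  # any loss computed from the Parameter
    link.cleargrads()
    loss.backward()
    optimizer.update()                         # updates link.modifier in place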

    The following is a more detailed explanation.

    First, let me define Variable and Parameter to avoid confusion.

    Variable is (1) torch.Tensor in PyTorch v0.4, (2) torch.autograd.Variable in PyTorch v0.3, and (3) chainer.Variable in Chainer v4.
    A Variable is an object that holds two tensors, .data and .grad. That is both necessary and sufficient, so a Variable is not necessarily a learnable parameter, i.e. not necessarily a target of the optimizer.
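
    For example (a small illustration, assuming Chainer v4 with NumPy arrays):

    import numpy as np
    import chainer
    import chainer.functions as F

    x = chainer.Variable(np.array([1.0, 2.0], dtype=np.float32))
    y = F.sum(x * x)
    y.backward()
    print(x.data)  # the value tensor: [1. 2.]
    print(x.grad)  # the gradient tensor filled by backward(): [2. 4.]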

    In both libraries there is another class, Parameter, which is similar to but not the same as Variable: torch.nn.Parameter in PyTorch and chainer.Parameter in Chainer.
    A Parameter must be a learnable parameter and should be optimized.
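
    A quick way to see the relationship in Chainer (assuming Chainer v4): a Parameter is itself a Variable, with extra machinery for being updated.

    import numpy as np
    import chainer

    p = chainer.Parameter(np.zeros(3, dtype=np.float32))
    print(isinstance(p, chainer.Variable))  # True: Parameter is a subclass of Variable
    print(p.update_rule)                    # None until an optimizer attaches an UpdateRule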

    Therefore, there should be no case where you register a Variable (as opposed to a Parameter) with an Optimizer (PyTorch does allow registering a Variable with an Optimizer, but only for backward compatibility).

    Second, in PyTorch torch.optim.Optimizer optimizes Parameters directly, but in Chainer chainer.Optimizer DOES NOT optimize Parameters itself: chainer.UpdateRule does. The Optimizer just registers an UpdateRule with each Parameter in a Link.
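
    You can see this delegation after setup(); a sketch assuming Chainer v4, using a standard chainer.links.Linear as the Link:

    import chainer.links as L
    from chainer import optimizers

    link = L.Linear(3, 2)                     # any Link that owns Parameters (here W and b)
    optimizer = optimizers.Adam(alpha=0.001)
    optimizer.setup(link)                     # attaches an UpdateRule to every Parameter
    for name, param in link.namedparams():
        print(name, type(param.update_rule))  # e.g. /W <class 'chainer.optimizers.adam.AdamRule'>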

    Therefore, it is only natural that chainer.Optimizer does not accept a Parameter as an argument: it is just a "delivery man" for UpdateRules.

    If you want to attach a different UpdateRule to each Parameter, create an instance of an UpdateRule subclass directly and attach it to the Parameter, as sketched below.
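
    A sketch of that, assuming Chainer v4, where SGDRule and AdamRule are the UpdateRule subclasses shipped with the corresponding optimizers:

    import chainer.links as L
    from chainer import optimizers
    from chainer.optimizers.sgd import SGDRule

    link = L.Linear(3, 2)
    optimizer = optimizers.Adam(alpha=0.001)
    optimizer.setup(link)                  # every Parameter gets an AdamRule by default
    link.b.update_rule = SGDRule(lr=0.01)  # replace the rule for one Parameter
    # alternatively, keep the rule and tweak its per-parameter hyperparameters:
    link.W.update_rule.hyperparam.alpha = 0.0001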