I am programming a PyTorch model which includes the following:
self.basis_mat = torch.nn.Linear(
    in_features=self.basis_mat_params[0],
    out_features=self.basis_mat_params[1],
    bias=self.basis_mat_params[2]).to(device)
Now what I want to do is dynamically increment the out_features of self.basis_mat. However, I also want to preserve any previously trained parameters. In other words, as out_features is incremented, any new parameters should be randomly initialized while the rest are left unchanged.
So, what should I do?
I didn't really know what to try. I checked the documentation, but it turns out that in torch.nn.Linear the parameter out_features cannot be changed after construction, and if I instantiate a new torch.nn.Linear (which I guess would be quite slow, so even if something like that works I would only use it as a last resort), its parameters are freshly randomly initialized, with no way to carry the old ones over.
You can do the following:
from torch import nn
import torch

dim1 = 3
dim2 = 4
dim3 = 5

# Replace these with the older layer's parameters, i.e.
# older_linear_layer.weight.detach() and older_linear_layer.bias.detach().
# Note that nn.Linear stores its weight with shape (out_features, in_features).
older_weight_parameters = torch.zeros(dim2, dim1)
older_bias_parameters = torch.zeros(dim2)

linear_layer = nn.Linear(dim1, dim2 + dim3)

# Copy the older parameter values into the corresponding slots;
# the remaining dim3 output rows keep their random initialization.
with torch.no_grad():
    linear_layer.weight[:dim2, :] = older_weight_parameters
    linear_layer.bias[:dim2] = older_bias_parameters
The linear layer will then have the following weight and bias values. You can see that the older parameter values (zeros in this example) have replaced the random initialization in the corresponding places:
linear_layer.weight, linear_layer.bias
(Parameter containing:
tensor([[ 0.0000, 0.0000, 0.0000],
[ 0.0000, 0.0000, 0.0000],
[ 0.0000, 0.0000, 0.0000],
[ 0.0000, 0.0000, 0.0000],
[ 0.3455, -0.3385, 0.4746],
[-0.2067, -0.4950, -0.3668],
[-0.4012, -0.3951, -0.3772],
[-0.1393, 0.3602, -0.0460],
[-0.4641, 0.2152, 0.4031]], requires_grad=True),
Parameter containing:
tensor([ 0.0000, 0.0000, 0.0000, 0.0000, 0.0502, -0.3423, 0.3871, -0.5218,
        -0.4322], requires_grad=True))
The weight and bias parameters of a Linear layer are tensors like any other, so you can apply the usual tensor manipulations to them.
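If you need to grow a layer repeatedly during training, the copy above can be wrapped in a small helper. This is just a sketch, not a PyTorch API: the function name grow_linear and its signature are my own. It builds a new, wider nn.Linear and copies the trained rows into it under torch.no_grad().

```python
import torch
from torch import nn

def grow_linear(old_layer: nn.Linear, extra_out: int) -> nn.Linear:
    """Return a new Linear with `extra_out` more output features,
    keeping the old layer's trained weight rows and bias entries."""
    new_layer = nn.Linear(old_layer.in_features,
                          old_layer.out_features + extra_out,
                          bias=old_layer.bias is not None)
    with torch.no_grad():
        # Old rows keep their trained values; the new rows keep
        # the random initialization they got in the constructor.
        new_layer.weight[:old_layer.out_features] = old_layer.weight
        if old_layer.bias is not None:
            new_layer.bias[:old_layer.out_features] = old_layer.bias
    return new_layer

# Usage: widen a 3 -> 4 layer into a 3 -> 9 layer
layer = nn.Linear(3, 4)
wider = grow_linear(layer, 5)
print(wider.weight.shape)  # torch.Size([9, 3])
```

One caveat: if the old layer's parameters were registered with an optimizer, you would need to hand the new layer's parameters to the optimizer as well, since creating a new nn.Linear creates new Parameter objects.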