Tags: pytorch, linear-regression, gradient-descent

SGD Optimizer Custom Parameters


I am practicing with PyTorch and trying to implement a simple linear regression model.

I have initialized random x inputs, generated y from a known linear relationship plus noise, and created two randomly initialized parameters 'a' and 'b'.

import torch
torch.manual_seed(42)
x = torch.rand(100, 1)
y = 1 + 2 * x + .1 * torch.rand(100, 1)  # true bias 1, true weight 2, plus noise
a = torch.randn(1, requires_grad=True)   # candidate weight
b = torch.randn(1, requires_grad=True)   # candidate bias

I would like to use gradient descent to update a and b as the model trains. I am comparing my results against an sklearn linear regression model, but the two do not agree as I would expect.
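
For reference, the sklearn fit I am comparing against looks roughly like this (a minimal sketch, not my exact code):

from sklearn.linear_model import LinearRegression

skmodel = LinearRegression()
skmodel.fit(x.numpy(), y.numpy())
print(skmodel.coef_, skmodel.intercept_)  # close to 2 and 1 for this data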

Here is the code to create the model:

import torch.nn as nn
from torch.optim import SGD

# create a simple linear model

class LinearModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(1, 1)

    def forward(self, x):
        return self.linear(x)

# loss function
criterion = torch.nn.MSELoss()

SGDmodel = LinearModel()

I suspect my problem lies in how I create the optimizer:

sgd = torch.optim.SGD([a, b], lr=0.001, momentum=0.9, weight_decay=0.1) 

Can I simply pass my a and b parameters into torch.optim.SGD and have them update as the model is trained? Or do I need to embed them into model.parameters() in some other way?
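
One quick diagnostic, after running the training loop below: since a and b never appear in the model's forward pass, backward() never fills in their gradients, so the optimizer has nothing to apply to them:

print(a.grad)  # None
print(b.grad)  # None, and sgd.step() skips parameters whose grad is None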

Here is the rest of my code, if needed:

epochs = 1
inputs = x
targets = y

for epoch in range(epochs):
    
    sgd.zero_grad()
    yhat = SGDmodel(inputs)
    loss = criterion(yhat, targets)
    loss.backward()
    sgd.step()
    
    print(f'Loss: {loss}')

I am using the following to see what a and b are:

print(SGDmodel.linear.weight.grad)
print(SGDmodel.linear.bias.grad)

which prints:

tensor([[-3.1456]])
tensor([-5.4119])

Any help would be appreciated!


Solution

  • Can I simply pass my a and b parameters into torch.optim.SGD and have them update as the model is trained? Or do I need to embed them into model.parameters() in some other way?

    I think I figured it out!

    I added a and b as parameters in my model by wrapping them in torch.nn.Parameter().

    I assigned them to the linear layer's weight and bias, replacing the layer's default randomly initialized parameters:

    class LinearModel(nn.Module):
        def __init__(self):
            super().__init__()
            self.linear = nn.Linear(1, 1)
            # assign a and b as the layer's parameters, replacing its defaults
            self.linear.weight = torch.nn.Parameter(a.detach().reshape(1, 1))  # a is now the weight
            self.linear.bias = torch.nn.Parameter(b.detach())  # b is now the bias

        def forward(self, x):
            return self.linear(x)
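
    After re-creating the model, listing its parameters is a quick sanity check that the layer's weight and bias are now backed by a and b:

    SGDmodel = LinearModel()
    for name, p in SGDmodel.named_parameters():
        print(name, p.shape)
    # linear.weight torch.Size([1, 1])
    # linear.bias torch.Size([1])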
    

    Now when I create my optimizer from SGDmodel.parameters(), it will update the custom parameters I defined from a and b:

    sgd = torch.optim.SGD(SGDmodel.parameters(), lr=0.001, momentum=0.9, weight_decay=0.1)
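
    To verify, a longer run of the same training loop (same data and loss as above) should move the weight toward 2 and the bias toward 1; the weight_decay term will keep them slightly shy of the true values:

    for epoch in range(200):
        sgd.zero_grad()
        loss = criterion(SGDmodel(x), y)
        loss.backward()
        sgd.step()

    print(SGDmodel.linear.weight.item(), SGDmodel.linear.bias.item())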