pytorch neural-network regression gradient-descent

Why do I get a straight line from my neural network model when the actual distribution is a curve?


I am trying to fit a data set that resembles a Gaussian distribution with a simple neural network, but the fit is poor. The fitted result is always a straight line, and neither adjusting the learning rate nor increasing the number of epochs has any effect.

Blue is the data points and red is the model's output at the same X values.

I also tried using the same model to fit other functions, and those curves were fitted well.

So I don't know whether the problem lies in my training process, in my data, or in the model being too simple.

Here's my code:

import torch
import torch.nn.functional as F

class Net(torch.nn.Module):
    def __init__(self, n_feature, n_hidden1, n_hidden2, n_output):
        super(Net, self).__init__()
        self.hidden1 = torch.nn.Linear(n_feature, n_hidden1)   
        self.hidden2 = torch.nn.Linear(n_hidden1, n_hidden2)
        self.predict = torch.nn.Linear(n_hidden2, n_output)  

    def forward(self, x):
        x = F.relu(self.hidden1(x))      
        x = F.relu(self.hidden2(x))
        x = self.predict(x)             
        return x

net = Net(n_feature=1, n_hidden1=200,n_hidden2=100, n_output=1) 

optimizer = torch.optim.Adam(net.parameters(), lr=0.00005)
loss_func = torch.nn.MSELoss()

# X and Y hold the data set (not shown in the question); each is reshaped to a column
train_x = torch.tensor(X.reshape(-1,1),dtype=torch.float32)
train_y = torch.tensor(Y.reshape(-1,1),dtype=torch.float32)

for t in range(5000):
    prediction = net(train_x)   

    loss = loss_func(prediction, train_y)   

    optimizer.zero_grad()
    loss.backward()   
    optimizer.step()  

    if t % 5 == 0:
        print("Epoch{}, loss:{:.6f}".format(t, loss.item()))

After a few hundred epochs of training, the loss settles at a relatively large value and stops decreasing. Even when I turn up the learning rate, the loss value does not change.


Solution

  • Why the network doesn't train

    The primary issue is that your data are not remotely normalized (x ranges from ~3500 to ~6500 and f(x) from 0 to ~25000).

    Neural network framework default configurations (the weight initialization and optimizer settings provided by PyTorch / TensorFlow / etc.) are generally set assuming inputs and outputs with a range of around [-1, 1] ([0, 1] works fine, [-2, 2] works fine, but [3500, 6500] will work poorly).
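One way to see what the default initialization does with unscaled inputs is to look at the first layer right after it is created. The sketch below is only an illustration under stated assumptions: it builds a `Linear(1, 200)` layer like `hidden1` in the question, counts how many ReLU units have their bend (at x = -b / w) inside the data range, and compares the size of the pre-activations. The helper name `count_kinks_inside` and the fixed seed are hypothetical choices for this example, and it only inspects the first layer at initialization.

```python
import torch

torch.manual_seed(0)  # fixed seed so the illustration is repeatable

# Same shape as hidden1 in the question, with PyTorch's default initialization.
hidden1 = torch.nn.Linear(1, 200)
w = hidden1.weight.detach().flatten()
b = hidden1.bias.detach()

# Each ReLU unit bends at x = -b / w; on either side of that point it is linear in x.
kinks = -b / w


def count_kinks_inside(low, high):
    """Number of first-layer units whose bend lies inside [low, high]."""
    return int(((kinks >= low) & (kinks <= high)).sum())


kinks_raw = count_kinks_inside(3500.0, 6500.0)  # unscaled inputs
kinks_scaled = count_kinks_inside(-1.0, 1.0)    # inputs rescaled to [-1, 1]

# Compare typical pre-activation magnitudes for raw and rescaled inputs.
x_raw = torch.empty(128, 1).uniform_(3500, 6500)
x_scaled = (x_raw - 5000) / 1500
with torch.no_grad():
    preact_raw = hidden1(x_raw).abs().mean().item()
    preact_scaled = hidden1(x_scaled).abs().mean().item()

print("units bending inside the raw range:   ", kinks_raw)
print("units bending inside the scaled range:", kinks_scaled)
print("mean |pre-activation|, raw inputs:    ", preact_raw)
print("mean |pre-activation|, scaled inputs: ", preact_scaled)
```

With raw inputs, very few (if any) units bend inside [3500, 6500], so at initialization the first layer behaves like a linear map over the data. That is consistent with the straight-line fit in the question, although it is not a full account of the training dynamics.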

    How to fix training

    If you rescale your inputs / outputs to be in a range around [-1, 1], your neural network should be much happier.

    Example code (I also adjusted the optimizer settings a bit to make it converge faster):

    import torch
    import torch.nn.functional as F
    import matplotlib.pyplot as plt
    
    
    class Net(torch.nn.Module):
        def __init__(self, n_feature, n_hidden1, n_hidden2, n_output):
            super(Net, self).__init__()
            self.hidden1 = torch.nn.Linear(n_feature, n_hidden1)
            self.hidden2 = torch.nn.Linear(n_hidden1, n_hidden2)
            self.predict = torch.nn.Linear(n_hidden2, n_output)
    
        def forward(self, x):
            x = F.relu(self.hidden1(x))
            x = F.relu(self.hidden2(x))
            x = self.predict(x)
            return x
    
    
    net = Net(n_feature=1, n_hidden1=200, n_hidden2=100, n_output=1)
    
    optimizer = torch.optim.Adam(net.parameters(), lr=0.0003)
    loss_func = torch.nn.MSELoss()
    
    losses = []
    
    for t in range(500):
        # distribution similar to original data
        train_x = torch.empty(128, 1).uniform_(3500, 6500)
        train_y = (0.01 * (train_x - 5000)) ** 4 + 5 * (
            train_x - 5000
        ).abs() * torch.randn_like(train_x)
    
        # vvvv this part is important! vvvv
        rescale_data_to_suitable_range = True
        if rescale_data_to_suitable_range:
            train_x = (train_x - 5000) / 1500
            train_y = train_y / 50000
        prediction = net(train_x)
    
        loss = loss_func(prediction, train_y)
    
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        losses.append(loss.item())
    
        if t % 5 == 0:
            print("Epoch {}, loss: {:.6f}".format(t, loss.item()), end="\r")
        if t % 50 == 0:
            plt.scatter(train_x.flatten(), train_y.flatten(), label="GT")
            plt.scatter(train_x.flatten(), prediction.detach().flatten(), label="Pred")
            plt.gcf().set_size_inches(12, 8)
            plt.legend()
            plt.show()
    plt.plot(losses, label="losses")
    plt.gcf().set_size_inches(12, 8)
    plt.legend()
    plt.show()
    

    This modified code trains fine.

    But if I set rescale_data_to_suitable_range = False, it has the same problem as in your post.
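If you want to apply the same idea to your own arrays rather than to synthetic data, one common option is to standardize with the mean and standard deviation of the data and then undo the scaling on the predictions. This is a minimal sketch, not the code from the answer: the `X` and `Y` arrays are hypothetical stand-ins with roughly the ranges described in the question, the helper `standardize` is an illustrative name, and mean/std scaling is swapped in for the fixed constants used earlier.

```python
import numpy as np
import torch

# Hypothetical stand-in data; replace with your own X and Y arrays.
rng = np.random.default_rng(0)
X = rng.uniform(3500, 6500, size=256)
Y = 25000 * np.exp(-((X - 5000) / 400) ** 2)


def standardize(values):
    """Return (scaled column tensor, mean, std) for a 1-D NumPy array."""
    t = torch.tensor(values.reshape(-1, 1), dtype=torch.float32)
    mean, std = t.mean(), t.std()
    return (t - mean) / std, mean, std


train_x, x_mean, x_std = standardize(X)
train_y, y_mean, y_std = standardize(Y)

# Same layer sizes as the Net class in the question.
net = torch.nn.Sequential(
    torch.nn.Linear(1, 200), torch.nn.ReLU(),
    torch.nn.Linear(200, 100), torch.nn.ReLU(),
    torch.nn.Linear(100, 1),
)
optimizer = torch.optim.Adam(net.parameters(), lr=0.0003)
loss_func = torch.nn.MSELoss()

for t in range(500):
    loss = loss_func(net(train_x), train_y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# The network predicts in scaled units; undo the scaling to get original units.
with torch.no_grad():
    pred_original_units = net(train_x) * y_std + y_mean
```

Keep `x_mean`, `x_std`, `y_mean`, and `y_std` around after training: new inputs need the same input scaling before they go through the network, and the outputs need the same inverse scaling afterwards.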