pytorch neural-network regression gradient-descent

Why do I get a straight line from my neural network model when the actual distribution should be a curve?

I am trying to fit a data set that resembles a Gaussian distribution with a simple neural network, but the results are poor. The fitted output is always a straight line, and adjusting the learning rate or increasing the number of epochs has no effect.

Blue points are the data and red points are the model's output at the same X values.

I also tried using the same model to fit other functions, and those curves were fitted well.

So I don't know whether the problem is in my training process, in my data, or in a model that is too simple.

Here's my code:

import torch
import torch.nn.functional as F


class Net(torch.nn.Module):
    def __init__(self, n_feature, n_hidden1, n_hidden2, n_output):
        super(Net, self).__init__()
        self.hidden1 = torch.nn.Linear(n_feature, n_hidden1)   
        self.hidden2 = torch.nn.Linear(n_hidden1, n_hidden2)
        self.predict = torch.nn.Linear(n_hidden2, n_output)  

    def forward(self, x):
        x = F.relu(self.hidden1(x))      
        x = F.relu(self.hidden2(x))
        x = self.predict(x)             
        return x

net = Net(n_feature=1, n_hidden1=200,n_hidden2=100, n_output=1) 

optimizer = torch.optim.Adam(net.parameters(), lr=0.00005)
loss_func = torch.nn.MSELoss()

train_x = torch.tensor(X.reshape(-1,1),dtype=torch.float32)
train_y = torch.tensor(Y.reshape(-1,1),dtype=torch.float32)

for t in range(5000):
    prediction = net(train_x)   

    loss = loss_func(prediction, train_y)   

    optimizer.zero_grad()
    loss.backward()   
    optimizer.step()  

    if t % 5 == 0:
        # loss.item() returns the scalar loss as a plain Python float
        print("Epoch{}, loss:{:.6f}".format(t, loss.item()))

After a few hundred epochs of training, the loss settles at a relatively large value and stops decreasing. Even when I turn up the learning rate, the loss does not change.

Solution

Why the network doesn't train

The primary issue is that your data are not remotely normalized (x ranges from ~3500 to ~6500 and f(x) from 0 to ~25000).

The default configurations of neural network frameworks (the weight initialization and optimizer settings provided by PyTorch, TensorFlow, etc.) generally assume inputs and outputs in a range of roughly [-1, 1]. [0, 1] works fine and [-2, 2] works fine, but [3500, 6500] will work poorly.
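One way to see the effect, at least for the first layer at initialization, is to count how many ReLU units switch between off and on somewhere inside the data range. A unit that never switches contributes no bend, so over that range the layer behaves like an affine map of x. The sketch below is an illustration under assumptions, not a full diagnosis: the layer mirrors hidden1 from the question, the inputs are an evenly spaced stand-in for the real data, and count_switching_units is a hypothetical helper introduced here.

```python
import torch

torch.manual_seed(0)
# Same shape as hidden1 in the question, with PyTorch's default initialization.
layer = torch.nn.Linear(1, 200)

# Stand-in inputs covering the raw range, and the same inputs rescaled to [-1, 1].
raw_x = torch.linspace(3500, 6500, 128).reshape(-1, 1)
scaled_x = (raw_x - 5000) / 1500


def count_switching_units(x):
    """Count units whose pre-activation changes sign somewhere across x."""
    with torch.no_grad():
        pre_activation = layer(x)
    is_on = pre_activation > 0
    # A unit "switches" if it is on for some inputs but not for all of them.
    switches = is_on.any(dim=0) & ~is_on.all(dim=0)
    return int(switches.sum())


raw_switching = count_switching_units(raw_x)
scaled_switching = count_switching_units(scaled_x)
print("units with a ReLU kink inside the data range:")
print("  raw inputs:   ", raw_switching, "of 200")
print("  scaled inputs:", scaled_switching, "of 200")
```

With the raw inputs, you should see few or no units switching, while with the scaled inputs a large share of them do, which is what gives the network room to bend.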

How to fix training

If you rescale your inputs / outputs to be in a range around [-1, 1], your neural network should be much happier.

Example code (I also adjusted the optimizer settings a bit to make it converge faster):

import torch
import torch.nn.functional as F
import matplotlib.pyplot as plt


class Net(torch.nn.Module):
    def __init__(self, n_feature, n_hidden1, n_hidden2, n_output):
        super(Net, self).__init__()
        self.hidden1 = torch.nn.Linear(n_feature, n_hidden1)
        self.hidden2 = torch.nn.Linear(n_hidden1, n_hidden2)
        self.predict = torch.nn.Linear(n_hidden2, n_output)

    def forward(self, x):
        x = F.relu(self.hidden1(x))
        x = F.relu(self.hidden2(x))
        x = self.predict(x)
        return x


net = Net(n_feature=1, n_hidden1=200, n_hidden2=100, n_output=1)

optimizer = torch.optim.Adam(net.parameters(), lr=0.0003)
loss_func = torch.nn.MSELoss()

losses = []

for t in range(500):
    # distribution similar to original data
    train_x = torch.empty(128, 1).uniform_(3500, 6500)
    train_y = (0.01 * (train_x - 5000)) ** 4 + 5 * (
        train_x - 5000
    ).abs() * torch.randn_like(train_x)

    # vvvv this part is important! vvvv
    rescale_data_to_suitable_range = True
    if rescale_data_to_suitable_range:
        train_x = (train_x - 5000) / 1500
        train_y = train_y / 50000
    prediction = net(train_x)

    loss = loss_func(prediction, train_y)

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    losses.append(loss.item())

    if t % 5 == 0:
        print("Epoch {}, loss: {:.6f}".format(t, loss.item()), end="\r")
    if t % 50 == 0:
        plt.scatter(train_x.flatten(), train_y.flatten(), label="GT")
        plt.scatter(train_x.flatten(), prediction.detach().flatten(), label="Pred")
        plt.gcf().set_size_inches(12, 8)
        plt.legend()
        plt.show()
plt.plot(losses, label="losses")
plt.gcf().set_size_inches(12, 8)
plt.legend()
plt.show()

This modified code trains fine.

But if I set rescale_data_to_suitable_range = False, it has the same problem as in your post: the prediction stays close to a straight line.
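To apply the same idea to your own arrays, one option is to standardize with the mean and standard deviation of the training data, then invert the transform on the predictions so they are back in the original units. This is a minimal sketch, assuming X and Y are 1-D NumPy arrays as in the question. The data here is a synthetic stand-in, and fit_scaler and to_original_units are hypothetical helper names, not part of PyTorch.

```python
import numpy as np
import torch

# Synthetic stand-ins for the question's X and Y (assumption: 1-D NumPy arrays).
rng = np.random.default_rng(0)
X = rng.uniform(3500, 6500, size=256)
Y = 25000 * np.exp(-(((X - 5000) / 400) ** 2))


def fit_scaler(values):
    """Return the (mean, std) used to standardize an array."""
    return float(values.mean()), float(values.std())


x_mean, x_std = fit_scaler(X)
y_mean, y_std = fit_scaler(Y)

# Standardized tensors: roughly zero mean and unit variance.
train_x = torch.tensor(((X - x_mean) / x_std).reshape(-1, 1), dtype=torch.float32)
train_y = torch.tensor(((Y - y_mean) / y_std).reshape(-1, 1), dtype=torch.float32)

# ... train the network on (train_x, train_y) with the usual loop ...


def to_original_units(scaled_values):
    """Map scaled targets or predictions back to the original Y units."""
    return scaled_values * y_std + y_mean


# Round-trip check on the targets; use net(train_x).detach() for predictions.
recovered = to_original_units(train_y).numpy().flatten()
```

Remember to apply the same x_mean and x_std to any new inputs before calling the trained network, and to_original_units to its outputs.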