python, tensorflow, machine-learning, keras, linear-regression

Why does my linear regression model fail if I don't center input data?


I have a simple linear regression model where the independent variable is years (1970 to the present). When I center the input data around zero (i.e. subtract the mean from x), the model trains fine and I get a best-fit line. But if I don't center the data, the loss becomes infinite:

model = tf.keras.models.Sequential([
    tf.keras.layers.Input(shape=(1,)),
    tf.keras.layers.Dense(1)
])

model.compile(
    tf.keras.optimizers.SGD(learning_rate=0.001, momentum=0.9),
    loss='mse'
)

model_history = model.fit(
    x, # If we change this to `x - x.mean()` then we no longer get nan/inf loss
    y,
    epochs=200
)
Epoch 1/200
6/6 [==============================] - 0s 1ms/step - loss: inf     
Epoch 2/200
6/6 [==============================] - 0s 1ms/step - loss: nan
Epoch 3/200
6/6 [==============================] - 0s 1ms/step - loss: nan
Epoch 4/200
6/6 [==============================] - 0s 1ms/step - loss: nan
...

I would have expected the model to train more slowly or be less accurate, but why does it break down completely?

Edit: this is what the data looks like: [plot omitted]


Solution

  • Regression with ANNs is a bit tricky. You see nan loss values because your gradients have exploded: the network's output is unbounded, and with raw years as inputs (values around 2000) the initial predictions, errors and therefore MSE gradients are enormous, so a single update overshoots and the loss overflows (there is a back-of-the-envelope sketch of the magnitudes after the list below). Also, why use momentum here? It only compounds those oversized updates.

    You can try:

    • Reducing the learning rate
    • Changing to the Adam optimizer (a minimal sketch of both fixes follows below)
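
    To see where the explosion comes from, look at the very first SGD step. The gradient of the MSE with respect to the weight is the prediction error multiplied by the input, so with raw years (x around 2000) a single update can move the weight by hundreds, and momentum then compounds it until the loss overflows. A rough back-of-the-envelope sketch in plain NumPy, with made-up target values used purely to show the magnitudes:

    import numpy as np

    x = np.arange(1970, 2024, dtype=np.float64)   # raw years, ~2000 in magnitude
    y = 0.5 * (x - x.mean()) + 10.0               # made-up linear targets

    w, b = 0.05, 0.0                              # small initial weight, zero bias
    lr = 0.001

    # One plain gradient-descent step on MSE (momentum omitted for simplicity)
    err = w * x + b - y
    grad_w = 2.0 * np.mean(err * x)               # the error gets multiplied by x ~ 2000
    print(grad_w)                                 # on the order of 1e5
    print(lr * grad_w)                            # weight update of several hundred

    # The same gradient with centered inputs is orders of magnitude smaller
    xc = x - x.mean()
    print(2.0 * np.mean((w * xc + b - y) * xc))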
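
    And a minimal sketch of the two suggestions, plus a third option: letting Keras standardize the input inside the model with a Normalization layer, which is the built-in counterpart of the manual x - x.mean() centering in the question. The data and learning rates below are placeholders, not tuned values:

    import numpy as np
    import tensorflow as tf

    x = np.arange(1970, 2024, dtype=np.float32).reshape(-1, 1)  # placeholder data
    y = 0.5 * (x - x.mean()) + 10.0                              # placeholder targets

    # Option 1: Adam instead of SGD with momentum.
    # Adam's per-parameter step is bounded by the learning rate, so the loss no
    # longer overflows (it may still need more epochs/tuning on raw years).
    model = tf.keras.models.Sequential([
        tf.keras.layers.Input(shape=(1,)),
        tf.keras.layers.Dense(1)
    ])
    model.compile(tf.keras.optimizers.Adam(learning_rate=0.1), loss='mse')
    model.fit(x, y, epochs=200, verbose=0)

    # Option 2: keep SGD but normalize the input inside the model
    norm = tf.keras.layers.Normalization(axis=None)
    norm.adapt(x)                                  # learns the mean/variance of x
    model2 = tf.keras.models.Sequential([
        tf.keras.layers.Input(shape=(1,)),
        norm,
        tf.keras.layers.Dense(1)
    ])
    model2.compile(tf.keras.optimizers.SGD(learning_rate=0.1, momentum=0.9), loss='mse')
    model2.fit(x, y, epochs=200, verbose=0)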