I have a simple linear regression model where the independent variable is the year (1970-present). When I center the input data around zero (i.e. subtract the mean from x), the model trains fine and I get a best-fit line. But if I don't center the data, the loss blows up to infinity:
import tensorflow as tf

model = tf.keras.models.Sequential([
    tf.keras.layers.Input(shape=(1,)),
    tf.keras.layers.Dense(1)
])
model.compile(
    tf.keras.optimizers.SGD(learning_rate=0.001, momentum=0.9),
    loss='mse'
)
model_history = model.fit(
    x,  # If we change this to `x - x.mean()` then we no longer get nan/inf loss
    y,
    epochs=200
)
Epoch 1/200
6/6 [==============================] - 0s 1ms/step - loss: inf
Epoch 2/200
6/6 [==============================] - 0s 1ms/step - loss: nan
Epoch 3/200
6/6 [==============================] - 0s 1ms/step - loss: nan
Epoch 4/200
6/6 [==============================] - 0s 1ms/step - loss: nan
...
I would have expected the uncentered model to train more slowly, or end up less accurate, but why does it break down completely?
Regression with ANNs is a bit tricky. You see nan loss values because your gradients have exploded: the output of the network is unbounded, and with raw inputs around 2,000 the MSE gradient (which scales with x) is huge, so each SGD step overshoots further than the last until the weights overflow to inf and then nan. Also, why do you use momentum? momentum=0.9 only amplifies the overshoot.
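You can see the blow-up outside Keras with a minimal NumPy sketch of full-batch gradient descent on a 1-D linear model (the data here is a stand-in, since the question doesn't show x and y):

```python
import numpy as np

# Stand-in data: raw years as the input, any linear target.
x = np.arange(1970, 2024, dtype=float)
y = 0.5 * x - 998.25

def sgd_mse(x, y, lr=0.001, steps=50):
    """Full-batch gradient descent on y_hat = w*x + b with MSE loss."""
    w = b = 0.0
    for _ in range(steps):
        err = (w * x + b) - y
        w = w - lr * 2.0 * np.mean(err * x)  # dMSE/dw scales with x (~2000 here)
        b = b - lr * 2.0 * np.mean(err)      # dMSE/db
    return np.mean(((w * x + b) - y) ** 2)

print(sgd_mse(x, y))             # raw years: each step overshoots more, loss overflows
print(sgd_mse(x - x.mean(), y))  # centered: steps are stable, loss shrinks toward zero
```

With x around 2,000, the weight gradient is roughly 2,000 times larger than the prediction error itself, so a learning rate that is fine for centered data multiplies the weight error by a huge factor on every step; a few dozen steps are enough to overflow floating point into inf, after which further arithmetic produces nan.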
You can try:
- normalizing or standardizing the input (subtract the mean, divide by the standard deviation), which is why your centered version works;
- lowering the learning rate;
- dropping the momentum term;
- gradient clipping (e.g. clipnorm on the optimizer).
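As a hedged sketch of the normalization route (TF 2.x assumed; x and y here are stand-ins, since the question doesn't show them), you can let the model standardize its own input with a Normalization layer instead of pre-centering by hand:

```python
import numpy as np
import tensorflow as tf

# Stand-in data, shaped like the question's: raw years in, a linear target out.
x = np.arange(1970, 2024, dtype="float32")
y = 0.5 * x - 998.25

# The Normalization layer learns mean/variance from the data via adapt()
# and standardizes inputs inside the model, so raw years are safe to feed in.
norm = tf.keras.layers.Normalization(axis=None)
norm.adapt(x)

model = tf.keras.models.Sequential([
    tf.keras.layers.Input(shape=(1,)),
    norm,
    tf.keras.layers.Dense(1),
])
model.compile(tf.keras.optimizers.SGD(learning_rate=0.01), loss="mse")
history = model.fit(x, y, epochs=200, verbose=0)
```

Alternatively, keep the model as-is and pass clipnorm=1.0 or a much smaller learning_rate to SGD; but normalizing the input fixes the conditioning of the problem rather than just capping the damage each step can do.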