I'm trying to build a linear model on my own data.
import numpy as np
import tensorflow as tf

# Create features
X = np.array([-7.0, -4.0, -1.0, 2.0, 5.0, 8.0, 11.0, 14.0])

# Create labels
y = np.array([3.0, 6.0, 9.0, 12.0, 15.0, 18.0, 21.0, 24.0])

# One hidden layer with 50 ELU units, then a single linear output unit
model = tf.keras.Sequential([
    tf.keras.layers.Dense(50, activation="elu", input_shape=[1]),
    tf.keras.layers.Dense(1)
])

model.compile(loss="mae",
              optimizer=tf.keras.optimizers.Adam(learning_rate=0.01),
              metrics=["mae"])

model.fit(X, y, epochs=150)
When I train with the X and y data above, the loss starts at a reasonable value.
    experience  salary
0            0    2250
1            1    2750
2            5    8000
3            8    9000
4            4    6900
5           15   20000
6            7    8500
7            3    6000
8            2    3500
9           12   15000
10          10   13000
11          14   18000
12           6    7500
13          11   14500
14          12   14900
15           3    5800
16           2    4000
But when I use a dataset like this one, the initial loss starts at 800 (same model as above, by the way).
What could be the reason for this?
Your learning rate is too high. You should opt for a much lower initial learning rate, such as 0.0001 or 0.00001.
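As a rough sketch of that suggestion, the question's model could be recompiled with a smaller learning rate and trained on the salary data. The helper name `build_model`, the variable names, and the `reshape(-1, 1)` call (used so each sample is a one-element row) are assumptions for illustration, not part of the original post, and no particular loss value is claimed here.

```python
import numpy as np
import tensorflow as tf

# Salary dataset from the question (17 rows)
experience = np.array([0, 1, 5, 8, 4, 15, 7, 3, 2, 12, 10, 14, 6, 11, 12, 3, 2],
                      dtype="float32").reshape(-1, 1)  # shape (17, 1), assumed layout
salary = np.array([2250, 2750, 8000, 9000, 6900, 20000, 8500, 6000, 3500,
                   15000, 13000, 18000, 7500, 14500, 14900, 5800, 4000],
                  dtype="float32")

def build_model(learning_rate=1e-4):
    """Hypothetical helper: same architecture as the question, configurable learning rate."""
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(50, activation="elu"),
        tf.keras.layers.Dense(1),
    ])
    model.compile(loss="mae",
                  optimizer=tf.keras.optimizers.Adam(learning_rate=learning_rate),
                  metrics=["mae"])
    return model

model = build_model(learning_rate=1e-4)
history = model.fit(experience, salary, epochs=150, verbose=0)

# Inspect how the loss evolves instead of reading only the first epoch
print(history.history["loss"][0], history.history["loss"][-1])
```

Comparing the first and last entries of `history.history["loss"]` for a few learning rates is one way to check whether the lower rate actually helps on this data.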
Apart from that, your setup is fine: you are using the 'linear' activation on the last layer (the default) and the correct loss function and metric. Also note that the default batch_size, when not specified explicitly, is 32.
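To see what the default batch size means for datasets this small, here is a minimal sketch that counts the gradient updates per epoch. The function name `steps_per_epoch` is made up for this illustration; it only mirrors the usual rule that the last partial batch still counts as one step.

```python
import math

def steps_per_epoch(n_samples, batch_size=32):
    """Number of gradient updates in one epoch for a given batch size."""
    return math.ceil(n_samples / batch_size)

print(steps_per_epoch(8))      # toy dataset with 8 points -> 1
print(steps_per_epoch(17))     # salary dataset with 17 rows -> 1
print(steps_per_epoch(17, 4))  # hypothetical batch_size of 4 -> 5
```

With both datasets smaller than 32 rows, every epoch is a single full-batch update. Passing a smaller value, for example `model.fit(X, y, batch_size=4)`, would give more updates per epoch.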
UPDATE: As determined by the author of the question, underfitting was also a fundamental part of the problem. Adding several more layers helped solve it.
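A deeper variant might look like the sketch below. The post does not say how many layers were added or how wide they were, so the three hidden layers of 64 units, the learning rate, and the helper name `build_deeper_model` are purely assumptions.

```python
import tensorflow as tf

def build_deeper_model(hidden_units=(64, 64, 64), learning_rate=1e-4):
    """Hypothetical helper: stack several ELU hidden layers before the linear output."""
    layers = [tf.keras.layers.Dense(units, activation="elu") for units in hidden_units]
    layers.append(tf.keras.layers.Dense(1))  # linear output for regression
    model = tf.keras.Sequential(layers)
    model.compile(loss="mae",
                  optimizer=tf.keras.optimizers.Adam(learning_rate=learning_rate),
                  metrics=["mae"])
    return model

deeper_model = build_deeper_model()
print(len(deeper_model.layers))  # 3 hidden layers + 1 output layer
```

The extra hidden layers give the network more capacity, which is the usual remedy when a model underfits.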