I have this code in which I am trying to fit a model of a neural network which has just three layers: the input layer, a hidden layer and, at the end, the ouput layer which must have just one neuron for the single ouput. The problem is that when doing the fit I'm always obtaining the same values for the accuracy (null) an the loss (remains constant), and I've tried changing the optimizer from 'sgd' to 'adam' and still anything works as it should be. What would you recommend?
Layer (type) Output Shape Param N°
=================================================================
data_in (InputLayer) [(None, 4, 256)] 0
dense (Dense) (None, 4, 124) 31868
dense_1 (Dense) (None, 4, 1) 125
=================================================================
Total params: 31,993
Trainable params: 31,993
Non-trainable params: 0
_________________________________________________________________
Epoch 1/20
20/20 [==============================] - 8s 350ms/step - loss: 0.3170 - accuracy: 1.7361e-05
Epoch 2/20
20/20 [==============================] - 7s 348ms/step - loss: 0.2009 - accuracy: 6.7817e-08
Epoch 3/20
20/20 [==============================] - 7s 348ms/step - loss: 0.0513 - accuracy: 0.0000e+00
Epoch 4/20
20/20 [==============================] - 7s 348ms/step - loss: 0.0437 - accuracy: 0.0000e+00
Epoch 5/20
20/20 [==============================] - 7s 346ms/step - loss: 0.0430 - accuracy: 0.0000e+00
Epoch 6/20
20/20 [==============================] - 7s 346ms/step - loss: 0.0428 - accuracy: 0.0000e+00
Epoch 7/20
20/20 [==============================] - 7s 345ms/step - loss: 0.0428 - accuracy: 0.0000e+00
Epoch 8/20
20/20 [==============================] - 7s 345ms/step - loss: 0.0430 - accuracy: 0.0000e+00
Epoch 9/20
20/20 [==============================] - 7s 348ms/step - loss: 0.0429 - accuracy: 0.0000e+00
Epoch 10/20
20/20 [==============================] - 7s 348ms/step - loss: 0.0429 - accuracy: 0.0000e+00
Epoch 11/20
20/20 [==============================] - 7s 346ms/step - loss: 0.0429 - accuracy: 0.0000e+00
Epoch 12/20
20/20 [==============================] - 7s 344ms/step - loss: 0.0428 - accuracy: 0.0000e+00
Epoch 13/20
20/20 [==============================] - 7s 348ms/step - loss: 0.0428 - accuracy: 0.0000e+00
Epoch 14/20
20/20 [==============================] - 7s 345ms/step - loss: 0.0433 - accuracy: 0.0000e+00
Epoch 15/20
20/20 [==============================] - 7s 345ms/step - loss: 0.0430 - accuracy: 0.0000e+00
Epoch 16/20
20/20 [==============================] - 7s 347ms/step - loss: 0.0432 - accuracy: 0.0000e+00
Epoch 17/20
20/20 [==============================] - 7s 346ms/step - loss: 0.0429 - accuracy: 0.0000e+00
Epoch 18/20
20/20 [==============================] - 7s 347ms/step - loss: 0.0430 - accuracy: 0.0000e+00
Epoch 19/20
20/20 [==============================] - 7s 348ms/step - loss: 0.0428 - accuracy: 0.0000e+00
Epoch 20/20
20/20 [==============================] - 7s 348ms/step - loss: 0.0428 - accuracy: 0.0000e+00
*TEST*
1800/1800 [==============================] - 3s 2ms/step - loss: 0.0449 - accuracy: 0.0000e+00
accuracy: 0%
My input_shape is (4, 256) and my array of training data has shape (57600, 4, 256), meaning I have 57600 samples of shape (4,256). I also have my training labels array (the values I should obtain with the data), having shape (57600,). Finally, the library I am using is TENSORFLOW.
My code is the next one
from keras.layers import Input, Dense, concatenate, Conv2D, MaxPooling2D, Flatten
from keras.models import Model
from tensorflow import keras
from sklearn.preprocessing import MinMaxScaler
div_n = 240
#DIVIDING THE DATA WE WANT TO CLASIFFY AND ITS LABELS - It is being used the
#scaled data
data = np.array([self_mbyy_scaled,
self_mvyy_scaled,
self_mtpr_scaled,
self_mrho_scaled])
labels = self_iout_scaled
print(np.shape(data))
print(np.shape(labels))
#TRAINING SET AND DATA SET
tr_data = []
tr_labels = []
#Here I'm dividing the whole data in half for the nx, nz dimensions. The first half is the training set and the second half is the test set
for j in range(div_n):
for k in range(div_n):
tr_data.append([data[0][j,:,k],
data[1][j,:,k],
data[2][j,:,k],
data[3][j,:,k]]) #It puts the magnetic field, velocity, temperature and density values in one row for 240x240=57600 columns
tr_labels.append(labels[j,k]) #the values from the column of targets
tr_data = np.array(tr_data)
tr_data = tr_data.reshape(div_n*div_n, len(data), self_ny, 1)
tr_labels = np.array(tr_labels)
print('\n training data shape')
print(np.shape(tr_data))
print('\n training labels shape')
print(np.shape(tr_labels))
te_data = []
te_labels = []
for j in range(div_n):
for k in range(div_n):
te_data.append([data[0][div_n+j,:,div_n+k],
data[1][div_n+j,:,div_n+k],
data[2][div_n+j,:,div_n+k],
data[3][div_n+j,:,div_n+k]]) #It puts the magnetic field, velocity, temperature and density values in one row for 240x240=57600 columns
te_labels.append(labels[div_n+j,div_n+k]) #the values from the column of targets
te_data = np.array(te_data)
te_data = te_data.reshape(div_n*div_n, len(data), self_ny, 1)
te_labels = np.array(te_labels)
print('\n test data shape')
print(np.shape(te_data))
print('\n test labels shape')
print(np.shape(te_labels))
print('\n')
#NEURAL NETWORK MODEL
model = Sequential()
model.add(Conv2D(32, (3, 3), activation='relu', input_shape=(4, 256, 1)))
model.add(MaxPooling2D((2, 2)))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D((2, 2)))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(Flatten())
model.add(Dense(64, activation='relu'))
model.add(Dense(1))
model.summary()
model.compile(
optimizer=keras.optimizers.Adam(0.001),
loss=keras.losses.MeanSquaredError(),
metrics=['accuracy'],
)
model.fit(
tr_data, tr_labels,
epochs=6,
validation_data=ds_valid,
)
Since your data seems to have a spatial dimension 57600, 4, 256
--> (samples, timesteps, features)
, I would recommend using Conv1D
layers instead of Conv2D
. Here is a simple working example:
import tensorflow as tf
model = tf.keras.Sequential()
model.add(tf.keras.layers.Conv1D(128, 2, activation='relu', input_shape=(4, 256)))
model.add(tf.keras.layers.Conv1D(64, 2, activation='relu'))
model.add(tf.keras.layers.Conv1D(32, 2, activation='relu'))
model.add(tf.keras.layers.GlobalMaxPool1D())
model.add(tf.keras.layers.Dense(64, activation='relu'))
model.add(tf.keras.layers.Dense(1))
model.summary()
model.compile(
optimizer=tf.keras.optimizers.Adam(0.001),
loss=tf.keras.losses.MeanSquaredError(),
metrics=['mse'],
)
samples = 50
x = tf.random.normal((50, 4, 256))
y = tf.random.normal((50,))
model.fit(x, y, batch_size=10, epochs=6)
And note that you usually do not use the accuracy
metric for the tf.keras.losses.MeanSquaredError
loss function.