Is there a way to fit a simple neural network to input data that is a tensor, with an output that must be a single number?

I have code in which I am trying to fit a neural network with just three layers: the input layer, a hidden layer and, at the end, the output layer, which must have a single neuron for the single output. The problem is that when fitting I always get the same values for the accuracy (zero) and the loss (it stays constant). I've tried changing the optimizer from 'sgd' to 'adam' and it still does not work as it should. What would you recommend?

Layer (type)                Output Shape              Param #   
=================================================================
 data_in (InputLayer)        [(None, 4, 256)]          0         
                                                                 
 dense (Dense)               (None, 4, 124)            31868     
                                                                 
 dense_1 (Dense)             (None, 4, 1)              125       
                                                                 
=================================================================
Total params: 31,993
Trainable params: 31,993
Non-trainable params: 0
_________________________________________________________________
Epoch 1/20
20/20 [==============================] - 8s 350ms/step - loss: 0.3170 - accuracy: 1.7361e-05
Epoch 2/20
20/20 [==============================] - 7s 348ms/step - loss: 0.2009 - accuracy: 6.7817e-08
Epoch 3/20
20/20 [==============================] - 7s 348ms/step - loss: 0.0513 - accuracy: 0.0000e+00
Epoch 4/20
20/20 [==============================] - 7s 348ms/step - loss: 0.0437 - accuracy: 0.0000e+00
Epoch 5/20
20/20 [==============================] - 7s 346ms/step - loss: 0.0430 - accuracy: 0.0000e+00
Epoch 6/20
20/20 [==============================] - 7s 346ms/step - loss: 0.0428 - accuracy: 0.0000e+00
Epoch 7/20
20/20 [==============================] - 7s 345ms/step - loss: 0.0428 - accuracy: 0.0000e+00
Epoch 8/20
20/20 [==============================] - 7s 345ms/step - loss: 0.0430 - accuracy: 0.0000e+00
Epoch 9/20
20/20 [==============================] - 7s 348ms/step - loss: 0.0429 - accuracy: 0.0000e+00
Epoch 10/20
20/20 [==============================] - 7s 348ms/step - loss: 0.0429 - accuracy: 0.0000e+00
Epoch 11/20
20/20 [==============================] - 7s 346ms/step - loss: 0.0429 - accuracy: 0.0000e+00
Epoch 12/20
20/20 [==============================] - 7s 344ms/step - loss: 0.0428 - accuracy: 0.0000e+00
Epoch 13/20
20/20 [==============================] - 7s 348ms/step - loss: 0.0428 - accuracy: 0.0000e+00
Epoch 14/20
20/20 [==============================] - 7s 345ms/step - loss: 0.0433 - accuracy: 0.0000e+00
Epoch 15/20
20/20 [==============================] - 7s 345ms/step - loss: 0.0430 - accuracy: 0.0000e+00
Epoch 16/20
20/20 [==============================] - 7s 347ms/step - loss: 0.0432 - accuracy: 0.0000e+00
Epoch 17/20
20/20 [==============================] - 7s 346ms/step - loss: 0.0429 - accuracy: 0.0000e+00
Epoch 18/20
20/20 [==============================] - 7s 347ms/step - loss: 0.0430 - accuracy: 0.0000e+00
Epoch 19/20
20/20 [==============================] - 7s 348ms/step - loss: 0.0428 - accuracy: 0.0000e+00
Epoch 20/20
20/20 [==============================] - 7s 348ms/step - loss: 0.0428 - accuracy: 0.0000e+00
*TEST*
1800/1800 [==============================] - 3s 2ms/step - loss: 0.0449 - accuracy: 0.0000e+00
accuracy:  0%

My input_shape is (4, 256) and my array of training data has shape (57600, 4, 256), meaning I have 57600 samples of shape (4, 256). My training labels array (the values I should obtain from the data) has shape (57600,). The library I am using is TensorFlow.

My code is as follows:

import numpy as np
from tensorflow import keras
from tensorflow.keras.layers import Input, Dense, concatenate, Conv2D, MaxPooling2D, Flatten
from tensorflow.keras.models import Model, Sequential
from sklearn.preprocessing import MinMaxScaler


div_n = 240

#DIVIDING THE DATA WE WANT TO CLASSIFY AND ITS LABELS - The scaled data
#is being used

data = np.array([self_mbyy_scaled,
            self_mvyy_scaled,
            self_mtpr_scaled,
            self_mrho_scaled])

labels = self_iout_scaled


print(np.shape(data))
print(np.shape(labels))


#TRAINING SET AND TEST SET
tr_data = []
tr_labels = []
#Here I'm dividing the whole data in half for the nx, nz dimensions. The first half is the training set and the second half is the test set
for j in range(div_n):
    for k in range(div_n):
        tr_data.append([data[0][j,:,k], 
                       data[1][j,:,k],
                       data[2][j,:,k],
                       data[3][j,:,k]]) #It puts the magnetic field, velocity, temperature and density values in one row for 240x240=57600 columns
        tr_labels.append(labels[j,k]) #the values from the column of targets
        
tr_data = np.array(tr_data)   
tr_data = tr_data.reshape(div_n*div_n, len(data), self_ny, 1)  
tr_labels = np.array(tr_labels)  

print('\n training data shape')
print(np.shape(tr_data))
print('\n training labels shape')
print(np.shape(tr_labels))

te_data = []
te_labels = []
for j in range(div_n):
    for k in range(div_n):
        te_data.append([data[0][div_n+j,:,div_n+k], 
                       data[1][div_n+j,:,div_n+k],
                       data[2][div_n+j,:,div_n+k],
                       data[3][div_n+j,:,div_n+k]]) #It puts the magnetic field, velocity, temperature and density values in one row for 240x240=57600 columns
        te_labels.append(labels[div_n+j,div_n+k]) #the values from the column of targets

te_data = np.array(te_data)   
te_data = te_data.reshape(div_n*div_n, len(data), self_ny, 1)  
te_labels = np.array(te_labels)  

print('\n test data shape')
print(np.shape(te_data))
print('\n test labels shape')
print(np.shape(te_labels))
print('\n')
    

#NEURAL NETWORK MODEL
model = Sequential()
model.add(Conv2D(32, (3, 3), activation='relu', input_shape=(4, 256, 1)))
model.add(MaxPooling2D((2, 2)))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D((2, 2)))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(Flatten())
model.add(Dense(64, activation='relu'))
model.add(Dense(1))
model.summary()

model.compile(
    optimizer=keras.optimizers.Adam(0.001),
    loss=keras.losses.MeanSquaredError(),
    metrics=['accuracy'],
)

model.fit(
    tr_data, tr_labels,
    epochs=6,
    validation_data=(te_data, te_labels),  # the test split built above
)

Solution

Since your data has the shape (57600, 4, 256), which can be read as (samples, timesteps, features), I would recommend using Conv1D layers instead of Conv2D. Here is a simple working example:

import tensorflow as tf

model = tf.keras.Sequential()
model.add(tf.keras.layers.Conv1D(128, 2, activation='relu', input_shape=(4, 256)))
model.add(tf.keras.layers.Conv1D(64, 2, activation='relu'))
model.add(tf.keras.layers.Conv1D(32, 2, activation='relu'))
model.add(tf.keras.layers.GlobalMaxPool1D())
model.add(tf.keras.layers.Dense(64, activation='relu'))
model.add(tf.keras.layers.Dense(1))
model.summary()

model.compile(
    optimizer=tf.keras.optimizers.Adam(0.001),
    loss=tf.keras.losses.MeanSquaredError(),
    metrics=['mse'],
)

# Random stand-in data with the same per-sample shape as the real inputs
samples = 50
x = tf.random.normal((samples, 4, 256))
y = tf.random.normal((samples,))
model.fit(x, y, batch_size=10, epochs=6)
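
To see why a kernel size of 2 fits the short axis of length 4, it can help to trace that axis through the three Conv1D layers. The sketch below only does the arithmetic and does not call TensorFlow. It assumes the Keras defaults for Conv1D (padding='valid', stride 1), and the helper name conv1d_out_len is hypothetical.

```python
def conv1d_out_len(length, kernel_size, stride=1):
    """Output length of a 1D convolution with 'valid' padding."""
    return (length - kernel_size) // stride + 1


steps = 4  # the "timesteps" axis of input_shape=(4, 256)
lengths = [steps]
for kernel_size in (2, 2, 2):  # the three Conv1D layers in the example
    lengths.append(conv1d_out_len(lengths[-1], kernel_size))

print(lengths)  # [4, 3, 2, 1]
```

By the same arithmetic, a kernel of size 3 would shrink the axis from 4 to 2 and then to 0, which is not a valid output length, so small kernels are the safer choice on such a short axis.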

Also note that accuracy is a classification metric, so it is usually not meaningful with the tf.keras.losses.MeanSquaredError loss function. For regression, track a metric such as 'mse' or 'mae' instead.
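
As a rough illustration of why accuracy is a poor fit for a regression target: an exact-match accuracy asks whether a prediction equals its label, which almost never happens with continuous values, even when the predictions are close. The sketch below is plain Python with made-up numbers (not the data from the question). The helper exact_match_accuracy is hypothetical and stands in for the idea, not for how Keras computes its metric internally.

```python
def exact_match_accuracy(y_true, y_pred):
    """Fraction of predictions exactly equal to their labels."""
    hits = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return hits / len(y_true)


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between labels and predictions."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


y_true = [0.12, 0.47, 0.80, 0.33]  # made-up scaled labels
y_pred = [0.10, 0.50, 0.78, 0.35]  # made-up predictions, all close

print(exact_match_accuracy(y_true, y_pred))  # 0.0
print(mean_absolute_error(y_true, y_pred))   # small, reflects the close fit
```

In Keras, passing metrics=['mae'] or tf.keras.metrics.RootMeanSquaredError() to model.compile gives a number that actually tracks how close the predictions are.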