I am new to Keras, and I am trying to use an autoencoder for denoising, but I do not know why my model's loss becomes increasingly negative! I applied the autoencoder to this data set:
https://archive.ics.uci.edu/ml/datasets/Parkinson%27s+Disease+Classification#
So, we have 756 instances with 753 features (i.e. x.shape = (756, 753)).
This is what I have done so far:
import keras
from keras import layers

# This is the size of our encoded representations:
encoding_dim = 64
# This is the input layer:
input_layer = keras.Input(shape=(x.shape[1],))
# "encoded" is the encoded representation of the input
encoded = layers.Dense(encoding_dim, activation='relu')(input_layer)
# "decoded" is the lossy reconstruction of the input
decoded = layers.Dense(x.shape[1], activation='sigmoid')(encoded)
# This model maps an input to its reconstruction
autoencoder = keras.Model(input_layer, decoded)
autoencoder.compile(optimizer='adam', loss='binary_crossentropy')
autoencoder.fit(x, x, epochs=20, batch_size=10, shuffle=True, validation_split=0.2)
But the results are disappointing:
Epoch 1/20
61/61 [==============================] - 1s 4ms/step - loss: -0.1663 - val_loss: -1.5703
Epoch 2/20
61/61 [==============================] - 0s 2ms/step - loss: -5.7013 - val_loss: -10.0048
Epoch 3/20
61/61 [==============================] - 0s 3ms/step - loss: -20.5371 - val_loss: -27.9583
Epoch 4/20
61/61 [==============================] - 0s 2ms/step - loss: -46.5077 - val_loss: -54.0411
Epoch 5/20
61/61 [==============================] - 0s 3ms/step - loss: -83.1050 - val_loss: -90.6973
Epoch 6/20
61/61 [==============================] - 0s 3ms/step - loss: -130.1922 - val_loss: -135.2853
Epoch 7/20
61/61 [==============================] - 0s 3ms/step - loss: -186.8624 - val_loss: -188.3201
Epoch 8/20
61/61 [==============================] - 0s 3ms/step - loss: -252.7997 - val_loss: -250.6024
Epoch 9/20
61/61 [==============================] - 0s 2ms/step - loss: -328.5535 - val_loss: -317.7751
Epoch 10/20
61/61 [==============================] - 0s 2ms/step - loss: -413.2261 - val_loss: -396.6747
Epoch 11/20
61/61 [==============================] - 0s 3ms/step - loss: -508.1084 - val_loss: -479.6847
Epoch 12/20
61/61 [==============================] - 0s 2ms/step - loss: -610.1725 - val_loss: -573.7590
Epoch 13/20
61/61 [==============================] - 0s 2ms/step - loss: -721.8989 - val_loss: -671.3677
Epoch 14/20
61/61 [==============================] - 0s 3ms/step - loss: -840.6516 - val_loss: -780.9920
Epoch 15/20
61/61 [==============================] - 0s 3ms/step - loss: -970.8052 - val_loss: -894.2467
Epoch 16/20
61/61 [==============================] - 0s 3ms/step - loss: -1107.9106 - val_loss: -1015.4778
Epoch 17/20
61/61 [==============================] - 0s 2ms/step - loss: -1252.6410 - val_loss: -1147.4821
Epoch 18/20
61/61 [==============================] - 0s 2ms/step - loss: -1406.9744 - val_loss: -1276.9229
Epoch 19/20
61/61 [==============================] - 0s 2ms/step - loss: -1567.7247 - val_loss: -1421.1270
Epoch 20/20
61/61 [==============================] - 0s 2ms/step - loss: -1734.9993 - val_loss: -1569.7350
How can I improve the results?
I would appreciate any help. Thank you.
Source: https://blog.keras.io/building-autoencoders-in-keras.html
The main problem is not the parameters you used or the model structure; it comes from the data itself. In the basic tutorials, the authors like to use perfectly pre-processed data to avoid unnecessary steps. In your case, you have presumably dropped the id and class columns, leaving 753 features. Beyond that, I presume you standardized your data without any further exploratory analysis and passed it straight to the autoencoder. The quick fix for your negative loss, which makes no sense with binary cross-entropy since that loss expects targets in [0, 1], is to normalize the data to that range.
I used the following code to normalize your data:
import pandas as pd

# The CSV has two header rows, so skip the first one
df = pd.read_csv('pd_speech_features.csv', header=1)
# Drop the id (first) and class (last) columns, then min-max
# scale each feature column to [0, 1]
x = df.iloc[:, 1:-1].apply(lambda col: (col - col.min()) / (col.max() - col.min()), axis=0)
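An equivalent sketch using scikit-learn's MinMaxScaler, assuming the same file and column layout as above:

import pandas as pd
from sklearn.preprocessing import MinMaxScaler

df = pd.read_csv('pd_speech_features.csv', header=1)
# Rescale every feature column to [0, 1]; returns a NumPy array
x = MinMaxScaler().fit_transform(df.iloc[:, 1:-1])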
The first 20 epochs of training with your model after normalization:
Epoch 1/20
61/61 [==============================] - 1s 9ms/step - loss: 0.4791 - val_loss: 0.4163
Epoch 2/20
61/61 [==============================] - 0s 6ms/step - loss: 0.4154 - val_loss: 0.4102
Epoch 3/20
61/61 [==============================] - 0s 6ms/step - loss: 0.4090 - val_loss: 0.4052
Epoch 4/20
61/61 [==============================] - 0s 6ms/step - loss: 0.4049 - val_loss: 0.4025
Epoch 5/20
61/61 [==============================] - 0s 7ms/step - loss: 0.4017 - val_loss: 0.4002
Epoch 6/20
61/61 [==============================] - 0s 8ms/step - loss: 0.3993 - val_loss: 0.3985
Epoch 7/20
61/61 [==============================] - 1s 9ms/step - loss: 0.3974 - val_loss: 0.3972
Epoch 8/20
61/61 [==============================] - 1s 13ms/step - loss: 0.3959 - val_loss: 0.3961
Epoch 9/20
61/61 [==============================] - 0s 8ms/step - loss: 0.3946 - val_loss: 0.3950
Epoch 10/20
61/61 [==============================] - 0s 6ms/step - loss: 0.3935 - val_loss: 0.3942
Epoch 11/20
61/61 [==============================] - 0s 7ms/step - loss: 0.3926 - val_loss: 0.3934
Epoch 12/20
61/61 [==============================] - 0s 7ms/step - loss: 0.3917 - val_loss: 0.3928
Epoch 13/20
61/61 [==============================] - 1s 9ms/step - loss: 0.3909 - val_loss: 0.3924
Epoch 14/20
61/61 [==============================] - 0s 4ms/step - loss: 0.3902 - val_loss: 0.3918
Epoch 15/20
61/61 [==============================] - 0s 3ms/step - loss: 0.3895 - val_loss: 0.3913
Epoch 16/20
61/61 [==============================] - 0s 3ms/step - loss: 0.3889 - val_loss: 0.3908
Epoch 17/20
61/61 [==============================] - 0s 4ms/step - loss: 0.3885 - val_loss: 0.3905
Epoch 18/20
61/61 [==============================] - 0s 4ms/step - loss: 0.3879 - val_loss: 0.3903
Epoch 19/20
61/61 [==============================] - 0s 4ms/step - loss: 0.3874 - val_loss: 0.3895
Epoch 20/20
61/61 [==============================] - 0s 4ms/step - loss: 0.3870 - val_loss: 0.3892
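With the loss in a sensible range, you can sanity-check the reconstructions; a minimal sketch, reusing the variables defined above:

import numpy as np

# Reconstruct the normalized inputs and compute the mean squared
# reconstruction error as a quick sanity check
x_hat = autoencoder.predict(x)
print(np.mean((np.asarray(x) - x_hat) ** 2))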