I created a convolutional autoencoder this way:
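# Imports assume the Keras API bundled with TensorFlow
from tensorflow.keras.layers import Input, Conv2D, MaxPooling2D, UpSampling2D
from tensorflow.keras.models import Model
input_dim = Input((1, 200, 4))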
x = Conv2D(64, (1,3), activation='relu', padding='same')(input_dim)
x = MaxPooling2D((1,2), padding='same')(x)
x = Conv2D(32, (1,3), activation='relu', padding='same')(x)
x = MaxPooling2D((1,2), padding='same')(x)
x = Conv2D(32, (1,3), activation='relu', padding='same')(x)
encoded = MaxPooling2D((1,2), padding='same')(x)
x = Conv2D(32, (1,3), activation='relu', padding='same')(encoded)
x = UpSampling2D((1,2))(x)
x = Conv2D(32, (1,3), activation='relu', padding='same')(x)
x = UpSampling2D((1,2))(x)
x = Conv2D(64, (1,3), activation='relu', padding='same')(x)  # 'same' keeps the width at 100 so the output is 200 wide, like the input
x = UpSampling2D((1,2))(x)
decoded = Conv2D(4, (1,3), activation='sigmoid', padding='same')(x)
autoencoder = Model(input_dim, decoded)
autoencoder.compile(optimizer='adam', loss='mae', metrics=['mean_squared_error'])
But when I fit the model with sigmoid as the decoder's last activation, as above, the loss decreases only slightly and then stays unchanged in later epochs, and so does the mean_squared_error (using the default Adam optimizer):
autoencoder.fit(train, train, epochs=100, batch_size=256, shuffle=True,
validation_data=(test, test), callbacks=callbacks_list)
Epoch 1/100
97/98 [============================>.] - ETA: 0s - loss: 12.3690 - mean_squared_error: 2090.8232
Epoch 00001: loss improved from inf to 12.36328, saving model to weights.best.hdf5
98/98 [==============================] - 6s 65ms/step - loss: 12.3633 - mean_squared_error: 2089.3044 - val_loss: 12.1375 - val_mean_squared_error: 2029.4445
Epoch 2/100
97/98 [============================>.] - ETA: 0s - loss: 12.3444 - mean_squared_error: 2089.8032
Epoch 00002: loss improved from 12.36328 to 12.34172, saving model to weights.best.hdf5
98/98 [==============================] - 6s 64ms/step - loss: 12.3417 - mean_squared_error: 2089.1536 - val_loss: 12.1354 - val_mean_squared_error: 2029.4530
Epoch 3/100
97/98 [============================>.] - ETA: 0s - loss: 12.3461 - mean_squared_error: 2090.5886
Epoch 00003: loss improved from 12.34172 to 12.34068, saving model to weights.best.hdf5
98/98 [==============================] - 6s 63ms/step - loss: 12.3407 - mean_squared_error: 2089.1526 - val_loss: 12.1351 - val_mean_squared_error: 2029.4374
Epoch 4/100
97/98 [============================>.] - ETA: 0s - loss: 12.3320 - mean_squared_error: 2087.0349
Epoch 00004: loss improved from 12.34068 to 12.34050, saving model to weights.best.hdf5
98/98 [==============================] - 6s 63ms/step - loss: 12.3405 - mean_squared_error: 2089.1489 - val_loss: 12.1350 - val_mean_squared_error: 2029.4448
But both loss and mean_squared_error decrease quickly when I change the decoder's last activation to relu:
Epoch 1/100
97/98 [============================>.] - ETA: 0s - loss: 9.8283 - mean_squared_error: 1267.3282
Epoch 00001: loss improved from inf to 9.82359, saving model to weights.best.hdf5
98/98 [==============================] - 6s 64ms/step - loss: 9.8236 - mean_squared_error: 1266.0548 - val_loss: 8.4972 - val_mean_squared_error: 971.0208
Epoch 2/100
97/98 [============================>.] - ETA: 0s - loss: 8.1906 - mean_squared_error: 910.6423
Epoch 00002: loss improved from 9.82359 to 8.19058, saving model to weights.best.hdf5
98/98 [==============================] - 6s 62ms/step - loss: 8.1906 - mean_squared_error: 910.5417 - val_loss: 7.6558 - val_mean_squared_error: 811.6011
Epoch 3/100
97/98 [============================>.] - ETA: 0s - loss: 7.3522 - mean_squared_error: 736.2031
Epoch 00003: loss improved from 8.19058 to 7.35255, saving model to weights.best.hdf5
98/98 [==============================] - 6s 61ms/step - loss: 7.3525 - mean_squared_error: 736.2403 - val_loss: 6.8044 - val_mean_squared_error: 650.5342
Epoch 4/100
97/98 [============================>.] - ETA: 0s - loss: 6.6166 - mean_squared_error: 621.1281
Epoch 00004: loss improved from 7.35255 to 6.61435, saving model to weights.best.hdf5
98/98 [==============================] - 6s 61ms/step - loss: 6.6143 - mean_squared_error: 620.6105 - val_loss: 6.2180 - val_mean_squared_error: 572.2390
As a novice in deep learning, I want to verify whether it is valid to use relu as the activation in every layer of the network, including the last one.
You can use relu as the activation function in the final layer.
You can see this in the autoencoder example on the official TensorFlow site here.
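Use the sigmoid or softmax activation in the final output layer when you are solving a classification problem, where your labels are class values. Sigmoid limits every output to the range (0, 1), so it can only reconstruct inputs that are scaled to that range; your mean_squared_error of about 2000 suggests that your data is not, which would explain why the loss barely moves with sigmoid.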
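To make the difference concrete, here is a minimal sketch in plain Python (no Keras needed) of why a bounded output activation puts a floor under the reconstruction error. The target values are made up for illustration, and the helper names `best_mae` and `min_max_scale` are hypothetical, not part of any library. It assumes the unscaled data is non-negative, as relu output would require.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic function; its output never leaves the range [0, 1]."""
    return 1.0 / (1.0 + math.exp(-z))


def best_mae(targets, lo, hi):
    """Lowest mean absolute error reachable when every prediction
    is confined to the interval [lo, hi]."""
    clipped = [min(max(t, lo), hi) for t in targets]
    return sum(abs(t - c) for t, c in zip(targets, clipped)) / len(targets)


def min_max_scale(values):
    """Rescale values linearly so the smallest maps to 0 and the largest to 1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical unscaled inputs (assumption: the real data is not in [0, 1]).
targets = [0.0, 3.0, 12.0, 50.0]

# Even a very large pre-activation cannot push sigmoid above 1.
largest_sigmoid_output = sigmoid(50.0)

# Sigmoid output is confined to [0, 1]: the error cannot drop below this floor.
sigmoid_floor = best_mae(targets, 0.0, 1.0)

# relu output is confined to [0, inf): non-negative targets are fully reachable.
relu_floor = best_mae(targets, 0.0, math.inf)

# After min-max scaling, the same data fits inside sigmoid's range.
scaled_floor = best_mae(min_max_scale(targets), 0.0, 1.0)

print(sigmoid_floor, relu_floor, scaled_floor)
```

So there are two reasonable options: keep relu (or a linear activation) on the last layer for unscaled data, or scale the inputs to [0, 1] first if you want to keep sigmoid.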