I am using an autoencoder for dimensionality reduction, so that the learned representation can serve as low-dimensional features for further analysis.
The code snippet:
# Implementation based on Keras
from keras.layers import Input, Dense
from keras.models import Model

encoding_dim = 32
# Define input layer
X_input = Input(shape=(X_train.shape[1],))
# Define encoder
encoded = Dense(encoding_dim, activation='relu')(X_input)
# Define decoder
decoded = Dense(X_train.shape[1], activation='sigmoid')(encoded)
# Create the autoencoder model
AE_model = Model(X_input, decoded)
# Compile the autoencoder model
AE_model.compile(optimizer='adam', loss='mse')
# Extract the learned representation (encoder only)
learned_feature = Model(X_input, encoded)
# Train the autoencoder to reconstruct its input
history = AE_model.fit(X_train, X_train, epochs=10, batch_size=32)
I was looking for a way to measure the quality of the learned representation. One approach I found is to measure the reconstruction error, which I compute with the following code:
import math
reconstr_error = AE_model.evaluate(X_train, X_train, verbose=0)
print('The reconstruction error: %.2f MSE (%.2f RMSE)' % (reconstr_error, math.sqrt(reconstr_error)))
I got 0.00 MSE (0.05 RMSE) as the result. However, I am not sure whether the code above correctly measures the reconstruction error. Also, if there is an alternative way to do so, could you please let me know?
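For comparison, the reconstruction error can also be computed by hand from the model's predictions, which additionally gives a per-sample error. A minimal numpy-only sketch: here a noisy copy of `X` stands in for the autoencoder output, whereas with the question's code you would use `X_recon = AE_model.predict(X_train)`.

```python
import numpy as np

# Minimal sketch of computing reconstruction error per sample.
# X_recon would come from AE_model.predict(X_train) in the question's code;
# here a noisy copy of X stands in for the autoencoder output.
rng = np.random.default_rng(0)
X = rng.random((100, 8))
X_recon = X + rng.normal(0.0, 0.05, size=X.shape)  # stand-in for AE output

per_sample_mse = np.mean((X - X_recon) ** 2, axis=1)  # error for each example
mse = per_sample_mse.mean()
rmse = np.sqrt(mse)
print('MSE: %.4f, RMSE: %.4f' % (mse, rmse))
```

The per-sample errors are useful on top of the aggregate number: examples the autoencoder reconstructs badly are candidates for inspection.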
For what purpose are you doing the compression? If your project includes a downstream classifier, you can train that model on the original data (not passed through the AE) and record its accuracy, or whatever metric you are measuring. Then train the same model on the data compressed by the AE. If you get comparably good results, it means the autoencoder is extracting something useful. This is especially informative if you hold out part of your data from AE training and see how compressing examples the AE never saw affects the accuracy.
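The comparison above can be sketched with scikit-learn. This is a hedged illustration, not the question's exact setup: PCA stands in for the trained encoder so the snippet is self-contained; with the question's model you would use `learned_feature.predict(X)` to obtain the compressed features instead.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic data standing in for the real dataset
X, y = make_classification(n_samples=500, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# 1) Classifier trained on the raw features
clf_raw = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
acc_raw = clf_raw.score(X_te, y_te)

# 2) Same classifier trained on compressed features
#    (PCA is a stand-in; use learned_feature.predict(...) with the AE)
compressor = PCA(n_components=8).fit(X_tr)
clf_comp = LogisticRegression(max_iter=1000).fit(compressor.transform(X_tr), y_tr)
acc_comp = clf_comp.score(compressor.transform(X_te), y_te)

print('raw accuracy: %.3f, compressed accuracy: %.3f' % (acc_raw, acc_comp))
```

If the two accuracies are close, the compressed representation retains most of the information the classifier needs.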
In other techniques like PCA, the principal components are eigenvectors, and the corresponding eigenvalues are actually quite meaningful: they tell you how much the data varies in each direction, like a variance. In an AE, especially a deep one, such an analysis is not intuitive, or at least beyond my knowledge if it exists. But with a one-layer AE you may still be able to do something similar; in fact, a one-layer linear AE with MSE as the objective is very close to PCA. You can extract the hidden-layer weights, apply PCA (or an eigendecomposition of the data covariance matrix), and then compute something like the cosine distance between the hidden-layer weights and the eigenvectors to see whether the AE preserves something meaningful.
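The weight-versus-eigenvector comparison could look like the following sketch. The weight matrix `W` here is a random stand-in so the snippet runs on its own; with the question's model you would take the actual encoder weights, e.g. `W = AE_model.layers[1].get_weights()[0]`.

```python
import numpy as np

# Sketch: cosine similarity between the hidden-layer weight vectors of a
# one-layer AE and the top eigenvectors of the data covariance matrix.
rng = np.random.default_rng(0)
X = rng.random((200, 10))
W = rng.normal(size=(10, 4))  # stand-in for encoder weights (features x units)

# Eigendecomposition of the covariance matrix, sorted by decreasing eigenvalue
cov = np.cov(X, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)
order = np.argsort(eigvals)[::-1]
top_eigvecs = eigvecs[:, order[:W.shape[1]]]

# Cosine similarity between each hidden unit's weights and each eigenvector
W_n = W / np.linalg.norm(W, axis=0)
E_n = top_eigvecs / np.linalg.norm(top_eigvecs, axis=0)
cos_sim = np.abs(W_n.T @ E_n)  # absolute value: eigenvector sign is arbitrary
print(np.round(cos_sim, 2))
```

Entries near 1 in `cos_sim` would indicate a hidden unit whose weights align with a principal direction of the data.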
I don't know if anything more can be done, but you may be able to find papers that address such questions if this is important for you.