Tags: keras, deep-learning, conv-neural-network, feature-extraction, vgg-net

So many zero-valued features when doing feature extraction using VGG19


I was working on an image recognition problem. I fine-tuned VGG19, added a few layers, and trained the model. After reaching the best test accuracy, I saved the model, removed the last 6 layers, and extracted the activations of the last fully connected layer using the following code.

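from keras import backend as K
from keras.models import Model
from keras.preprocessing.image import ImageDataGenerator

ROWS, COLS = 669, 1026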
input_shape = (ROWS, COLS, 3)

# train_data_dir = '/home/spectrograms/train'
validation_data_dir = '/home/spectrograms/test'
nb_train_samples = 791
nb_validation_samples = 198
# epochs = 200
batch_size = 10

if K.image_data_format() == 'channels_first':
    input_shape = (3, ROWS, COLS)
else:
    input_shape = (ROWS, COLS,3)

test_datagen = ImageDataGenerator(rescale=1. / 255)
validation_generator = test_datagen.flow_from_directory(
    validation_data_dir,
    target_size=(ROWS, COLS),
    batch_size=batch_size,
    class_mode='binary')

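# `model` is the fine-tuned network loaded earlier; cut it at the last fully connected layer.
model = Model(inputs=model.inputs, outputs=model.layers[-6].output)

# predict_generator is deprecated in newer Keras releases, where model.predict accepts generators directly.
predict = model.predict_generator(validation_generator, steps=10)
print(predict[10])
print(predict[10].shape)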

But the output vector has a lot of zeros.

[5.77765644e-01 2.44531885e-01 0.00000000e+00 0.00000000e+00
 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00
 2.99371660e-01 8.27999532e-01 2.34099194e-01 2.67183155e-01
 0.00000000e+00 1.95847541e-01 4.49438214e-01 1.00336084e-02
 0.00000000e+00 0.00000000e+00 4.63756740e-01 0.00000000e+00
 0.00000000e+00 1.15372933e-01 0.00000000e+00 0.00000000e+00
 1.13927014e-01 6.74777776e-02 7.49553144e-01 0.00000000e+00
 6.73675537e-02 2.85279214e-01 0.00000000e+00 0.00000000e+00
 1.84553280e-01 4.57495511e-01 0.00000000e+00 0.00000000e+00
 5.35506964e-01 0.00000000e+00 0.00000000e+00 0.00000000e+00
 2.92950690e-01 0.00000000e+00 5.27026653e-01 0.00000000e+00
 0.00000000e+00 0.00000000e+00 3.94881278e-01 0.00000000e+00
 5.37508354e-02 6.67039156e-02 1.16688050e-01 6.52413011e-01
 3.44565332e-01 0.00000000e+00 0.00000000e+00 0.00000000e+00
 0.00000000e+00 0.00000000e+00 1.10359691e-01 3.63592118e-01
 9.89193693e-02 1.15959466e-01 0.00000000e+00 1.57176346e-01
 0.00000000e+00 0.00000000e+00 0.00000000e+00 2.90686011e-01
 0.00000000e+00 6.03572190e-01 1.97682872e-01 1.57113865e-01
 0.00000000e+00 2.84446061e-01 1.26254544e-01 0.00000000e+00
 0.00000000e+00 5.51187336e-01 0.00000000e+00 0.00000000e+00
 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00
 1.11384936e-01 1.67153805e-01 2.63090044e-01 0.00000000e+00
 9.35753658e-02 9.16089058e-01 1.90610379e-01 0.00000000e+00
 0.00000000e+00 0.00000000e+00 3.04680824e-01 2.47930676e-01
 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00
 1.50975913e-01 3.60320956e-02 0.00000000e+00 3.47187579e-01
 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00
 3.01374853e-01 0.00000000e+00 2.38310188e-01 0.00000000e+00
 0.00000000e+00 0.00000000e+00 3.16582739e-01 0.00000000e+00
 0.00000000e+00 0.00000000e+00 0.00000000e+00 8.17666354e-04
 2.30050087e-01 4.66496646e-01 0.00000000e+00 0.00000000e+00
 1.05043598e-01 0.00000000e+00 6.77903090e-03 3.72976154e-01]

Is it normal? Or am I doing something wrong?


Solution

  • It seems okay to me. The non-zero values in your output vector are the activated neurons. VGGNet uses ReLU as the activation in almost every layer (except the output layer), so each output of such a layer is either 0 or a positive value.

    If you recall, the ReLU function looks something like this:

    def relu(z):
      return max(0, z)
    

    As you can see, it clamps all negative values to zero and lets only positive values pass.

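    Hence, it's natural for many of those values to be zero: those neurons are simply inactive for this particular input. (Jargon: a ReLU neuron that outputs zero for every input is called a dead neuron; that is a separate issue and cannot be diagnosed from a single output vector.)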
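To check how sparse the extracted features are, and whether any unit is zero for every sample rather than just for one image, something like the following NumPy sketch may help. The helper names `zero_fraction` and `always_zero_units` and the small `pre_activations` array are made up for illustration; they are not part of Keras or of the question's code.

```python
import numpy as np

def relu(z):
    """Element-wise ReLU: negative entries become 0, positive entries pass through."""
    return np.maximum(z, 0)

def zero_fraction(features):
    """Fraction of entries in the feature matrix that are exactly zero."""
    return float(np.mean(features == 0))

def always_zero_units(features):
    """Indices of units (columns) that are zero for every sample (row)."""
    return np.flatnonzero(np.all(features == 0, axis=0))

# Hypothetical pre-activations: 3 samples (rows) x 4 units (columns).
pre_activations = np.array([
    [ 0.5, -1.2, -0.3,  2.0],
    [-0.7, -0.4,  1.1,  0.9],
    [ 0.2, -2.5, -0.8, -0.1],
])
features = relu(pre_activations)

print(zero_fraction(features))      # share of zero entries in the whole matrix
print(always_zero_units(features))  # units that never fire on these samples
```

Assuming `predict` from the question is a 2-D array of shape (samples, units), calling `zero_fraction(predict)` and `always_zero_units(predict)` would give the same statistics for the real features. A high zero fraction is expected after ReLU; only units that stay at zero across the whole dataset are worth a closer look.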