Tags: machine-learning, neural-network, deep-learning, keras, keras-layer

Keras VGG16 lower level features extraction


I am pulling lower-level features from the VGG16 model included as a Keras application. These features are exported as separate outputs of the pre-trained network and fed to an add-on classifier. The conceptual idea was borrowed from Multi-scale recognition with DAG-CNNs.

Using the model without the classifier top, features at the highest level are extracted from the block5 pooling layer using Flatten(): block_05 = Flatten(name='block_05')(block5_pool). This gives an output vector of dimension 8192. Flatten(), however, does not work on the lower pooling layers, as their dimensions get too large (memory issues). Instead, the lower pooling layers (or any other layer) can be extracted using GlobalAveragePooling2D(): block_04 = GlobalAveragePooling2D(name='block_04')(block4_pool). The problem with this approach, however, is that the dimension of the feature vector shrinks rapidly the lower you go: block_4 (512), block_3 (256), block_2 (128), block_1 (64).
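
For reference, a minimal sketch of how these two extraction styles can be wired onto the stock VGG16 application (the 128 x 128 input shape is an assumption, chosen because it makes block5_pool flatten to exactly the 8192 dimensions mentioned above: 4 x 4 x 512):

    from keras.applications.vgg16 import VGG16
    from keras.layers import Flatten, GlobalAveragePooling2D

    # Base network without the classifier top; input shape assumed here
    base = VGG16(weights='imagenet', include_top=False, input_shape=(128, 128, 3))

    # Highest level: flatten block5_pool (4 x 4 x 512 -> 8192)
    block_05 = Flatten(name='block_05')(base.get_layer('block5_pool').output)

    # Lower levels: global average pooling collapses each feature map to one
    # scalar, so the vector length equals the channel count (512 here)
    block_04 = GlobalAveragePooling2D(name='block_04')(base.get_layer('block4_pool').output)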

What would be a suitable layer or setup to retain more feature data from the lower layers?

For info, the output of the model looks like this; the add-on classifier has a corresponding number of inputs.

# Create model; outputs are in reverse order, from top to bottom
model = Model(inputs=img_input, outputs=[block_05,    # ch_00, layer 17, dim 8192
                                         block_04,    # ch_01, layer 13, dim 512
                                         block_03,    # ch_02, layer 9, dim 256
                                         block_02,    # ch_03, layer 5, dim 128
                                         block_01])   # ch_04, layer 2, dim 64

Solution

  • The memory error you mentioned comes from flattening a huge array, which makes the number of units extremely large. What you actually need to do is downsample your input in a smart way. I will present a couple of ways to do this:

    1. MaxPooling: by simply pooling first, you can downsample your feature maps and then Flatten them (see the first sketch after this list). The main advantage of this approach is its simplicity and the fact that it needs no additional parameters. The main disadvantage: it can be a really rough method.
    2. Intelligent downsampling: here you could add a Conv2D layer with a large stride (e.g. with filter size (4, 4) and stride (4, 4)), as in the second sketch after this list. This can be considered a form of learned pooling. The main disadvantage of this method is the additional parameters it requires.
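
A sketch of the first option, reusing the base model from the question's setup and assuming the same 128 x 128 input (block4_pool is then 8 x 8 x 512, so a (4, 4) pool followed by Flatten yields 2 x 2 x 512 = 2048 dimensions instead of the 512 from global average pooling; the pool size is an illustrative choice):

    from keras.layers import MaxPooling2D, Flatten

    # Parameter-free downsampling: pool first, then flatten
    x = MaxPooling2D(pool_size=(4, 4))(base.get_layer('block4_pool').output)
    block_04 = Flatten(name='block_04')(x)    # 2 x 2 x 512 -> 2048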
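
And a sketch of the second option; the filter count, kernel size, and stride are illustrative assumptions, and unlike max pooling this layer adds trainable weights:

    from keras.layers import Conv2D, Flatten

    # Learned downsampling: a strided convolution acts as trainable pooling
    x = Conv2D(512, kernel_size=(4, 4), strides=(4, 4))(base.get_layer('block4_pool').output)
    block_04 = Flatten(name='block_04')(x)    # 2 x 2 x 512 -> 2048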