machine-learning · keras · neural-network

Explanation of layers in a machine learning model


I have printed out the number of layers in my model. The code works fine; I just want to understand what the output actually means. The output is as follows:

    Number of layers: 30
    Layer types:
    input_1 - InputLayer
    conv2d - Conv2D
    batch_normalization - BatchNormalization
    activation - Activation
    conv2d_1 - Conv2D
    batch_normalization_1 - BatchNormalization
    activation_1 - Activation
    conv2d_2 - Conv2D
    batch_normalization_2 - BatchNormalization
    add - Add
    activation_2 - Activation
    conv2d_3 - Conv2D
    batch_normalization_3 - BatchNormalization
    activation_3 - Activation
    conv2d_4 - Conv2D
    conv2d_5 - Conv2D
    batch_normalization_4 - BatchNormalization
    add_1 - Add
    activation_4 - Activation
    conv2d_6 - Conv2D
    batch_normalization_5 - BatchNormalization
    activation_5 - Activation
    conv2d_7 - Conv2D
    conv2d_8 - Conv2D
    batch_normalization_6 - BatchNormalization
    add_2 - Add
    activation_6 - Activation
    average_pooling2d - AveragePooling2D
    flatten - Flatten
    dense - Dense

Does this mean my model has six layers, each containing five hidden layers? Are weight matrices constructed only for the six layers as a whole, or would weight matrices be constructed for all 30 layers when I run the model? How should I interpret the above output?

I have used `model.summary()` from Keras - here is my output (truncated):

    __________________________________________________________________________________________________
    Layer (type)                    Output Shape         Param #     Connected to
    ==================================================================================================
    input_1 (InputLayer)            [(None, 32, 32, 3)]  0
    __________________________________________________________________________________________________
    conv2d (Conv2D)                 (None, 32, 32, 16)   448         input_1[0][0]
    __________________________________________________________________________________________________
    batch_normalization (BatchNorma (None, 32, 32, 16)   64          conv2d[0][0]
    __________________________________________________________________________________________________
    activation (Activation)         (None, 32, 32, 16)   0           batch_normalization[0][0]
    __________________________________________________________________________________________________
    conv2d_1 (Conv2D)               (None, 32, 32, 16)   2320        activation[0][0]
    __________________________________________________________________________________________________
    batch_normalization_1 (BatchNor (None, 32, 32, 16)   64          conv2d_1[0][0]
    __________________________________________________________________________________________________
    activation_1 (Activation)       (None, 32, 32, 16)   0           batch_normalization_1[0][0]
    __________________________________________________________________________________________________
    conv2d_2 (Conv2D)               (None, 32, 32, 16)   2320        activation_1[0][0]
    __________________________________________________________________________________________________
    batch_normalization_2 (BatchNor (None, 32, 32, 16)   64          conv2d_2[0][0]
    __________________________________________________________________________________________________
    add (Add)                       (None, 32, 32, 16)   0           activation[0][0]
                                                                     batch_normalization_2[0][0]
    __________________________________________________________________________________________________
    activation_2 (Activation)       (None, 32, 32, 16)   0           add[0][0]
    __________________________________________________________________________________________________
    conv2d_3 (Conv2D)               (None, 16, 16, 32)   4640        activation_2[0][0]
    __________________________________________________________________________________________________
    batch_normalization_3 (BatchNor (None, 16, 16, 32)   128         conv2d_3[0][0]
    __________________________________________________________________________________________________
    activation_3 (Activation)       (None, 16, 16, 32)   0           batch_normalization_3[0][0]
    __________________________________________________________________________________________________
    conv2d_4 (Conv2D)               (None, 16, 16, 32)   9248        activation_3[0][0]
    __________________________________________________________________________________________________
    conv2d_5 (Conv2D)               (None, 16, 16, 32)   544         activation_2[0][0]
    __________________________________________________________________________________________________
    batch_normalization_4 (BatchNor (None, 16, 16, 32)   128         conv2d_4[0][0]
    __________________________________________________________________________________________________
    add_1 (Add)                     (None, 16, 16, 32)   0           conv2d_5[0][0]
                                                                     batch_normalization_4[0][0]
    __________________________________________________________________________________________________
    activation_4 (Activation)       (None, 16, 16, 32)   0           add_1[0][0]
    __________________________________________________________________________________________________
    conv2d_6 (Conv2D)               (None, 8, 8, 64)     18496       activation_4[0][0]
    __________________________________________________________________________________________________

Solution

  • In tensorflow.keras the definition of a layer isn't quite what you might expect from the literature: a layer is anything that subclasses tensorflow.keras.layers.Layer.

    When constructing your model you start with an InputLayer, which has no parameters (weights). Then comes a Conv2D layer, which does have parameters: a kernel plus one bias per filter. A BatchNormalization layer also has parameters - a trainable scale and offset per channel, plus a non-trainable moving mean and variance - which is why your summary shows 64 parameters for a 16-channel BatchNormalization. The Activation layer just applies a function element-wise and has no parameters at all.

    In the model you have constructed, only the Conv2D, BatchNormalization, and Dense layers carry weights; the rest (Activation, Add, AveragePooling2D, Flatten) just perform operations with nothing to train. Keras creates weights per layer, so when you run the model each of the 30 layers gets its own weight set (possibly empty) - the layers are not grouped into six blocks as far as weights are concerned; the Add layers merely mark where the residual branches are summed.

    You can use model.summary() to print the layers, which shows you which layers contribute to the parameter count as well as the network connectivity - it's a very useful way to debug your model.
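As a sanity check, the parameter counts in the summary above can be reproduced by hand. A sketch of the arithmetic, assuming 3x3 kernels for the main convolutions and a 1x1 kernel for the shortcut convolution (which is what these counts imply):

```python
# Conv2D params = kernel_h * kernel_w * in_channels * filters + filters (one bias per filter)
conv2d = 3 * 3 * 3 * 16 + 16       # RGB input, 16 filters   -> 448
conv2d_1 = 3 * 3 * 16 * 16 + 16    # 16 -> 16 channels       -> 2320
conv2d_5 = 1 * 1 * 16 * 32 + 32    # 1x1 shortcut conv       -> 544

# BatchNormalization params = 4 per channel:
# trainable gamma and beta, non-trainable moving mean and variance
batch_normalization = 4 * 16       # 16 channels             -> 64

print(conv2d, conv2d_1, conv2d_5, batch_normalization)  # 448 2320 544 64
```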
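Beyond model.summary(), you can iterate over model.layers yourself to see exactly which layers hold weights. A minimal sketch (assumes tensorflow is installed; the tiny model here is illustrative, not your actual network):

```python
import tensorflow as tf
from tensorflow.keras import layers

# A small model in the same style as the one in the question.
inputs = tf.keras.Input(shape=(32, 32, 3))
x = layers.Conv2D(16, 3, padding="same")(inputs)
x = layers.BatchNormalization()(x)
x = layers.Activation("relu")(x)
x = layers.Flatten()(x)
outputs = layers.Dense(10)(x)
model = tf.keras.Model(inputs, outputs)

# Every entry in model.layers is a Layer; only some hold weights.
for layer in model.layers:
    n_params = sum(int(tf.size(w)) for w in layer.weights)
    print(f"{layer.name:25s} {type(layer).__name__:20s} params={n_params}")
```

Running this shows non-zero parameter counts only for the Conv2D, BatchNormalization, and Dense layers, matching the Param # column of model.summary().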