
Keras model summary incorrect


I am doing data augmentation using:

from tensorflow.keras.preprocessing import image

data_gen = image.ImageDataGenerator(rotation_range=20, width_shift_range=0.2,
                                    height_shift_range=0.2, zoom_range=0.15,
                                    horizontal_flip=False)

train_iter = data_gen.flow(X_train, Y_train, batch_size=64)  # yields augmented batches of 64
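
For context, X_train here is MNIST-style image data with an explicit channel axis. A sketch of how it might be prepared (assuming the standard MNIST dataset from tensorflow.keras; the actual data source is not shown in the question):

import numpy as np
from tensorflow.keras.datasets import mnist
from tensorflow.keras.utils import to_categorical

(X_train, Y_train), _ = mnist.load_data()                      # (60000, 28, 28)
X_train = X_train[..., np.newaxis].astype("float32") / 255.0   # (60000, 28, 28, 1)
Y_train = to_categorical(Y_train, 10)                          # one-hot labels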

data_gen.flow() needs a rank-4 data array, so the shape of X_train is (60000, 28, 28, 1). We need to declare the corresponding per-sample shape, (28, 28, 1), when defining the architecture of the model, as follows:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Flatten

model = Sequential()
model.add(Dense(units=64, activation='relu', kernel_initializer='he_normal',
                input_shape=(28, 28, 1)))
model.add(Flatten())
model.add(Dense(units=10, activation='relu', kernel_initializer='he_normal'))
model.summary()

model.add(Flatten()) was added to bring the activations back down to rank 2 before the final Dense layer. Now the problem is with model.summary(): it gives incorrect output, as shown below:

Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense_1 (Dense)              (None, 28, 28, 64)        128       
_________________________________________________________________
flatten_1 (Flatten)          (None, 50176)             0         
_________________________________________________________________
dense_2 (Dense)              (None, 10)                501770    
=================================================================
Total params: 501,898
Trainable params: 501,898
Non-trainable params: 0

The Output Shape for dense_1 (Dense) should be (None, 64), and Param # should be (28*28*64)+64, i.e. 50,240. The Output Shape for dense_2 (Dense) is correct, but the Param # should be (64*10)+10, i.e. 650.
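
The expected counts follow from the usual weights-plus-biases arithmetic:

# Expected parameter counts (weights + biases)
dense_1 = 28 * 28 * 1 * 64 + 64   # 50240
dense_2 = 64 * 10 + 10            # 650
print(dense_1, dense_2)           # 50240 650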

Why is this happening and how can this problem be addressed?


Solution

  • The summary is not incorrect. The Keras Dense layer always operates on the last dimension of its input.

    ref: https://www.tensorflow.org/api_docs/python/tf/keras/layers/Dense

    Input shape:

    N-D tensor with shape: (batch_size, ..., input_dim). The most common
    situation would be a 2D input with shape (batch_size, input_dim).

    Output shape:

    N-D tensor with shape: (batch_size, ..., units). For instance, for a 2D
    input with shape (batch_size, input_dim), the output would have shape
    (batch_size, units).
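
    A minimal sketch (assuming TensorFlow 2.x) makes this concrete: applied to a
    rank-4 input, Dense(64) maps only the last axis, from 1 to 64:

    import numpy as np
    import tensorflow as tf

    x = np.zeros((2, 28, 28, 1), dtype="float32")
    layer = tf.keras.layers.Dense(units=64)
    print(layer(x).shape)  # (2, 28, 28, 64) -- only the last dimension changes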

    Before each Dense layer that should see the whole input, you need to apply Flatten() to make sure you're passing 2-D data.
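
    Applied to this question, that means putting Flatten() first; a minimal
    sketch (keeping the question's activations, though softmax would be more
    usual for a 10-class output):

    from tensorflow.keras.models import Sequential
    from tensorflow.keras.layers import Dense, Flatten

    model = Sequential()
    model.add(Flatten(input_shape=(28, 28, 1)))   # (None, 784)
    model.add(Dense(units=64, activation='relu', kernel_initializer='he_normal'))
    model.add(Dense(units=10, activation='relu', kernel_initializer='he_normal'))
    model.summary()   # dense: 50,240 params; dense_1: 650 params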

    Another work-around, which keeps a Dense layer in the first position while still giving your desired output shapes, is:

    import tensorflow as tf
    from tensorflow.keras.models import Sequential
    from tensorflow.keras.layers import Dense, Flatten

    model = Sequential()
    # Frozen single-unit layer: ones kernel, no bias -> passes the input through unchanged
    model.add(Dense(units=1, activation='linear', use_bias=False, trainable=False,
                    kernel_initializer=tf.keras.initializers.Ones(), input_shape=(28, 28, 1)))
    model.add(Flatten())
    model.add(Dense(units=64, activation='relu'))
    model.add(Dense(units=10, activation='relu', kernel_initializer='he_normal'))
    model.summary()
    

    The first layer is a single-unit Dense layer, initialized with ones and with no bias, so it just multiplies the input by one and passes it on to the next layer to be flattened. This removes the unnecessary parameters that the original model accumulated by applying Dense(64) at every spatial position:

    Model: "sequential"
    _________________________________________________________________
    Layer (type)                 Output Shape              Param #   
    =================================================================
    dense (Dense)                (None, 28, 28, 1)         2         
    _________________________________________________________________
    flatten (Flatten)            (None, 784)               0         
    _________________________________________________________________
    dense_1 (Dense)              (None, 64)                50240     
    _________________________________________________________________
    dense_2 (Dense)              (None, 10)                650       
    =================================================================
    Total params: 50,892
    Trainable params: 50,892
    Non-trainable params: 0
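
    As a quick sanity check (a sketch, assuming TensorFlow 2.x), you can verify
    that the ones-initialized, bias-free layer really is an identity map:

    import numpy as np
    import tensorflow as tf

    passthrough = tf.keras.layers.Dense(
        units=1, use_bias=False,
        kernel_initializer=tf.keras.initializers.Ones())
    x = np.random.rand(2, 28, 28, 1).astype("float32")
    print(np.allclose(passthrough(x), x))  # True: every value is multiplied by 1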