python machine-learning keras deep-learning keras-layer

Keras Dense layer Output Shape

I am unable to understand the logic behind the output shape of the first hidden layer. I have taken some arbitrary examples, as follows:

Example 1:

from keras.models import Sequential
from keras.layers import Dense
model = Sequential()
model.add(Dense(units=4,activation='linear',input_shape=(784,)))
model.add(Dense(units=10,activation='softmax'))
model.summary()

Model: "sequential_4"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense_7 (Dense)              (None, 4)                 3140      
_________________________________________________________________
dense_8 (Dense)              (None, 10)                50        
=================================================================
Total params: 3,190
Trainable params: 3,190
Non-trainable params: 0

Example 2:

model = Sequential()
model.add(Dense(units=4,activation='linear',input_shape=(784,1)))
model.add(Dense(units=10,activation='softmax'))
model.summary()

Model: "sequential_6"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense_11 (Dense)             (None, 784, 4)            8         
_________________________________________________________________
dense_12 (Dense)             (None, 784, 10)           50        
=================================================================
Total params: 58
Trainable params: 58
Non-trainable params: 0

Example 3:

model = Sequential()
model.add(Dense(units=4,activation='linear',input_shape=(32,28)))
model.add(Dense(units=10,activation='softmax'))
model.summary()

Model: "sequential_8"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense_15 (Dense)             (None, 32, 4)             116       
_________________________________________________________________
dense_16 (Dense)             (None, 32, 10)            50        
=================================================================
Total params: 166
Trainable params: 166
Non-trainable params: 0

Example 4:

model = Sequential()
model.add(Dense(units=4,activation='linear',input_shape=(32,28,1)))
model.add(Dense(units=10,activation='softmax'))
model.summary()

Model: "sequential_9"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense_17 (Dense)             (None, 32, 28, 4)         8         
_________________________________________________________________
dense_18 (Dense)             (None, 32, 28, 10)        50        
=================================================================
Total params: 58
Trainable params: 58
Non-trainable params: 0

Please help me understand the logic.

Also, I think the rank of input_shape=(784,) and input_shape=(784,1) is the same, so why is their Output Shape different?

Solution

According to the official Keras documentation, when you give a Dense layer input_shape=(input_units,), the model takes input arrays of shape (*, input_units) and outputs arrays of shape (*, output_units). In your case, input_shape=(784,) is treated as an input of shape (*, 784), and the output has shape (*, 4).

In general, for an input of shape (batch_size, ..., input_dim), the model gives an output of shape (batch_size, ..., units). The Dense layer acts only on the last axis; every axis before it is passed through unchanged.
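The rule above can be sketched in plain Python, without Keras. The helper name dense_output_shape is made up for this illustration and is not part of the Keras API; it only mimics how the Output Shape column is derived, assuming input_shape excludes the batch axis as it does in the question.

```python
def dense_output_shape(input_shape, units):
    """Shape produced by a Dense layer, with None standing for the batch axis.

    Hypothetical helper for illustration only (not a Keras function).
    input_shape excludes the batch axis, like the input_shape argument.
    Only the last axis is replaced by `units`; leading axes pass through.
    """
    if len(input_shape) == 0:
        raise ValueError("input_shape needs at least one axis")
    return (None, *input_shape[:-1], units)


# The four input shapes from the question, each followed by
# Dense(units=4) and then Dense(units=10).
for input_shape in [(784,), (784, 1), (32, 28), (32, 28, 1)]:
    hidden = dense_output_shape(input_shape, units=4)
    # Drop the batch axis before feeding the shape to the next layer.
    output = dense_output_shape(hidden[1:], units=10)
    print(input_shape, "->", hidden, "->", output)
```

The printed shapes should line up with the Output Shape columns of the four summaries in the question.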

So when you give input_shape=(784,), the model takes input arrays of shape (*, 784), where * is the batch size and 784 is input_dim, and gives an output of shape (*, 4).

When you give input_shape=(784,1), the model takes input of shape (*, 784, 1), where * is the batch size, 784 fills the "..." position, and 1 is input_dim, matching (batch_size, ..., input_dim). The output is (*, 784, 4), matching (batch_size, ..., units). This also answers the rank question: (784,) has one axis and (784,1) has two, so they are not the same rank, and the extra axis is carried through to the output.

The same goes for input_shape=(32,28), which becomes (*, 32, 28) and gives output (*, 32, 4), and for input_shape=(32,28,1), which becomes (*, 32, 28, 1), where again * is the batch size, 32,28 fills the "..." position, and 1 is input_dim, matching (batch_size, ..., input_dim). The output there is (*, 32, 28, 4).
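The Param # column follows from the same reading of the shapes. As a rough sketch, assuming the default use_bias=True, a Dense layer holds one kernel of shape (input_dim, units) plus one bias of shape (units,), and these weights are shared across all the "..." axes. The function name dense_param_count is hypothetical, used only for this illustration.

```python
def dense_param_count(input_shape, units, use_bias=True):
    """Number of trainable weights in a Dense layer.

    Hypothetical helper for illustration only (not a Keras function).
    Only the last axis of input_shape (input_dim) matters: the kernel has
    shape (input_dim, units) and the optional bias has shape (units,).
    """
    input_dim = input_shape[-1]
    kernel = input_dim * units
    bias = units if use_bias else 0
    return kernel + bias


# First layer, Dense(units=4), for each input_shape in the question.
for input_shape in [(784,), (784, 1), (32, 28), (32, 28, 1)]:
    print(input_shape, dense_param_count(input_shape, units=4))

# Second layer, Dense(units=10): its input always ends in 4 units,
# so the count is the same in all four examples.
print(dense_param_count((4,), units=10))
```

For example, input_shape=(784,) gives 784 * 4 + 4 = 3140, while input_shape=(784,1) gives only 1 * 4 + 4 = 8, because input_dim is 1 there.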

For what None means, see the question "What is the meaning of the "None" in model.summary of KERAS?"