I am unable to understand the logic behind the output shape of the first hidden layer. I have taken some arbitrary examples, as follows:
Example 1:
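from tensorflow.keras.models import Sequential  # assumed import path; not shown in the original
from tensorflow.keras.layers import Dense
model = Sequential()
model.add(Dense(units=4,activation='linear',input_shape=(784,)))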
model.add(Dense(units=10,activation='softmax'))
model.summary()
Model: "sequential_4"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense_7 (Dense) (None, 4) 3140
_________________________________________________________________
dense_8 (Dense) (None, 10) 50
=================================================================
Total params: 3,190
Trainable params: 3,190
Non-trainable params: 0
Example 2:
model = Sequential()
model.add(Dense(units=4,activation='linear',input_shape=(784,1)))
model.add(Dense(units=10,activation='softmax'))
model.summary()
Model: "sequential_6"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense_11 (Dense) (None, 784, 4) 8
_________________________________________________________________
dense_12 (Dense) (None, 784, 10) 50
=================================================================
Total params: 58
Trainable params: 58
Non-trainable params: 0
Example 3:
model = Sequential()
model.add(Dense(units=4,activation='linear',input_shape=(32,28)))
model.add(Dense(units=10,activation='softmax'))
model.summary()
Model: "sequential_8"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense_15 (Dense) (None, 32, 4) 116
_________________________________________________________________
dense_16 (Dense) (None, 32, 10) 50
=================================================================
Total params: 166
Trainable params: 166
Non-trainable params: 0
Example 4:
model = Sequential()
model.add(Dense(units=4,activation='linear',input_shape=(32,28,1)))
model.add(Dense(units=10,activation='softmax'))
model.summary()
Model: "sequential_9"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense_17 (Dense) (None, 32, 28, 4) 8
_________________________________________________________________
dense_18 (Dense) (None, 32, 28, 10) 50
=================================================================
Total params: 58
Trainable params: 58
Non-trainable params: 0
Please help me in understanding the logic.
Also, I think the rank of input_shape=(784,) and input_shape=(784,1) is the same, so why is their Output Shape different?
According to the official Keras documentation, when you give a Dense layer input_shape=(input_units,), the model takes input arrays of shape (*, input_units) and outputs arrays of shape (*, output_units). [In your case, input_shape=(784,) is treated as input shape (*, 784) and the output shape is (*, 4).]
In general, for an input of shape (batch_size, ..., input_dim), the model gives an output of shape (batch_size, ..., units). Only the last axis is transformed; every axis before it is carried through unchanged.
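The rule above can be checked against the four summaries in the question with a few lines of plain Python. This is only a sketch of the shape and parameter arithmetic, not Keras code: `dense_summary` is a hypothetical helper, and it assumes the default Dense layer with a bias term, so the parameter count is input_dim * units + units.

```python
def dense_summary(input_shape, units):
    """Return (output_shape, param_count) for a Dense layer.

    input_shape excludes the batch axis, like the input_shape argument in Keras.
    Hypothetical helper for illustration; assumes use_bias=True (the default).
    """
    input_dim = input_shape[-1]  # Dense only looks at the last axis
    # Leading axes pass through; the last axis becomes `units`.
    output_shape = (None,) + tuple(input_shape[:-1]) + (units,)
    # Kernel has input_dim * units weights, plus one bias per unit.
    params = input_dim * units + units
    return output_shape, params


for shape in [(784,), (784, 1), (32, 28), (32, 28, 1)]:
    first_shape, first_params = dense_summary(shape, units=4)
    # The second layer receives the first layer's output without the batch axis.
    second_shape, second_params = dense_summary(first_shape[1:], units=10)
    print(shape, first_shape, first_params, second_shape, second_params)
```

The second layer always has 50 parameters (4 * 10 + 10) because its input_dim is always 4, regardless of how many leading axes are carried along.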
So when you give input_shape=(784,), the model takes input arrays of shape (*, 784), where * is the batch size and 784 is the input_dim, giving an output shape of (*, 4).
When the input is (784,1), the model takes it as (*, 784, 1), where * is the batch size, 784 is the ... part, and 1 is the input_dim => (batch_size, ..., input_dim), and the output is (*, 784, 4) => (batch_size, ..., units).
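On the rank question: (784,) and (784,1) do not have the same rank. The first shape has one axis and the second has two, and once the batch axis is added they become rank 2 and rank 3. A minimal sketch in plain Python (the helper name `with_batch_axis` is made up for illustration):

```python
def with_batch_axis(input_shape):
    """Prepend the batch axis (shown as None in model.summary())."""
    return (None,) + tuple(input_shape)


a = with_batch_axis((784,))    # (None, 784)
b = with_batch_axis((784, 1))  # (None, 784, 1)

# Rank is the number of axes; input_dim is the size of the last axis.
print(len(a), a[-1])  # rank 2, input_dim 784
print(len(b), b[-1])  # rank 3, input_dim 1
```

So in the first case Dense sees 784 features per sample, while in the second it sees 784 positions with 1 feature each and applies the same 1-to-4 mapping at every position.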
The same goes for input_shape=(32,28) => (*,32,28), giving output (*,32,4), and for input_shape=(32,28,1) => (*,32,28,1), giving output (*,32,28,4), where again * is the batch_size, 32,28 is the ... part, and 1 is the input_dim => (batch_size, ..., input_dim).
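To make the "only the last axis changes" behaviour concrete, here is a small pure-Python sketch that applies one shared kernel along the innermost axis of a nested list. It is not how Keras implements Dense internally; `dense_last_axis` and `shape_of` are hypothetical helpers, and the kernel and bias values are arbitrary.

```python
def dense_last_axis(x, kernel, bias):
    """Compute x @ kernel + bias along the innermost axis of a nested list."""
    if not isinstance(x[0], list):
        # x is a single vector of length input_dim; map it to len(bias) units.
        return [
            sum(x[i] * kernel[i][j] for i in range(len(x))) + bias[j]
            for j in range(len(bias))
        ]
    # Otherwise recurse: every leading axis is preserved as-is.
    return [dense_last_axis(sub, kernel, bias) for sub in x]


def shape_of(x):
    """Return the shape of a regular nested list as a tuple."""
    dims = []
    while isinstance(x, list):
        dims.append(len(x))
        x = x[0]
    return tuple(dims)


# A batch of 2 inputs, each of shape (32, 28, 1), as in Example 4.
batch = [[[[1.0] for _ in range(28)] for _ in range(32)] for _ in range(2)]
kernel = [[1.0, 2.0, 3.0, 4.0]]  # shape (input_dim=1, units=4): 4 weights
bias = [0.0, 0.0, 0.0, 0.0]      # 4 biases, so 8 parameters in total

out = dense_last_axis(batch, kernel, bias)
print(shape_of(batch), "->", shape_of(out))
```

The same 8 parameters are reused at each of the 32 x 28 positions, which is why the parameter count in Example 4 stays at 8 even though the output has many more elements.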
For what None means, please see: What is the meaning of the "None" in model.summary of KERAS?