I am doing classification on the MNIST dataset using Keras. I am interested in performing some operations on the weight matrices produced after training, but the weight matrices of some layers look as if those layers are not fully connected.
from keras.models import Sequential
from keras.layers import Dense

model = Sequential()
model.add(Dense(1000, input_shape=(train_x.shape[1],), activation='relu'))
model.add(Dense(1000, activation='relu'))
model.add(Dense(500, activation='relu'))
model.add(Dense(200, activation='relu'))
model.add(Dense(10, activation='softmax'))
model.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=['accuracy'])
model.fit(train_x, train_y, epochs=10, validation_data=(test_x, test_y))

w = model.get_weights()
for i in range(5):
    print(w[i].shape)
Now, when I print the dimensions of the weight arrays, I get the following result:
(784, 1000)
(1000,)
(1000, 1000)
(1000,)
(1000, 500)
Why does the 2nd one have shape (1000,) and not (1000, 1000)?
Because it is the bias. Don't forget that a Dense layer is defined by output = activation(dot(input, kernel) + bias) (sometimes also written as y = Wx + b). So model.get_weights() returns a kernel matrix and a bias vector for every Dense layer, alternating kernel, bias, kernel, bias, ...
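If you want the kernels and biases separated per layer rather than interleaved in one flat list, here is a minimal sketch (assuming the model from the question has already been trained):

# Each Dense layer with the default use_bias=True holds a kernel and a bias;
# layer.get_weights() returns them as [kernel, bias].
for layer in model.layers:
    kernel, bias = layer.get_weights()
    print(layer.name, kernel.shape, bias.shape)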
Suppose the shape of x is (None, 784) and the shape of the weights w is (784, 1000). The matmul(x, w) operation results in a tensor of shape (None, 1000). To this resulting tensor you add a bias of shape (1000,), which is broadcast along the None dimension.
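You can see the same broadcasting with plain NumPy; this is just a sketch with random arrays, using an arbitrary batch size of 32 in place of the None dimension:

import numpy as np

x = np.random.rand(32, 784)    # a batch of 32 flattened MNIST images
w = np.random.rand(784, 1000)  # kernel of the first Dense layer
b = np.random.rand(1000)       # bias of the first Dense layer

out = x @ w + b                # (32, 1000) + (1000,): bias is broadcast over the batch
print(out.shape)               # (32, 1000)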