Consider transfer learning, in order to use a pretrained model in Keras/TensorFlow. For each of the original layers, the trainable attribute is set to False so that its weights are not updated during training, while the last layer(s) have been replaced with new layers that must be trained. In particular, two fully connected hidden layers with 512 and 1024 neurons and a ReLU activation function have been added. After these layers, a Dropout layer is used with rate 0.2. This means that at each training step, 20% of the neurons are randomly discarded.
Which layers does this Dropout layer affect? Does it affect the whole network, including the pretrained layers for which layer.trainable = False has been set, or does it affect only the newly added layers? Or does it affect only the previous layer (i.e., the one with 1024 neurons)?
In other words, which layer(s) do the neurons that are turned off at each training step by the dropout belong to?
from tensorflow.keras import layers
from tensorflow.keras import Model
from tensorflow.keras.applications.inception_v3 import InceptionV3
from tensorflow.keras.optimizers import RMSprop

local_weights_file = 'weights.h5'

# Build InceptionV3 without its classification head and load local weights
pre_trained_model = InceptionV3(input_shape=(150, 150, 3),
                                include_top=False,
                                weights=None)
pre_trained_model.load_weights(local_weights_file)

# Freeze every pretrained layer so its weights are not updated during training
for layer in pre_trained_model.layers:
    layer.trainable = False

# pre_trained_model.summary()

last_layer = pre_trained_model.get_layer('mixed7')
last_output = last_layer.output

# Flatten the output layer to 1 dimension
x = layers.Flatten()(last_output)
# Add two fully connected layers with 512 and 1,024 hidden units and ReLU activation
x = layers.Dense(512, activation='relu')(x)
x = layers.Dense(1024, activation='relu')(x)
# Add a dropout rate of 0.2
x = layers.Dropout(0.2)(x)
# Add a final sigmoid layer for classification
x = layers.Dense(1, activation='sigmoid')(x)

model = Model(pre_trained_model.input, x)
model.compile(optimizer=RMSprop(learning_rate=0.0001),
              loss='binary_crossentropy',
              metrics=['accuracy'])
The Dropout layer affects only the output of the layer immediately before it.
If we look at the relevant part of your code:
x = layers.Dense(1024, activation='relu')(x)
# Add a dropout rate of 0.2
x = layers.Dropout(0.2)(x)
# Add a final sigmoid layer for classification
x = layers.Dense(1, activation='sigmoid')(x)
In your case, 20% of the activations produced by x = layers.Dense(1024, activation='relu')(x) are zeroed out at random at each training step, before being passed to the final Dense layer (and, since Keras uses inverted dropout, the surviving activations are scaled up by 1/(1 - 0.2) so their expected sum is unchanged). The frozen pretrained layers and the 512-unit Dense layer are not touched by this Dropout layer at all.
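You can check this by calling a Dropout layer directly in training mode: it masks only the tensor it receives and leaves every other layer alone. A minimal sketch, where the all-ones tensor is just a hypothetical stand-in for the 1024-unit layer's activations:

import numpy as np
import tensorflow as tf

dropout = tf.keras.layers.Dropout(0.2)
activations = tf.ones((1, 1024))               # stand-in for the 1024-unit Dense output
dropped = dropout(activations, training=True)  # training=True applies the random mask

# Roughly 20% of the entries have been zeroed out at random...
print(np.mean(dropped.numpy() == 0.0))  # ~0.2
# ...and the survivors are scaled by 1/(1 - 0.2), keeping the expected sum unchanged
print(dropped.numpy().max())            # 1.25

At inference time (training=False, which is the default outside of fit()), the Dropout layer is an identity and nothing is dropped.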