I'm trying to visualize the important regions for a classification task with a CNN.
I'm using VGG16 plus my own top layers (a global average pooling layer and a Dense layer):
model_vgg16_conv = VGG16(weights='imagenet', include_top=False, input_shape=(100, 100, 3))
model = models.Sequential()
model.add(model_vgg16_conv)
model.add(Lambda(global_average_pooling, output_shape=global_average_pooling_shape))
model.add(Dense(4, activation = 'softmax', init='uniform'))
After compiling and fitting the model I'm trying to use Grad-CAM for a new image:
image = cv2.imread("data/example_images/test.jpg")
# Resize to 100x100
image = resize(image,(100,100),anti_aliasing=True, mode='constant')
# Because it's a grey scale image extend the dimensions
image = np.repeat(image.reshape(1,100, 100, 1), 3, axis=3)
class_weights = model.get_layer("dense_1").get_weights()[0]
final_conv_layer = model.get_layer("vgg16").get_layer("block5_conv3")
input1 = model.get_layer("vgg16").layers[0].input
output1 = model.get_layer("dense_1").output
get_output = K.function([input1], [final_conv_layer.output, output1])
After that I'm executing
[conv_outputs, predictions] = get_output([image])
Leading to the following error:
InvalidArgumentError: You must feed a value for placeholder tensor 'vgg16_input' with dtype float and shape [?,100,100,3] [[{{node vgg16_input}}]] [[dense_1/Softmax/_233]]
Additional information
def global_average_pooling(x):
    return K.mean(x, axis = (2, 3))

def global_average_pooling_shape(input_shape):
    return input_shape[0:2]
Model summary:
Layer (type)                 Output Shape              Param #
=================================================================
vgg16 (Model)                (None, 3, 3, 512)         14714688
_________________________________________________________________
lambda_1 (Lambda)            (None, 3)                 0
_________________________________________________________________
dense_1 (Dense)              (None, 4)                 16
=================================================================
Total params: 14,714,704
Trainable params: 16
Non-trainable params: 14,714,688
VGG-Model Summary:
Layer (type)                 Output Shape              Param #
=================================================================
input_1 (InputLayer)         (None, 100, 100, 3)       0
...
I'm new to Grad-CAM and I'm not sure whether I'm just overlooking something or have misunderstood the entire concept.
With Sequential, layers are added with the add() method. Here a whole model object (model_vgg16_conv) was added as if it were a layer, so the nested VGG16 model now has two input tensors: its own original input (input_1) and the one Sequential created for it (vgg16_input). get_input_at shows both:
>>> layer = model.layers[0]
>>> layer.get_input_at(0)
<tf.Tensor 'input_1:0' shape=(?, ?, ?, 3) dtype=float32>
>>> layer.get_input_at(1)
<tf.Tensor 'vgg16_input:0' shape=(?, ?, ?, 3) dtype=float32>
K.function was given only one of these inputs (input_1), while the dense_1 output depends on the other one, hence the error about no value being fed for 'vgg16_input'. Feeding both inputs would work:
get_output = K.function([input1] + [model.input], [final_conv_layer.output, output1])
[conv_outputs, predictions] = get_output([image, image])
A cleaner option is the functional API, which builds the top layers directly on model_vgg16_conv.output so that the model has a single input:
model_vgg16_conv = VGG16(weights='imagenet', include_top=False, input_shape=(100, 100, 3))
gavg = Lambda(global_average_pooling, output_shape=global_average_pooling_shape)(model_vgg16_conv.output)
output = Dense(4, activation = 'softmax', init='uniform')(gavg)
model_f = Model(model_vgg16_conv.input, output)
final_conv_layer = model_f.get_layer("block5_conv3")
get_output = K.function([model_f.input], [final_conv_layer.output, model_f.output])
[conv_outputs, predictions] = get_output([image])
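Once conv_outputs is available, the remaining step is to combine the feature maps with the class weights. The following is only a rough NumPy sketch of that step, not the definitive Grad-CAM procedure: it weights each channel by the Dense kernel for one class and clips negative values, as Grad-CAM does. The function name class_activation_map and the random stand-in arrays are hypothetical, and the sketch assumes the pooling layer averages over the spatial axes so that the Dense kernel has one row per channel (512 here).

```python
import numpy as np

def class_activation_map(conv_outputs, class_weights, class_idx):
    """Combine the last conv feature maps into one heatmap for a single class.

    conv_outputs:  (H, W, C) feature maps for one image (channels last)
    class_weights: (C, num_classes) kernel of the Dense layer
    """
    if conv_outputs.shape[-1] != class_weights.shape[0]:
        raise ValueError(
            "expected one Dense weight row per channel, got %d channels and %d rows"
            % (conv_outputs.shape[-1], class_weights.shape[0])
        )
    # Weighted sum over the channel axis -> (H, W)
    cam = np.tensordot(conv_outputs, class_weights[:, class_idx], axes=([2], [0]))
    # Keep only positive evidence for the class, then scale to [0, 1]
    cam = np.maximum(cam, 0)
    peak = cam.max()
    return cam / peak if peak > 0 else cam

# Stand-in data with the shapes block5_conv3 would give for a 100x100 input
rng = np.random.default_rng(0)
conv_outputs = rng.random((3, 3, 512))          # assumption: one image, batch axis removed
class_weights = rng.standard_normal((512, 4))   # assumption: Dense kernel of shape (512, 4)
cam = class_activation_map(conv_outputs, class_weights, class_idx=1)
```

With the real model, conv_outputs[0] and class_weights from dense_1 would take the place of the random arrays, and the 3x3 map would then be resized to 100x100 to overlay on the image. Note that the model summary in the question reports (None, 3) for lambda_1, which suggests the mean is being taken over width and channels; for channels-last data the spatial axes are (1, 2), and with the current pooling the shape check above would raise.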