Invalid Shape Error when trying to leverage Keras's VGG16 pretrained model


I am trying to use Keras's pretrained VGG16 model in my own image classification problem. My code is heavily based on Francois Chollet's example (Chapter 8 of Deep Learning with Python - code).

I have three classes I'm trying to predict. Directory structure:

data/
  training/
    class_1
    class_2
    class_3

Note: this is my first time working with Keras, so I may just be doing something wrong.

My call to model.fit() fails with: ValueError: Shapes (32, 1) and (32, 3) are incompatible. See the bottom of this question for the full error messages. If I look at the output from .summary() calls, I don't see a layer of dimension (32, 1).

import pathlib
import numpy as np  # needed for np.concatenate in get_features_and_labels
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.utils import image_dataset_from_directory

DATA_DIR = pathlib.Path('./data/')
batch_size = 32
img_width = image_height = 256
img_width_height = (img_width, image_height)  # image_size expects a (height, width) tuple

train_dataset = image_dataset_from_directory(
    DATA_DIR / "training",
    image_size=img_width_height,
    batch_size=batch_size)

validation_dataset = image_dataset_from_directory(
    DATA_DIR / "validation",
    image_size=img_width_height,
    batch_size=batch_size)

# Found 128400 files belonging to 3 classes.
# Found 15600 files belonging to 3 classes.

vgg16_convolution_base = keras.applications.vgg16.VGG16(
    weights="imagenet",
    include_top=False,
    input_shape=(img_width, image_height, 3))

vgg16_convolution_base.summary()
# block3_conv3 (Conv2D)       (None, 64, 64, 256)       590080    
# block3_pool (MaxPooling2D)  (None, 32, 32, 256)       0         
# block4_conv1 (Conv2D)       (None, 32, 32, 512)       1180160   
# block4_conv2 (Conv2D)       (None, 32, 32, 512)       2359808   
# block4_conv3 (Conv2D)       (None, 32, 32, 512)       2359808   
# block4_pool (MaxPooling2D)  (None, 16, 16, 512)       0         
# block5_conv1 (Conv2D)       (None, 16, 16, 512)       2359808   
# block5_conv2 (Conv2D)       (None, 16, 16, 512)       2359808   
# block5_conv3 (Conv2D)       (None, 16, 16, 512)       2359808   
# block5_pool (MaxPooling2D)  (None, 8, 8, 512)         0

def get_features_and_labels(dataset):
    all_features = []
    all_labels = []
    for images, labels in dataset:
        preprocessed_images = keras.applications.vgg16.preprocess_input(images)
        features = vgg16_convolution_base.predict(preprocessed_images)
        all_features.append(features)
        all_labels.append(labels)
    return np.concatenate(all_features), np.concatenate(all_labels)

train_features, train_labels = get_features_and_labels(train_dataset)
val_features, val_labels = get_features_and_labels(validation_dataset)

print(train_features.shape)
print(train_labels.shape)
# (128400, 8, 8, 512)
# (128400,)

print(val_features.shape)
print(val_labels.shape)
# (15600, 8, 8, 512)
# (15600,)

inputs = keras.Input(shape=(8, 8, 512))

x = layers.Flatten()(inputs)
x = layers.Dense(256)(x)
x = layers.Dropout(0.5)(x)

outputs = layers.Dense(3, activation="softmax")(x)

model = keras.Model(inputs, outputs)

model.compile(loss="categorical_crossentropy",
              optimizer="rmsprop",
              metrics=["accuracy"])

model.summary()
# input_4 (InputLayer)        [(None, 8, 8, 512)]       0         
# flatten_1 (Flatten)         (None, 32768)             0         
# dense_2 (Dense)             (None, 256)               8388864   
# dropout_1 (Dropout)         (None, 256)               0         
# dense_3 (Dense)             (None, 3)                 771       
# ================================================================
# Total params: 8,389,635
# Trainable params: 8,389,635

history = model.fit(
    train_features, train_labels,
    epochs=20,
    validation_data=(val_features, val_labels))

My call to model.fit() fails with: ValueError: Shapes (32, 1) and (32, 3) are incompatible

...
File "C:\Users\x\anaconda3\lib\site-packages\keras\losses.py", line 1990, in categorical_crossentropy
        return backend.categorical_crossentropy(
    File "C:\Users\x\anaconda3\lib\site-packages\keras\backend.py", line 5529, in categorical_crossentropy
        target.shape.assert_is_compatible_with(output.shape)

full traceback


Solution

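  • The categorical_crossentropy loss expects one-hot labels, so with 3 classes and a batch size of 32 the labels for each batch must have shape (32, 3). The labels passed to fit() hold a single integer per sample instead, which is where the (32, 1) in the error message comes from; it is the shape of the targets, not of any layer in the model.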

    The labels are currently integer class indices: 0, 1, and 2. One can use the SparseCategoricalCrossentropy loss, which accepts such integer labels directly:

    loss = tf.keras.losses.SparseCategoricalCrossentropy()
    

    Alternatively, one can still use the categorical_crossentropy loss, but in conjunction with the one-hot encoded labels (1, 0, 0) for 0, (0, 1, 0) for 1, and (0, 0, 1) for 2. The following code snippet can accomplish such an encoding:

    # One-hot encode the integer labels
    num_class = len(set(train_labels))
    train_labels = tf.one_hot(indices=train_labels, depth=num_class)
    val_labels = tf.one_hot(indices=val_labels, depth=num_class)
    

    Both options compute the same cross-entropy and treat the classes as unordered, so the choice comes down to which label format is more convenient.
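
To make the relationship between the two fixes concrete, here is a minimal sketch in plain Python (no Keras) that computes both losses by hand. The helper names one_hot, categorical_crossentropy, and sparse_categorical_crossentropy are illustrative stand-ins written for this example, not the Keras implementations, and the prediction values are made up.

```python
import math


def one_hot(label, num_classes):
    """Return a one-hot list for an integer class index."""
    return [1.0 if i == label else 0.0 for i in range(num_classes)]


def categorical_crossentropy(y_true_one_hot, y_pred):
    """Cross-entropy for a one-hot target and a probability vector."""
    return -sum(t * math.log(p) for t, p in zip(y_true_one_hot, y_pred))


def sparse_categorical_crossentropy(label, y_pred):
    """Cross-entropy for an integer class index and a probability vector."""
    return -math.log(y_pred[label])


# Hypothetical softmax outputs for two samples over 3 classes.
predictions = [[0.7, 0.2, 0.1], [0.1, 0.3, 0.6]]
# Integer class indices, one per sample (the format the question's labels are in).
labels = [0, 2]

dense_losses = [
    categorical_crossentropy(one_hot(label, 3), pred)
    for label, pred in zip(labels, predictions)
]
sparse_losses = [
    sparse_categorical_crossentropy(label, pred)
    for label, pred in zip(labels, predictions)
]

for dense, sparse in zip(dense_losses, sparse_losses):
    print(f"one-hot: {dense:.4f}  sparse: {sparse:.4f}")
```

The two columns agree for every sample, which is why either fix resolves the shape error without changing what the model learns. If the labels come straight from image_dataset_from_directory rather than from a feature-extraction loop, passing label_mode="categorical" to that function should also produce one-hot labels, although that route is not the one used in the answer above.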