python deep-learning computer-vision vgg-net imagenet

ValueError: Error when checking input: expected input_2 to have shape (224, 224, 3) but got array with shape (224, 224, 4)


I load the images from a folder and resize them to the input shape that the VGG16-places365 model expects, but I still get the error above. I have looked at the Keras documentation for VGG16 (https://keras.io/applications/#vgg16), yet the error persists.

import os

import numpy as np
import pandas as pd
from PIL import Image
from cv2 import resize

# Assumed import path: vgg16_places_365.py from the Keras-VGG16-places365 repository.
from vgg16_places_365 import VGG16_Places365

if __name__ == '__main__':
    TEST_PATH = '/home/guest/Downloads/content/image/thumb'
    predictions_to_return = 5

    # Build the model once, outside the loop.
    model = VGG16_Places365(weights='places')

    # Load the class labels, downloading the file if it is missing.
    file_name = 'categories_places365.txt'
    if not os.access(file_name, os.W_OK):
        synset_url = 'https://raw.githubusercontent.com/csailvision/places365/master/categories_places365.txt'
        os.system('wget ' + synset_url)
    classes = list()
    with open(file_name) as class_file:
        for line in class_file:
            classes.append(line.strip().split(' ')[0][3:])
    classes = tuple(classes)

    # Collect one row per image: the file name followed by the top class indices.
    rows = []
    for img in os.listdir(TEST_PATH):
        image = Image.open(os.path.join(TEST_PATH, img))
        image = np.array(image, dtype=np.uint8)
        image = resize(image, (224, 224))
        image = np.expand_dims(image, 0)

        # This call raises the ValueError above when the file has an alpha channel.
        preds = model.predict(image)[0]
        top_preds = np.argsort(preds)[::-1][0:predictions_to_return]
        rows.append([img] + list(top_preds))

    df = pd.DataFrame(data=rows, columns=['File_name','Tag_1','Tag_2','Tag_3','Tag_4','Tag_5'])
    print(df)

Solution

  • You are probably loading an image with an alpha channel (RGBA) but the VGG16 neural network expects an image without an alpha channel (RGB).

    To convert the image from RGBA to RGB, you can either use

    image = image.convert("RGB")
    

    on the PIL Image object, i.e. directly after Image.open, or use NumPy slicing on the array, after np.array has been called, to keep only the first three color channels and drop the alpha channel:

    image = image[:, :, :3]
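
Both options can be tried without the model or any files on disk. The following is a minimal sketch, assuming only Pillow and NumPy are installed; the helper name load_rgb_batch and the in-memory test image are hypothetical and not part of the original code. It uses Pillow's own resize instead of cv2.resize so that the example has no OpenCV dependency.

```python
import numpy as np
from PIL import Image


def load_rgb_batch(image, size=(224, 224)):
    """Return a (1, height, width, 3) uint8 array from a PIL image of any mode.

    Hypothetical helper for illustration only.
    """
    # convert("RGB") drops an alpha channel and also expands grayscale
    # or palette images to three channels.
    rgb = image.convert("RGB").resize(size)
    array = np.asarray(rgb, dtype=np.uint8)
    # Add the batch dimension that model.predict expects.
    return np.expand_dims(array, 0)


# Demonstration with an in-memory RGBA image instead of a file on disk.
rgba = Image.new("RGBA", (300, 200), (255, 0, 0, 128))
print(np.array(rgba).shape)   # (200, 300, 4)

batch = load_rgb_batch(rgba)
print(batch.shape)            # (1, 224, 224, 3)

# The slicing alternative keeps the first three channels of an RGBA array.
sliced = np.array(rgba)[:, :, :3]
print(sliced.shape)           # (200, 300, 3)
```

Converting on the PIL object is the more general of the two: slicing only helps when the array already has at least three channels, whereas convert("RGB") also handles grayscale and palette images that would otherwise produce a 2-D array.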