python · numpy · tensorflow · keras

ImageDataGenerator regarding length of inputs


I have a list of features, each a 3D NumPy array representing an RGB image, and a list of labels, each a 1D NumPy array. I want to perform image augmentation with ImageDataGenerator.flow(). When I pass the two lists into .flow(), I get:

ValueError: `x` (images tensor) and `y` (labels) should have the same length. Found: x.shape = (32, 32, 3), y.shape = (2, 4)

However, when I print x.shape and y.shape, it tells me that they are the same length. Here is my code:

import numpy as np
from PIL import Image
from tensorflow.keras.preprocessing.image import ImageDataGenerator

datagen = ImageDataGenerator(
    rescale = 1./255
)


labels = list(df["label"])
filepaths = list(df["photo"])

features = []

for filepath in filepaths:
    img = Image.open(filepath)
    img = img.resize((32,32))
    features.append(np.array(img))

features = np.array(features)
labels = np.array(labels)

print(features.shape) #Prints (2, 32, 32, 3)
print(labels.shape)   #Prints (2, 4)

train_gen = datagen.flow(
    x = features,
    y = labels,
    batch_size = 1
)

I thought that perhaps the first dimension of the features array had been trimmed, so I tried expanding it:

features = np.expand_dims(features, axis = 0)
print(features.shape) #Prints (1, 2, 32, 32, 3)

But then I get this error:

ValueError: ... Found: x.shape = (1, 2, 32, 32, 3), ...

What is going on?


Solution

  • I solved it by adding this line of code before calling .flow(). The printed shape is unchanged, but .flow() accepts the arrays after running it.

    labels = np.reshape(labels, (features.shape[0], -1))
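A minimal sketch of the fix using plain NumPy, with random arrays standing in for the real images and label vectors (the shapes match those printed in the question; the data itself is made up). The reshape forces labels into a contiguous 2D array with one row per sample, so len(labels) lines up with len(features):

```python
import numpy as np

# Hypothetical stand-ins: two 32x32 RGB images and two 4-element label vectors,
# mirroring the shapes (2, 32, 32, 3) and (2, 4) from the question.
features = np.random.rand(2, 32, 32, 3).astype("float32")
labels = [np.array([1, 0, 0, 0]), np.array([0, 1, 0, 0])]
labels = np.array(labels)

# The fix: one row per sample, remaining dimensions flattened into columns.
labels = np.reshape(labels, (features.shape[0], -1))

print(labels.shape)           # (2, 4)
print(len(features) == len(labels))  # True
```

With the lengths in agreement, datagen.flow(x=features, y=labels, batch_size=1) should no longer raise the length mismatch ValueError.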