
Why does keras test generator only return batch size as the length in the shape of the array?

Here is my test generator code:


Why does the batch_size parameter still matter after the generator is created? Creating it prints:

Found 229 validated image filenames belonging to 2 classes.

For example, the shape of the array after the generator is created is limited to 32 - the batch size:

x_test, y_test =

Here is the shape of x_test, which I assume is the array holding the actual image data:

>>> print(x_test.shape)
(32, 224, 224, 3)

This is the result when I compare it to the length of the predictions:

print(len(x_test))  #32
print(len(y_test))  #32
print(len(pred))    #229

Since the size of y_test is very different from the size of the predictions, I'm having difficulty doing any sort of comparison. y_test comes directly from the test_generator, which has the batch size set to 32.

The test generator labels seem to have the right number of elements:


[0, 0, 0, 0, 1, 1, 0, 1, 1, 0, 1, 1, 0, 0, 0, 1, 0, 1, 0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0........

So why is the shape of x_test only 32? I am evidently wrong to expect 229, even though there are 229 samples and 229 labels?


  • As the docs here state, what a generator returns is:

    A DataFrameIterator yielding tuples of (x, y) where x is a numpy array containing a batch of images with shape (batch_size, target_size, channels) and y is a numpy array of corresponding labels.

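    So test_generator is a DataFrameIterator: each time you request a batch from it, it returns one batch of images with shape (32, 224, 224, 3). That is why x_test has 32 rows rather than 229. Each call gives you up to 32 images out of the 229 samples. With a batch size of 32, covering all 229 samples takes 8 batches, the last holding only 5 images. pred has 229 entries presumably because predicting on the generator walks through all of those batches.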
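To make the batching concrete, here is a minimal sketch that uses a toy stand-in rather than the real Keras class. `ToyBatchIterator` is a hypothetical name invented for illustration, and the images are shrunk to 8x8 to keep memory small. It only mimics the behaviour described above: indexing returns one batch, `len()` returns the number of batches, and concatenating every batch recovers all 229 samples.

```python
import math

import numpy as np


class ToyBatchIterator:
    """Hypothetical stand-in for a batch iterator (not the real Keras class)."""

    def __init__(self, images, labels, batch_size=32):
        self.images = images
        self.labels = labels
        self.batch_size = batch_size

    def __len__(self):
        # Number of batches needed to cover every sample once.
        return math.ceil(len(self.images) / self.batch_size)

    def __getitem__(self, idx):
        if not 0 <= idx < len(self):
            raise IndexError(idx)
        start = idx * self.batch_size
        stop = start + self.batch_size
        # The final slice is shorter when the sample count is not a multiple
        # of the batch size.
        return self.images[start:stop], self.labels[start:stop]


# 229 dummy samples; tiny 8x8 images instead of 224x224 (assumption for brevity).
images = np.zeros((229, 8, 8, 3), dtype=np.float32)
labels = np.arange(229) % 2

test_iter = ToyBatchIterator(images, labels, batch_size=32)

# A single request yields only one batch, hence a leading dimension of 32.
x_first, y_first = test_iter[0]
print(x_first.shape)
print(len(test_iter))

# Collect every batch to get arrays that line up with 229 predictions.
batches = [test_iter[i] for i in range(len(test_iter))]
x_all = np.concatenate([x for x, _ in batches])
y_all = np.concatenate([y for _, y in batches])
print(x_all.shape, y_all.shape)
```

With the real generator the same idea should apply: loop over `range(len(test_generator))` and concatenate the batches, or compare the predictions against the generator's full label list. Either way the order only matches the predictions if the generator was created with shuffling turned off.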