python tensorflow jupyter tensorflow2.0 kaggle

Tensorflow input generator ran out of data


I was trying to run this Kaggle code in Jupyter to test the performance of my laptop. I made some modifications to the code to fit my environment:

#from scipy.ndimage import imread 
from imageio import imread

When running block [11], I received the error "input generator ran out of data".

Any help or suggestions are appreciated.


Solution

  • You have specified steps_per_epoch incorrectly.

    The steps_per_epoch should be equal to

    steps_per_epoch = ceil(number_of_samples / batch_size)
    

    For your case

    steps_per_epoch = ceil(1161 / 16) = ceil(72.56) = 73
    

    Try specifying steps_per_epoch = 73

    As you can see, your entire dataset is exhausted in 73 steps. If you specify steps_per_epoch any higher than 73, e.g. 74, there is no data left for the extra step, and you get the "input generator ran out of data" error.
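A minimal pure-Python sketch of why a 74th step fails, assuming a finite generator that yields each batch exactly once. The helper `batch_sizes` is hypothetical and only mimics how a one-pass generator behaves; it is not how Keras is implemented. The sample count and batch size are the ones from the question.

```python
import math

def batch_sizes(num_samples, batch_size):
    """Yield the size of each batch in a single pass over the data."""
    for start in range(0, num_samples, batch_size):
        yield min(batch_size, num_samples - start)

num_samples, batch_size = 1161, 16            # values from the question
steps_per_epoch = math.ceil(num_samples / batch_size)
print(steps_per_epoch)                        # 73

gen = batch_sizes(num_samples, batch_size)
sizes = [next(gen) for _ in range(steps_per_epoch)]  # 73 steps succeed
print(sizes[-1])                              # 9 (the last, partial batch)

try:
    next(gen)                                 # a 74th step has nothing left to read
except StopIteration:
    print("generator ran out of data")
```

The last batch holds only 9 samples, which is why the division is rounded up rather than down.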

    More information: a train step comprises two parts, a forward pass and a backward pass.

    1 train step = 1 forward pass + 1 backward pass
    

    A single train step (1 forward pass + 1 backward pass) is computed on a single batch.

    So if you have 100 samples and your batch size is 10,
    your model will run 10 train steps per epoch.

    Epoch: An epoch is defined as one complete iteration over the dataset. Therefore, for your model to completely iterate over a dataset of 100 samples with a batch size of 10, it must run 10 train steps.

    This number of train steps is exactly what steps_per_epoch specifies.
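The 100-sample case can be sketched in plain Python as follows. The function name `count_train_steps` is hypothetical, and the loop body only marks where a real forward and backward pass would run.

```python
def count_train_steps(num_samples, batch_size):
    """Count the batches (train steps) in one full pass over the data."""
    samples = list(range(num_samples))        # stand-in for real training data
    steps = 0
    for start in range(0, num_samples, batch_size):
        batch = samples[start:start + batch_size]
        # a real train step would run the forward + backward pass on `batch` here
        steps += 1
    return steps

print(count_train_steps(100, 10))  # 10
```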

    The steps_per_epoch argument is usually specified when you pass an infinite data generator to fit(), because the generator itself never signals the end of an epoch. It does not need to be specified if your data is finite.
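As a rough illustration of the difference, here is a pure-Python sketch under the assumption that an "infinite" generator simply loops over the same batches forever. The names `infinite_batches` and `run_epoch` are hypothetical stand-ins, not Keras functions.

```python
import itertools

def infinite_batches(batches):
    """Loop over the same batches forever, like a generator that never stops."""
    return itertools.cycle(batches)

def run_epoch(batch_iter, steps_per_epoch):
    """Consume up to steps_per_epoch batches; return how many were processed."""
    processed = 0
    for _ in itertools.islice(batch_iter, steps_per_epoch):
        processed += 1
    return processed

batches = [[0, 1], [2, 3], [4]]               # 5 samples, batch size 2 -> 3 batches
endless = infinite_batches(batches)

print(run_epoch(endless, 3))         # 3: one epoch's worth of batches
print(run_epoch(endless, 3))         # 3: the next epoch still finds data
print(run_epoch(iter(batches), 4))   # 3: a finite iterator stops short of 4 steps
```

With an endless source, steps_per_epoch is the only thing that marks where one epoch ends. With a finite source, asking for more steps than there are batches comes up short, which is the situation behind the error in the question. In tf.data, calling repeat() on a dataset plays a similar role to the endless generator here.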