
TensorFlow dataset: batch_size and steps_per_epoch


I work on an image segmentation problem where the data pipeline is built with TensorFlow datasets and also uses a TensorFlow iterator. I have now increased the number of training images from 250 to 500. I have a separate pipeline for image augmentation. My questions are:

  1. Will the increased number of images have an effect even though I keep the same batch_size=16? I have set steps_per_epoch to 240. From the log files in TensorBoard it looks like, within an epoch, the network takes in only 16 images per step, keeps repeating the same batch, and the images do not change while a single epoch is running. Does that mean it trains on the same single batch of 16 images for all 240 steps?

  2. I want every sample to be fed into the network in each epoch (e.g. 16 * 30), i.e. all samples should pass through the network once per epoch with a given batch size. How can I achieve this?

I have attached the TensorBoard image for training. I have 250 steps, so the image should change at every step, but it does not. The step number changes very rarely, and so do the images. Why?


Solution

  • Without seeing your code it is hard to tell what is going on. Normally, if you set the batch size to 16 and steps_per_epoch to 240, then in a single epoch 16 x 240 = 3840 images will be processed. With 500 images, that means you go through your complete data set 7 times plus an additional 340 images. Depending on how you constructed your input pipeline this might not be the case. Generally you want to go through your training data roughly once per epoch, so I would set steps_per_epoch to (500 // 16) + 1 = 32. If you want to go through the data EXACTLY once per epoch, you can use the code below to find the batch size and steps per epoch. This is especially useful for determining the batch size and steps for validation data, since it is best to go through the validation data exactly once per epoch.

    length = 500  # set this to the number of training images
    b_max = 50    # maximum batch size you will allow based on memory capacity
    # choose the largest divisor of length that is <= b_max, so that
    # batch_size * steps covers the data set exactly once per epoch
    batch_size = sorted([int(length / n) for n in range(1, length + 1)
                         if length % n == 0 and length / n <= b_max], reverse=True)[0]
    steps = int(length / batch_size)  # with length=500, b_max=50: batch_size=50, steps=10
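
To make sure every sample is actually fed in once per epoch (question 2), the dataset itself also has to be built accordingly. Below is a minimal sketch, assuming a tf.data pipeline; the image/mask tensors, the `model` object, and the epoch count are placeholders rather than your actual code.

    import tensorflow as tf

    length = 500
    batch_size = 50               # largest divisor of length <= b_max, as computed above
    steps = length // batch_size  # 10 steps of 50 images cover all 500 exactly once

    # placeholder image/mask tensors; substitute your real loading and
    # augmentation pipeline here
    images = tf.random.uniform((length, 128, 128, 3))
    masks = tf.random.uniform((length, 128, 128, 1))
    dataset = tf.data.Dataset.from_tensor_slices((images, masks))

    # shuffle before batching so the composition of batches changes each
    # epoch, then repeat so the iterator never runs out between epochs
    dataset = dataset.shuffle(buffer_size=length).batch(batch_size).repeat()

    # model.fit then draws exactly `steps` batches per epoch, i.e. every
    # image once; `model` stands in for your segmentation network
    # model.fit(dataset, epochs=30, steps_per_epoch=steps)

Note that shuffle() comes before batch(): this is what keeps the network from seeing the same fixed batch at every step, since each epoch then draws differently composed batches from the full data set.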