What is the concept of mini-batch for FCN (semantic segmentation)?

What is the concept of mini-batch when we are sending one image to FCN for semantic segmentation?

The default value in data layers is batch_size: 1. That means that on every forward and backward pass, one image is sent to the network. So what is the mini-batch size? Is it the number of pixels in an image?

My other question: what if we send a few images to the net together? Does that affect convergence? In some papers I see batches of 20 images.

Thanks

Solution

The batch size is the number of images sent through the network in a single training step. The gradient is computed for all the samples in the batch at once, which gives large performance gains through parallelism when training on a graphics card or CPU cluster.
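To connect this to the pixel question: with a per-pixel loss, every pixel of every image in the batch contributes one term, so a batch of N images of H x W pixels yields N * H * W loss terms even though the batch size is still N. The following is a minimal plain-Python sketch of that idea. The function name and the toy data are made up for illustration, and whether a given framework sums or averages the per-pixel terms depends on how its loss layer is configured; this sketch averages.

```python
import math

def pixelwise_cross_entropy(probs, labels):
    """Mean cross-entropy over every pixel of every image in the batch.

    probs:  nested list [N][H][W][C] of per-class probabilities
    labels: nested list [N][H][W] of integer class ids
    Returns (mean loss, number of per-pixel terms).
    """
    total, count = 0.0, 0
    for img_probs, img_labels in zip(probs, labels):              # N images
        for row_probs, row_labels in zip(img_probs, img_labels):  # H rows
            for p, y in zip(row_probs, row_labels):               # W pixels
                total += -math.log(p[y])
                count += 1
    return total / count, count

# Toy batch (illustrative values): 2 images of 2x3 pixels, 2 classes,
# uniform predictions everywhere.
N, H, W = 2, 2, 3
probs = [[[[0.5, 0.5] for _ in range(W)] for _ in range(H)] for _ in range(N)]
labels = [[[0 for _ in range(W)] for _ in range(H)] for _ in range(N)]

loss, n_terms = pixelwise_cross_entropy(probs, labels)
print("batch size:", N, "per-pixel loss terms:", n_terms)
print("mean loss:", loss)
```

So with batch_size: 1 the mini-batch is still one image, but the loss for that image already aggregates H * W pixel terms.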

The batch size has multiple effects on training. First, it gives more stable gradient updates, because the gradient is averaged over the batch. This can be both beneficial and detrimental. In my experience it was more beneficial than detrimental, but others have reported different results.
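The stabilising effect of averaging can be illustrated with a small simulation. This is only a sketch under a strong assumption: each per-sample gradient is modelled as a hypothetical true value of 1.0 plus independent Gaussian noise, which real gradients do not follow exactly. All names are invented for the example.

```python
import random
import statistics

random.seed(0)  # fixed seed so the simulation is repeatable

def noisy_gradient():
    # Hypothetical per-sample gradient: true value 1.0 plus unit Gaussian noise.
    return 1.0 + random.gauss(0.0, 1.0)

def batch_gradient(batch_size):
    # The update direction is the average of the per-sample gradients.
    return sum(noisy_gradient() for _ in range(batch_size)) / batch_size

def gradient_spread(batch_size, trials=2000):
    # Standard deviation of the averaged gradient across many simulated batches.
    samples = [batch_gradient(batch_size) for _ in range(trials)]
    return statistics.pstdev(samples)

spread_1 = gradient_spread(1)
spread_16 = gradient_spread(16)
print("spread with batch size 1: ", spread_1)
print("spread with batch size 16:", spread_16)
```

Under this noise model the spread shrinks roughly with the square root of the batch size, so a batch of 16 gives about a quarter of the noise of a single sample.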

To exploit parallelism, the batch size is usually chosen as a power of 2, for example 8, 16, 32, 64 or 128. Finally, the batch size is limited by the VRAM of the graphics card. The card has to store all the images, the intermediate results at every node of the graph, and additionally all the gradients.

This memory requirement can blow up very fast. If it no longer fits in VRAM, you need to reduce the batch size or the network size.
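A rough back-of-the-envelope estimate shows how the memory scales. The sketch below assumes 32-bit floats, counts one stored gradient per stored activation, and ignores weights and any framework workspace; the layer shapes and the 4 GiB budget are hypothetical and not taken from a real network.

```python
def activation_memory_bytes(batch_size, layer_shapes, bytes_per_value=4,
                            keep_gradients=True):
    """Estimate memory for activations (and their gradients) of one batch.

    layer_shapes: list of (channels, height, width) per stored blob, per image.
    """
    values_per_image = sum(c * h * w for c, h, w in layer_shapes)
    factor = 2 if keep_gradients else 1  # one gradient value per activation
    return batch_size * values_per_image * factor * bytes_per_value

def largest_batch_that_fits(budget_bytes, layer_shapes,
                            candidates=(128, 64, 32, 16, 8, 4, 2, 1)):
    # Try the candidate batch sizes from largest to smallest.
    for batch_size in candidates:
        if activation_memory_bytes(batch_size, layer_shapes) <= budget_bytes:
            return batch_size
    return None  # not even a single image fits: shrink the network or the input

# Hypothetical blobs for a small segmentation net on 500x500 inputs.
shapes = [(3, 500, 500), (64, 500, 500), (128, 250, 250), (21, 500, 500)]
per_image = activation_memory_bytes(1, shapes)
budget = 4 * 1024 ** 3  # assumed 4 GiB of VRAM
best = largest_batch_that_fits(budget, shapes)
print("bytes per image:", per_image)
print("largest batch size that fits:", best)
```

Because the estimate is linear in the batch size, halving the batch size halves this part of the memory footprint, which is why it is the first thing to reduce when a run does not fit.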