Tags: python, tensorflow, pytorch, mxnet, batch-normalization

What is the smallest total batch size for SyncBatchNorm?


For regular BatchNorm, the smallest batch size per GPU is 2.

If I use SyncBatchNorm, can I use batch_size=1 on every GPU, provided I train on more than one GPU?

I.e., the total_batch_size is more than 1, but batch_size_per_gpu is 1.

I would appreciate an answer for any deep learning framework: PyTorch, TensorFlow, MXNet, etc.


Solution

  • For PyTorch, using batch_size_per_gpu=1 with more than one GPU is fine. SyncBatchNorm reduces the mean and variance across all processes, so the statistics are computed over batch_size_per_gpu * world_size samples, which is at least 2 as long as you use two or more GPUs. See the sketch below.
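
A minimal sketch of what this looks like in PyTorch with DistributedDataParallel. The tiny model, tensor shapes, and the assumption that the script is launched with torchrun (so NCCL and the rank environment variables are set) are illustrative, not part of the original question or answer.

```python
import torch
import torch.nn as nn
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

# Assumes launch via torchrun so RANK / LOCAL_RANK / WORLD_SIZE are set.
dist.init_process_group(backend="nccl")
local_rank = dist.get_rank() % torch.cuda.device_count()
torch.cuda.set_device(local_rank)

# Placeholder model; any model containing BatchNorm layers works the same way.
model = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),
    nn.BatchNorm2d(16),  # regular BatchNorm: needs batch size >= 2 per GPU in training
    nn.ReLU(),
)

# Convert every BatchNorm layer to SyncBatchNorm so statistics are reduced
# across all processes; the effective normalization batch size becomes
# batch_size_per_gpu * world_size.
model = nn.SyncBatchNorm.convert_sync_batchnorm(model).cuda()
model = DDP(model, device_ids=[local_rank])

# batch_size_per_gpu = 1 works here because the synchronized statistics
# cover world_size samples in total (assuming world_size > 1).
x = torch.randn(1, 3, 32, 32, device="cuda")
out = model(x)
out.mean().backward()

dist.destroy_process_group()
```

The key call is nn.SyncBatchNorm.convert_sync_batchnorm, which swaps each BatchNorm layer in place; everything else is the usual DDP setup.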