In the TensorFlow Object Detection API, the documentation advocates sharding if the dataset contains "more than a few thousand examples".
"A few thousand" is a bit vague, and it would be nice to have a more precise answer, such as a target file size. In other words, how large can a .record file get before it starts causing performance issues? What file size should we aim for when sharding our data?
It seems the TensorFlow team recommends shards of roughly 100 MB: https://www.tensorflow.org/guide/performance/overview. You might also consider the performance implications of batch size during training: https://www.pugetsystems.com/labs/hpc/GPU-Memory-Size-and-Deep-Learning-Performance-batch-size-12GB-vs-32GB----1080Ti-vs-Titan-V-vs-GV100-1146/
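As a rough illustration (not an official recipe), one way to apply the ~100 MB guideline is to derive the shard count from the total serialized size of your examples and write them with the usual `name-00000-of-0000N` convention. The helper below is a sketch; `write_sharded_tfrecords`, `target_shard_bytes`, and the assumption that `examples` is a list of already-serialized `tf.train.Example` protos are all hypothetical choices for this example.

```python
import math
import tensorflow as tf

def write_sharded_tfrecords(examples, output_base, target_shard_bytes=100 * 1024 * 1024):
    """Write serialized tf.train.Example protos into shards of ~target_shard_bytes each.

    `examples` is assumed to be a list of bytes (serialized Examples);
    `output_base` is a path prefix such as "train.record".
    """
    # Pick the number of shards so that each one lands near the ~100 MB target.
    total_bytes = sum(len(e) for e in examples)
    num_shards = max(1, math.ceil(total_bytes / target_shard_bytes))

    # Open one writer per shard, using the standard -00000-of-00010 style suffix.
    writers = [
        tf.io.TFRecordWriter(f"{output_base}-{i:05d}-of-{num_shards:05d}")
        for i in range(num_shards)
    ]
    try:
        # Round-robin assignment keeps shard sizes roughly equal.
        for idx, serialized in enumerate(examples):
            writers[idx % num_shards].write(serialized)
    finally:
        for w in writers:
            w.close()
```

The filename pattern mirrors what the Object Detection API's dataset creation scripts produce, so (assuming your input config uses a matching glob) the resulting shards should drop in without further changes.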