In the explanation I see that a single TFRecord file contains multiple classes and multiple images (a cat and a bridge). When it was written, both images were written into one TFRecord file. When it is read back, the example verifies that this file contains two images.
Elsewhere I have seen people generate one TFRecord file per image. I know you can load multiple TFRecord files like this:
train_dataset = tf.data.TFRecordDataset(tf.io.gfile.glob("<Path>/*.tfrecord"))
But which way is recommended? Should I build one TFRecord file per image, or one TFRecord file for multiple images? If I put multiple images into one TFRecord file, what is the maximum number?
As you said, it is possible to save an arbitrary number of entries in a single TFRecord file, and one can create as many TFRecord files as desired.
I would recommend using practical considerations to decide how to proceed:
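- Use fewer TFRecord files for easier handling when moving files around in the filesystem
- Avoid growing TFRecord files to a size that can become a problem for the filesystem
- Use separate TFRecord files for the train / validation / test split
- Follow natural boundaries in the data where they exist (e.g. one TFRecord file per participant session)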
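As a rough illustration of these considerations, here is a minimal sketch that splits one dataset split into a handful of TFRecord files rather than one file per image or one huge file. The helper names (`shard_paths`, `num_shards_for`, `write_shards`, `read_shards`), the file naming pattern, and the 100 MiB target size are all assumptions made for this example; they do not come from the TensorFlow API or from the answer above. Pick whatever size your filesystem and tooling handle comfortably.

```python
import math
from pathlib import Path


def shard_paths(out_dir, split, num_shards):
    """Build file names such as train-00000-of-00003.tfrecord (hypothetical naming scheme)."""
    return [
        str(Path(out_dir) / f"{split}-{i:05d}-of-{num_shards:05d}.tfrecord")
        for i in range(num_shards)
    ]


def num_shards_for(total_bytes, target_shard_bytes=100 * 1024 * 1024):
    """Number of files needed to keep each one near the target size.

    The 100 MiB default is an assumption for this sketch, not a hard limit.
    """
    return max(1, math.ceil(total_bytes / target_shard_bytes))


def write_shards(serialized_examples, out_dir, split, num_shards):
    """Write already-serialized tf.train.Example byte strings round-robin across shards."""
    import tensorflow as tf  # imported lazily so the helpers above work without TensorFlow

    paths = shard_paths(out_dir, split, num_shards)
    writers = [tf.io.TFRecordWriter(p) for p in paths]
    try:
        for i, record in enumerate(serialized_examples):
            writers[i % num_shards].write(record)
    finally:
        for writer in writers:
            writer.close()
    return paths


def read_shards(out_dir, split):
    """Read every shard of one split back as a single dataset of serialized records."""
    import tensorflow as tf

    pattern = str(Path(out_dir) / f"{split}-*.tfrecord")
    files = tf.data.Dataset.list_files(pattern)
    # interleave reads from several files in parallel instead of one after another
    return files.interleave(
        tf.data.TFRecordDataset, num_parallel_calls=tf.data.AUTOTUNE
    )
```

For example, `num_shards_for(250 * 1024 * 1024)` suggests three files for roughly 250 MiB of serialized data, and calling `write_shards` once per split keeps train, validation, and test data in separate files.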