I have a very small dataset and I need to do data augmentation. I'm using Keras and I have issues understanding how this approach could help me.
I looked at some tutorials; they suggest adding a layer to the model to do data augmentation:
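import tensorflow as tf
from tensorflow.keras import layers, Sequential

# Random flips and rotations, applied to each batch during training
data_augmentation = tf.keras.Sequential([
    layers.experimental.preprocessing.RandomFlip("horizontal_and_vertical"),
    layers.experimental.preprocessing.RandomRotation(0.2),
])
model = Sequential()  # add model layers below
model.add(data_augmentation)
....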
My question is: how can data augmentation help me with a small dataset? If I pass the N images in my dataset to model.fit, these images will only be flipped or rotated. I will not have two similar images, for example an original one and a flipped one.
Should I first save the augmented images?
In my code I'm following option 1 of this tutorial: https://www.tensorflow.org/tutorials/images/data_augmentation
Without augmentation, the model is trained on the images exactly as they are in your dataset. When you add augmentation layers, each input image is randomly transformed every time it is fed to the model during training. For example, if you have a cat image that may be randomly flipped horizontally, the model is trained sometimes with the image not flipped and sometimes with the image flipped, and the random choice is made again on every epoch. Over many epochs your model therefore sees a wider distribution of input images than the N originals, so you do not need to save the augmented images first. If you do want them on disk, it is possible to store the transformed images using ImageDataGenerator.flow or ImageDataGenerator.flow_from_directory (both accept a save_to_dir argument); see the ImageDataGenerator documentation. The saved transformed images can then be added to the set of input data.
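
To make the point about "sometimes flipped, sometimes not" concrete, here is a rough plain-Python sketch. It is not Keras code: `random_flip` and `train_epochs` are hypothetical helpers that only imitate a random horizontal-and-vertical flip, and the independent 50% chance for each flip is an assumption made for illustration. The dataset holds a single tiny 2x2 "image", which is re-augmented on every epoch, and the sketch counts how many distinct inputs a model would have seen.

```python
import random


def random_flip(image, rng):
    """Return a randomly flipped copy of a 2-D image (a list of rows)."""
    if rng.random() < 0.5:  # horizontal flip: reverse every row
        image = [row[::-1] for row in image]
    if rng.random() < 0.5:  # vertical flip: reverse the order of the rows
        image = image[::-1]
    return [list(row) for row in image]  # always a fresh copy


def train_epochs(dataset, epochs, rng):
    """Simulate training: each image is augmented anew on every epoch.

    Returns the set of distinct inputs the "model" would have seen.
    """
    seen = set()
    for _ in range(epochs):
        for image in dataset:
            augmented = random_flip(image, rng)
            seen.add(tuple(map(tuple, augmented)))
    return seen


dataset = [[[1, 2], [3, 4]]]  # N = 1 stored image
seen = train_epochs(dataset, epochs=200, rng=random.Random(0))
print(f"{len(dataset)} stored image, {len(seen)} distinct inputs seen")
```

The stored dataset never changes and nothing is written to disk; the variety comes from drawing a new random transform each time an image is used. With a continuous transform such as a random rotation, the number of possible variants is far larger than in this flip-only sketch.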