Tags: python, numpy, machine-learning, deep-learning, data-augmentation

Fast Dataset Augmentation in Python - Deep Learning


I am working on a project that needs data augmentation. I want to flip each image horizontally and add the flipped copy to the training data array. The problem is that there are over 10,000 images.

This is the code for manually flipping each image (a 2d numpy array) in the array train_images of length 'size'.

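import numpy as np

for img in range(size):
  # Mirror the image left to right (axis=1 is the width axis of a 2D image).
  flip = np.flip(train_images[img], axis=1)
  # np.append returns a new array, so the result has to be assigned back.
  train_images = np.append(train_images, flip[np.newaxis], axis=0)
  train_labels = np.append(train_labels, train_labels[img:img + 1], axis=0)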

This is taking quite a long time. Is there any library function or faster way to compute the new images and add them to the array without multi-threading?

Thank you in advance for your comments.


Solution

  • I use the imgaug library to do data augmentation in a systematic manner. It is very useful and well designed for cases where you need to apply multiple augmentations to the same image. It does have a bit of a learning curve, but it is very much worth it.
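
For the specific case in the question (a horizontal flip of every image), plain NumPy may already be fast enough without any extra library. The sketch below is not imgaug code. It assumes the images are stacked in one array of shape (N, height, width) and that the labels are in an array of length N; the function name and the toy data are made up for illustration. The loop is slow mainly because np.append copies the whole array on every call, so the idea is to flip the entire stack at once and concatenate a single time.

```python
import numpy as np

def augment_with_horizontal_flips(images, labels):
    """Return images and labels with a mirrored copy of every image appended."""
    # Images are stacked as (N, height, width), so axis 2 is the width axis.
    flipped = np.flip(images, axis=2)
    # A single concatenate allocates the doubled array once,
    # instead of re-copying everything for each image.
    all_images = np.concatenate([images, flipped], axis=0)
    all_labels = np.concatenate([labels, labels], axis=0)
    return all_images, all_labels

# Toy stand-in data (hypothetical): 4 "images" of 2x3 pixels.
train_images = np.arange(24).reshape(4, 2, 3)
train_labels = np.array([0, 1, 0, 1])

aug_images, aug_labels = augment_with_horizontal_flips(train_images, train_labels)
print(aug_images.shape, aug_labels.shape)
```

The flipped copy of image i ends up at index N + i, so the labels line up by simply repeating the label array. If more kinds of augmentation are needed later (rotations, noise, crops), a dedicated library such as imgaug is likely the more maintainable route.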