python tensorflow keras deep-learning data-augmentation

Augmented images are not saved in their own class directories alongside the raw data in the train folder


I am working on image data augmentation for my training set and have written the augmentation code below. The dataset has 12 classes (for example Grass, Flower, Fruits, Dust, and Leaves) and about 5539 images in total. I split it into 70% train, 15% validation, and 15% test. The train folder contains one subfolder per class (Grass, Flower, Fruits, Dust, Leaves, and so on). The augmentation itself works, but all of the augmented images are saved directly in the train folder instead of in their respective class subfolders.

In short: the train folder has, for example, a Black-grass subfolder that holds 325 images. I want to generate 100 augmented images and save them in that subfolder, which already exists in the train folder. I do not want to create a new folder inside the train folder. All augmented images should be stored in their own existing class folder, next to the raw data.

My code:

from keras.preprocessing.image import ImageDataGenerator

datagen = ImageDataGenerator(
    rotation_range=45,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range = 0.2,
    zoom_range = 0.2, 
    horizontal_flip=True,
    fill_mode = 'nearest')

i = 0

for batch in datagen.flow_from_directory(directory = ('/content/dataset/train'),
                                         batch_size = 32,
                                         target_size = (256, 256),
                                         color_mode = ('rgb'),
                                         save_to_dir = ('/content/dataset/train'),
                                         save_prefix = ('aug'),
                                         save_format = ('png')):
  i += 1
  if i > 5:
    break

Platform: Google Colaboratory


Solution

  • The solution code is shown below:

    import os

    import cv2
    import numpy as np
    from tensorflow.keras.preprocessing.image import ImageDataGenerator

    sdir = r'c:\temp\people\dtest'  # set this to the directory holding the class subfolders
    ext = 'jpg'       # extension for the augmented images
    prefix = 'aug'    # filename prefix for the augmented images
    batch_size = 32   # number of images per generated batch
    passes = 5        # number of batches to draw from the generator

    datagen = ImageDataGenerator(rotation_range=45, width_shift_range=0.2, height_shift_range=0.2,
                                 shear_range=0.2, zoom_range=0.2, horizontal_flip=True,
                                 fill_mode='nearest')
    # no save_to_dir here: each image is written to its class folder by hand below
    data = datagen.flow_from_directory(directory=sdir, batch_size=batch_size, target_size=(256, 256),
                                       color_mode='rgb', shuffle=True)

    # reverse class_indices so it maps {numeric class label: class name}
    new_dict = {value: key for key, value in data.class_indices.items()}

    for i in range(passes):
        images, labels = next(data)  # labels are one-hot encoded (class_mode='categorical')
        for j in range(len(labels)):
            class_name = new_dict[np.argmax(labels[j])]
            dir_path = os.path.join(sdir, class_name)
            new_file = prefix + '-' + str(i * batch_size + j) + '.' + ext
            img_path = os.path.join(dir_path, new_file)
            # the generator yields float32 RGB in the 0-255 range; OpenCV writes 8-bit BGR
            img = cv2.cvtColor(images[j].astype(np.uint8), cv2.COLOR_RGB2BGR)
            cv2.imwrite(img_path, img)
    print('*** process complete')
    

    This will create the augmented images and store them in the associated class directories.
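Because the batches are shuffled, the loop above does not control how many augmented images each class receives, while the question asks for a fixed number (100) per class folder. The following is only a sketch of one way to add a per-class quota on top of the same label-to-folder mapping. The helpers `invert_class_indices`, `class_of`, and `plan_saves` are hypothetical names, and the `class_indices` dictionary and `labels` list are small stand-ins for what the generator would return. It uses only the standard library, so it shows the bookkeeping and does not read or write any images.

```python
import os
from collections import Counter


def invert_class_indices(class_indices):
    """Turn {'Grass': 1, ...} into {1: 'Grass', ...}."""
    return {index: name for name, index in class_indices.items()}


def class_of(one_hot_label, index_to_class):
    """Return the class name for a one-hot label vector."""
    # Plain-Python argmax, so it works on lists and on NumPy rows alike.
    best = max(range(len(one_hot_label)), key=lambda k: one_hot_label[k])
    return index_to_class[best]


def plan_saves(labels, index_to_class, root, per_class, prefix="aug", ext="png"):
    """Yield (position, path) for each label, skipping classes already at per_class."""
    saved = Counter()
    for position, label in enumerate(labels):
        name = class_of(label, index_to_class)
        if saved[name] >= per_class:
            continue  # this class already has its quota of augmented images
        saved[name] += 1
        file_name = f"{prefix}-{name}-{saved[name]}.{ext}"
        yield position, os.path.join(root, name, file_name)


# Hypothetical stand-ins for generator output (assumed, not from the question).
class_indices = {"Flower": 0, "Grass": 1}
index_to_class = invert_class_indices(class_indices)
labels = [[0, 1], [0, 1], [1, 0], [0, 1]]

planned = list(plan_saves(labels, index_to_class, "train", per_class=2))
for position, path in planned:
    print(position, path)
```

In the real loop, `position` would be the index into the batch of images, and `path` would be passed to `cv2.imwrite` in place of `img_path`. You would keep drawing batches until every class has reached its quota.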