I am working on image data augmentation for my training set and have written the augmentation code below. The dataset has 12 classes (e.g. Grass, Flower, Fruits, Dust, and Leaves) and about 5539 images in total. I split it into 70% for training and 15% each for validation and test. The train folder contains the class subfolders (Grass, Flower, Fruits, Dust, Leaves, etc.). The augmentation itself works correctly, but all of the augmented images are saved directly in the train folder instead of in their respective class subfolders.
In short: the train folder contains, for example, a Black-grass subfolder with 325 images. I want to generate 100 augmented images and save them in that grass folder, which already exists inside the train folder. I do not want to create a new folder inside train. I want all augmented images to be stored in their existing class folder, alongside the original images.
My code:
from keras.preprocessing.image import ImageDataGenerator

datagen = ImageDataGenerator(
    rotation_range=45,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True,
    fill_mode='nearest')

i = 0
for batch in datagen.flow_from_directory(directory='/content/dataset/train',
                                         batch_size=32,
                                         target_size=(256, 256),
                                         color_mode='rgb',
                                         save_to_dir='/content/dataset/train',
                                         save_prefix='aug',
                                         save_format='png'):
    i += 1
    if i > 5:
        break
The solution code is shown below:
import os
import cv2
import numpy as np
from tensorflow.keras.preprocessing.image import ImageDataGenerator

sdir = r'c:\temp\people\dtest'  # set this to the directory holding the images (one subfolder per class)
ext = 'jpg'      # specify the extension for the augmented images
prefix = 'aug'   # set the prefix for the augmented images
batch_size = 32  # set the batch size
passes = 5       # set the number of batches to draw from the generator

datagen = ImageDataGenerator(rotation_range=45, width_shift_range=0.2, height_shift_range=0.2,
                             shear_range=0.2, zoom_range=0.2, horizontal_flip=True,
                             fill_mode='nearest')
data = datagen.flow_from_directory(directory=sdir, batch_size=batch_size, target_size=(256, 256),
                                   color_mode='rgb', shuffle=True)

# make a new dictionary with keys and values reversed:
# {numeric class label: string of class_name}
new_dict = {value: key for key, value in data.class_indices.items()}

for i in range(passes):
    images, labels = next(data)
    for j in range(len(labels)):
        # the labels are one-hot vectors, so argmax gives the numeric class label
        class_name = new_dict[np.argmax(labels[j])]
        dir_path = os.path.join(sdir, class_name)
        new_file = prefix + '-' + str(i * batch_size + j) + '.' + ext
        img_path = os.path.join(dir_path, new_file)
        # the generator yields float RGB arrays; OpenCV expects 8-bit BGR when writing
        img = cv2.cvtColor(images[j].astype(np.uint8), cv2.COLOR_RGB2BGR)
        cv2.imwrite(img_path, img)
print('*** process complete')
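This will create the augmented images and store them in their associated class directories. Note that it writes up to passes * batch_size images in total (160 with the settings above), spread across the classes by the shuffle, rather than a fixed number per class.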
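The key step in the code above is turning each one-hot label back into the name of its class folder. Here is a minimal standalone sketch of just that mapping, in plain Python so it can be tried without TensorFlow. The helper names `invert_class_indices` and `augmented_path` are hypothetical, and the three class names and the `train` root are placeholders; in real use `class_indices` would come from `data.class_indices`.

```python
import os


def invert_class_indices(class_indices):
    """Turn {class_name: index} into {index: class_name}."""
    return {index: name for name, index in class_indices.items()}


def augmented_path(root, class_indices, one_hot_label, counter, prefix="aug", ext="jpg"):
    """Build the output path for one augmented image inside its class folder.

    Hypothetical helper: one_hot_label is a sequence such as [0.0, 1.0, 0.0].
    """
    index_to_name = invert_class_indices(class_indices)
    # position of the largest entry = numeric class label (same idea as np.argmax)
    class_index = max(range(len(one_hot_label)), key=lambda k: one_hot_label[k])
    filename = f"{prefix}-{counter}.{ext}"
    return os.path.join(root, index_to_name[class_index], filename)


# Placeholder mapping; flow_from_directory assigns indices in alphanumeric folder order.
class_indices = {"Black-grass": 0, "Flower": 1, "Fruits": 2}
path = augmented_path("train", class_indices, [0.0, 1.0, 0.0], 7)
print(path)
```

Because the file name depends only on the counter, running the augmentation a second time with the same prefix would overwrite the earlier augmented files, so it may be worth changing the prefix between runs.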