Random orthogonal, 90 degrees rotation with ImageDataGenerator


I use the following code to train my CNN model on invoice images.

train_datagen = ImageDataGenerator(
    rescale=1. / 255,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True)

test_datagen = ImageDataGenerator(rescale=1. / 255)

train_generator = train_datagen.flow_from_directory(
    train_data_dir,
    target_size=(img_width, img_height),
    batch_size=batch_size)

validation_generator = test_datagen.flow_from_directory(
    validation_data_dir,
    target_size=(img_width, img_height),
    batch_size=batch_size)

model.fit_generator(
    train_generator,
    steps_per_epoch=nb_train_samples // batch_size,
    epochs=epochs,
    validation_data=validation_generator,
    validation_steps=nb_validation_samples // batch_size)

The problem is that my training data set contains only upright images. All my images look like the following one:

[Image: an upright image that was used during training]

After training, when I send an image like the one below, my model fails to predict the correct class.

[Image: an image that was predicted wrongly]

As you can see below, I pass horizontal_flip = True to ImageDataGenerator:

train_datagen = ImageDataGenerator(
    rescale=1. / 255,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True)

How can I change my code so that the model also predicts flipped images correctly? Or should I add manually flipped images to my training dataset?


Solution

  • I would rotate the images randomly with ImageDataGenerator. Just specify the following argument:

    rotation_range: Int. Degree range for random rotations.

    Note that rotation_range picks an arbitrary angle from that range, not only multiples of 90 degrees. Alternatively, you can pass a preprocessing function to ImageDataGenerator, which gives you more flexibility.

    import numpy as np

    def orthogonal_rot(image):
        # Rotate by -90, 0, or 90 degrees, chosen at random
        return np.rot90(image, np.random.choice([-1, 0, 1]))

    train_datagen = ImageDataGenerator(
        preprocessing_function=orthogonal_rot)

    This function rotates each image by either -90, 0, or 90 degrees.

    (np.rot90() rotates the image counterclockwise by 90 degrees times the second parameter. Accordingly, -1 is -90 degrees, 0 is no rotation, 1 is 90 degrees, and 2 would be 180 degrees.)
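
If upside-down scans can also occur, the same idea extends to all four orthogonal orientations. The following is only a sketch under some assumptions: the name random_orthogonal_rot and the optional rng argument are made up for illustration, and the image is assumed to be a rank-3 NumPy array laid out as (height, width, channels).

```python
import numpy as np

def random_orthogonal_rot(image, rng=None):
    """Rotate an (H, W, C) image by a random multiple of 90 degrees.

    Hypothetical helper: `rng` is optional so the function can still be
    called with a single argument.
    """
    rng = np.random.default_rng() if rng is None else rng
    k = int(rng.integers(0, 4))  # 0, 1, 2 or 3 quarter turns
    # axes=(0, 1) rotates the height/width plane and leaves channels alone
    return np.rot90(image, k=k, axes=(0, 1))

# Square dummy image, so every rotation keeps the (H, W, C) shape.
image = np.arange(4 * 4 * 3, dtype=np.float32).reshape(4, 4, 3)
rotated = random_orthogonal_rot(image, rng=np.random.default_rng(0))
print(rotated.shape)  # (4, 4, 3)
```

It could then be passed as preprocessing_function=random_orthogonal_rot in place of orthogonal_rot. One caveat: the Keras documentation asks the preprocessing function to return an array of the same shape as its input, and a quarter turn swaps height and width. So this should only work as-is when img_width equals img_height; with a non-square target_size, only the 0 and 180 degree rotations keep the shape.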