Tags: python, keras, casting, image-segmentation, data-augmentation

Cast ImageDataGenerator Data Output


I'm writing a network for image segmentation. My masks are RGB images that contain only the values 0 and 255 (black and white), and this is the ImageDataGenerator I use for them:

train_mask_data_gen = ImageDataGenerator(rotation_range=10,
                                         width_shift_range=10,
                                         height_shift_range=10,
                                         zoom_range=0.3,
                                         horizontal_flip=True,
                                         vertical_flip=True,
                                         fill_mode='nearest',  # how pixels outside the input boundaries are filled after a transform
                                         cval=0,
                                         rescale=1./255)

And flow_from_directory:

train_mask_gen = train_mask_data_gen.flow_from_directory(os.path.join(training_dir, 'masks'),
                                                     target_size=(img_h, img_w),
                                                     batch_size=bs,
                                                     class_mode=None, # Because we have no class subfolders in this case
                                                     shuffle=True,
                                                     interpolation='nearest',#interpolation used for resizing
                                                     #color_mode='grayscale',
                                                     seed=SEED)

The code works fine. The only problem is that when I apply data augmentation to the masks, the images are no longer binary: I get some values strictly between 0 and 1 (after normalization). For example, if I print my output matrix (the image) I get something like this:

  [[0.         0.         0.        ]
   [0.         0.         0.        ]
   [0.         0.         0.        ]
   ...
   [1.         1.         1.        ]
   [1.         1.         1.        ]
   [1.         1.         1.        ]]

  ...

  [[0.         0.         0.        ]
   [0.3457849  0.3457849  0.3457849 ]
   [1.         1.         1.        ]
   ...
   [0.         0.         0.        ]
   [0.         0.         0.        ]
   [0.         0.         0.        ]]

This output also contains those "extra" values introduced by the augmentation. If I don't apply any augmentation, I get binary images as I wanted.

How can I embed the cast to integer, so that the values are only 0 or 1? I tried passing dtype=int to the ImageDataGenerator, but it doesn't change anything; I keep getting the same results.


Solution

  • Setting the dtype argument to 'uint8' worked for me:

    Original:

    datagen = ImageDataGenerator(dtype = 'float32')
    val_set = datagen.flow_from_directory(data_dir, batch_size=1, target_size = (257,144))
    

    Output:

    [[[ 52.  58.  61.]
    [ 53.  53.  61.]
    [ 54.  57.  66.]
    ...
    [  5.  12.   0.]
    [ 19.  26.  12.]
    [  1.  15.   0.]]]
    

    New:

    datagen = ImageDataGenerator(dtype = 'uint8')
    val_set = datagen.flow_from_directory(data_dir, batch_size=1, target_size = (257,144))
    

    Output:

       [[[ 52  58  61]
       [ 53  53  61]
       [ 54  57  66]
       ...
       [  5  12   0]
       [ 19  26  12]
       [  1  15   0]]]
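
    If the interpolated values should be snapped explicitly to the nearest of 0 and 1, another option is to threshold each batch after it leaves the generator. The following is only a minimal sketch under stated assumptions: `binarize_batches` and the 0.5 threshold are hypothetical choices, not part of Keras or of the question's code, and `fake_batch` is a hand-made stand-in for one augmented mask batch so the snippet runs without any image files.

```python
import numpy as np


def binarize_batches(mask_batches, threshold=0.5):
    """Yield each mask batch with every value forced to 0 or 1.

    Hypothetical helper: `mask_batches` is any iterable of float arrays
    already scaled to [0, 1], e.g. the iterator returned by
    flow_from_directory when rescale=1./255 is set.
    """
    for batch in mask_batches:
        # Values above the threshold become 1, everything else becomes 0.
        yield (np.asarray(batch) > threshold).astype(np.uint8)


# Stand-in for one augmented batch of shape (batch, height, width, channels),
# reusing the interpolated value 0.3457849 from the question's output.
fake_batch = np.array(
    [[[[0.0, 0.0, 0.0],
       [0.3457849, 0.3457849, 0.3457849],
       [1.0, 1.0, 1.0]]]],
    dtype=np.float32,
)

binary = next(binarize_batches([fake_batch]))
print(np.unique(binary))  # [0 1]
```

    With the question's code, the wrapper would presumably be applied as `train_mask_gen = binarize_batches(train_mask_gen)`. If the image generator is driven with the same seed, the image/mask pairing should be unaffected, since the wrapper only changes values and not the order of the batches.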