Tags: python, tensorflow, keras, mask, image-segmentation

Segmentation mask turns into a 1-dimensional array after being put into a Keras data generator


I was trying to create a segmentation model using a plane image dataset from https://www.kaggle.com/datasets/metavision/accurate-plane-shapessegmentation?resource=download

Both the images and the masks are 1280x720 px. I've put them into separate image and mask generators, which are later zipped into training and validation datasets. But for some reason each mask turns into a one-element array holding a single digit. When I try to plot the images, nothing is shown, because matplotlib raises "TypeError: Invalid shape (1,) for image data".

I wrote two generators, which run without errors:

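# Import assumed; the original post does not show it.
from tensorflow.keras.preprocessing.image import ImageDataGenerator

SEED = 100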

train_image_generator = ImageDataGenerator(
    rescale=1./255,
    #width_shift_range = 0.1,
    #height_shift_range = 0.1,
    #rotation_range = 10,
    #zoom_range = 0.1,
    validation_split=0.2
)


train_image_flow = train_image_generator.flow_from_directory(data_directory + "images/",
                                                             batch_size = 16, 
                                                             target_size = (720, 1280),
                                                             subset='training',
                                                             seed = SEED)

train_mask_generator = ImageDataGenerator(
    rescale=1./255,
    #rescale -= 1,
    #width_shift_range = 0.1,
    #height_shift_range = 0.1,
    #rotation_range = 10,
    #zoom_range = 0.1,
    #preprocessing_function = mask_preprocessing,
    validation_split=0.2
)



train_mask_flow = train_mask_generator.flow_from_directory(data_directory +"masks/", 
                                                           batch_size = 16, 
                                                           target_size = (720, 1280), 
                                                           subset='training',
                                                           seed = SEED)


valid_image_flow = train_image_generator.flow_from_directory(data_directory + "images/",
                                                             batch_size = 16, 
                                                             target_size = (720, 1280), 
                                                             subset='validation',
                                                             seed = SEED)

valid_mask_flow = train_mask_generator.flow_from_directory(data_directory +"masks/", 
                                                           batch_size = 16, 
                                                           target_size = (720, 1280), 
                                                           subset='validation',
                                                           seed = SEED)


print(train_mask_flow[0][1])

def my_image_mask_generator(image_data_generator, mask_data_generator): 
    train_generator = zip(image_data_generator, mask_data_generator)
    for (img, mask) in train_generator:
        yield (img, mask)

train_generator = my_image_mask_generator(train_image_flow, train_mask_flow)
valid_generator = my_image_mask_generator(valid_image_flow, valid_mask_flow)

But for some reason the mask for every image turns into a one-element array. When I print the mask array I just get [[1,]]. If I try to plot the image and mask, I get "ValueError: Expected image array to have rank 3 (single image). Got array with shape: (1,)".


I'm a total newbie in TensorFlow, but this issue seems really strange.


Solution

  • In the line in the attached image (don't attach code as images by the way):

    for images, masks in next(train_generator):
    

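    both the images and the masks end up in the variable images, one after the other as the loop runs - check the shape. The value you're getting in masks is the class label of the image: flow_from_directory defaults to class_mode='categorical', so each flow yields a (batch, labels) pair rather than a bare batch. If you pass class_mode=None to your flow_from_directory calls, each flow yields only the batch, and unpacking with images, masks = next(train_generator) instead of looping should give the output you expect.

    Alternatively, keep the default class_mode and take the first element of each pair: with img and mask as yielded by your generator, the image batch is img[0] and the mask batch is mask[0].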
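
A minimal sketch of the point above, assuming each flow yields either a bare batch (class_mode=None) or a (batch, labels) tuple (the default). The names fake_flow and paired_batches are hypothetical and not part of Keras; fake_flow only stands in for flow_from_directory so the snippet runs without TensorFlow or the dataset on disk.

```python
def fake_flow(name, n_batches, with_labels):
    """Hypothetical stand-in for a Keras directory flow.

    Yields (batch, labels) when with_labels is True (like the default
    class_mode) and the bare batch otherwise (like class_mode=None).
    """
    for i in range(n_batches):
        batch = f"{name}_batch_{i}"  # stands in for a (16, 720, 1280, 3) array
        labels = [[1.0]]             # stands in for the class labels
        yield (batch, labels) if with_labels else batch


def paired_batches(image_flow, mask_flow):
    """Yield (image_batch, mask_batch) pairs, dropping class labels if present."""
    for img_item, mask_item in zip(image_flow, mask_flow):
        img_batch = img_item[0] if isinstance(img_item, tuple) else img_item
        mask_batch = mask_item[0] if isinstance(mask_item, tuple) else mask_item
        yield img_batch, mask_batch


# With labels enabled, index [1] of an item is the label, not the mask.
default_item = next(fake_flow("mask", 1, with_labels=True))
print(default_item[1])

# Unpack one pair directly instead of looping over next(...).
gen = paired_batches(fake_flow("image", 2, True), fake_flow("mask", 2, True))
images, masks = next(gen)
print(images, masks)
```

With the real flows, the equivalent call would be paired_batches(train_image_flow, train_mask_flow). If class_mode=None is passed to flow_from_directory, the isinstance checks become unnecessary and the original zip-based generator should work as is.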