python tensorflow machine-learning keras image-segmentation

Adjust Image Segmentation model (from TF tutorial) for binary masking


I need an image segmentation model in TensorFlow. The input is an image and a binary mask (each pixel is either masked or non-masked), and the output is an image mask with values 0 and 1.

I followed the Image segmentation tutorial from https://www.tensorflow.org/tutorials/images/segmentation

But now I want to run it with a binary mask (without the border class) on my own dataset. The new dataset is prepared and passed to model.fit, so that part should be fine.
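One thing that may be worth checking at this step, offered as a sketch rather than a diagnosis: the tutorial's normalize function subtracts 1 from the mask because its dataset labels are {1, 2, 3}. For a binary mask the target should end up as exactly 0 and 1. The function name normalize_binary and the idea that the masks are stored as 0/255 images are assumptions for illustration, not details from the post.

```python
import tensorflow as tf

def normalize_binary(input_image, input_mask):
    """Scale the image to [0, 1] and map any non-zero mask pixel to 1.0."""
    input_image = tf.cast(input_image, tf.float32) / 255.0
    # Assumption: background pixels are 0 and masked pixels are any value > 0.
    input_mask = tf.cast(input_mask > 0, tf.float32)
    return input_image, input_mask

# Tiny stand-in data: a white image and a 2x2 mask stored as 0/255.
image = tf.cast(tf.fill([2, 2, 3], 255), tf.uint8)
mask = tf.constant([[0, 255], [255, 0]], dtype=tf.uint8)[..., tf.newaxis]

norm_image, norm_mask = normalize_binary(image, mask)
mask_values = set(norm_mask.numpy().ravel().tolist())
print(mask_values)
```

If the masks already contain only 0 and 1, the comparison leaves them unchanged apart from the cast to float32.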

How do I change this model to only 2 classes (non-masked and masked)?

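import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers, Model
from tensorflow_examples.models.pix2pix import pix2pix

base_model: keras.Model = tf.keras.applications.MobileNetV2(input_shape=[128, 128, 3], include_top=False)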

# Use the activations of these layers
layer_names = [
    'block_1_expand_relu',   # 64x64
    'block_3_expand_relu',   # 32x32
    'block_6_expand_relu',   # 16x16
    'block_13_expand_relu',  # 8x8
    'block_16_project',      # 4x4
]
base_model_outputs = [base_model.get_layer(name).output for name in layer_names]

# Create the feature extraction model
down_stack = Model(inputs=base_model.input, outputs=base_model_outputs)

down_stack.trainable = False

up_stack = [
    pix2pix.upsample(512, 3),  # 4x4 -> 8x8
    pix2pix.upsample(256, 3),  # 8x8 -> 16x16
    pix2pix.upsample(128, 3),  # 16x16 -> 32x32
    pix2pix.upsample(64, 3),   # 32x32 -> 64x64
]

def unet_model(output_channels:int):
  inputs = layers.Input(shape=[128, 128, 3])

  # Downsampling through the model
  skips = down_stack(inputs)
  x = skips[-1]
  skips = reversed(skips[:-1])

  # Upsampling and establishing the skip connections
  for up, skip in zip(up_stack, skips):
    x = up(x)
    concat = layers.Concatenate()
    x = concat([x, skip])

  # This is the last layer of the model
  last = layers.Conv2DTranspose(
      filters=output_channels, kernel_size=3, strides=2,
      padding='same')  #64x64 -> 128x128

  x = last(x)

  return Model(inputs=inputs, outputs=x)

OUTPUT_CLASSES = 3

model = unet_model(output_channels=OUTPUT_CLASSES)

model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy'])

When I change OUTPUT_CLASSES to 2 it gives me an error:

W tensorflow/core/kernels/data/generator_dataset_op.cc:108] Error occurred when finalizing GeneratorDataset iterator: FAILED_PRECONDITION: Python interpreter state is not initialized. The process may be terminated.

When OUTPUT_CLASSES is 1, the predicted mask is empty.

Does something else need to be changed? I'm not familiar with NN architectures yet, so I may be missing something obvious.
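For context, there are two common ways to pair the output layer with a loss for two-class segmentation. The tutorial itself compiles with SparseCategoricalCrossentropy(from_logits=True), while the string 'binary_crossentropy' by default expects probabilities rather than raw logits. The sketch below shows both pairings on random stand-in tensors; the names head_a, head_b, features, and mask are hypothetical, and the 96-channel feature map is only a placeholder for the decoder output.

```python
import tensorflow as tf

# Option A: one output channel with sigmoid probabilities + binary cross-entropy.
head_a = tf.keras.layers.Conv2DTranspose(
    filters=1, kernel_size=3, strides=2, padding='same', activation='sigmoid')
loss_a = tf.keras.losses.BinaryCrossentropy()

# Option B: two output channels of raw logits + sparse categorical cross-entropy.
head_b = tf.keras.layers.Conv2DTranspose(
    filters=2, kernel_size=3, strides=2, padding='same')
loss_b = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)

# Stand-in tensors: a fake 64x64 decoder feature map and a sparse binary mask.
features = tf.random.normal([2, 64, 64, 96])
mask = tf.cast(tf.random.uniform([2, 128, 128, 1]) > 0.9, tf.float32)

probs = head_a(features)    # shape (2, 128, 128, 1), values in [0, 1]
logits = head_b(features)   # shape (2, 128, 128, 2), unbounded values

bce_value = loss_a(mask, probs)
sparse_value = loss_b(tf.cast(mask, tf.int32), logits)

# Turning each output into a 0/1 mask.
pred_a = tf.cast(probs > 0.5, tf.int32)
pred_b = tf.argmax(logits, axis=-1)[..., tf.newaxis]
print(pred_a.shape, pred_b.shape)
```

In both cases the target mask keeps a single channel holding 0 or 1; only the number of output channels and the loss differ.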

EDIT:

I have added activation='sigmoid' to the output layer:

  last = tf.keras.layers.Conv2DTranspose(
      filters=output_channels, kernel_size=3, strides=2,
      padding='same', activation='sigmoid')  #64x64 -> 128x128

  x = last(x)

and set OUTPUT_CLASSES = 1.

The weird behavior is this: when I train on a very small dataset (which includes the test picture and its mask, just to check how the model handles an image it has already seen), I get something on the first epoch. But the more epochs I train, the worse the result gets, even though the accuracy is ~0.99.

Expected mask: [image]

Predicted mask, epoch 0: [image]

If you open the image you may see a slight shadow where the expected mask should be.

Predicted mask, epoch 1: [image]
...

Predicted mask, epoch 4: [image]
So it's getting worse with each epoch.
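A high accuracy next to a fading mask is not necessarily a contradiction. If the masked region covers only a small share of the pixels, a prediction that is entirely background still scores very well on pixel accuracy. The sketch below illustrates this with NumPy on a made-up 128x128 mask; the mask size and the helper names pixel_accuracy and iou are assumptions for illustration, not measurements from the post.

```python
import numpy as np

def pixel_accuracy(y_true, y_pred):
    """Fraction of pixels where the prediction equals the label."""
    return float(np.mean(y_true == y_pred))

def iou(y_true, y_pred):
    """Intersection over union of the foreground (value 1) pixels."""
    intersection = np.logical_and(y_true == 1, y_pred == 1).sum()
    union = np.logical_or(y_true == 1, y_pred == 1).sum()
    # Two empty masks agree perfectly, so define that case as 1.0.
    return float(intersection / union) if union else 1.0

# Hypothetical ground truth: a 12x12 object in a 128x128 image.
true_mask = np.zeros((128, 128), dtype=np.uint8)
true_mask[60:72, 60:72] = 1

# A prediction that contains no foreground at all.
empty_pred = np.zeros_like(true_mask)

acc = pixel_accuracy(true_mask, empty_pred)
score = iou(true_mask, empty_pred)
print(acc, score)
```

Here the object covers 144 of 16384 pixels, so the empty prediction is right on more than 99% of them while its IoU is zero. An overlap-based metric tracked alongside accuracy would show whether the mask is actually improving.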

The dataset includes images that should show no mask. Maybe that's the issue? (Edit: I excluded the images without masks from the dataset; it did not help.)

EDIT 2:

Adding

x = tf.keras.layers.BatchNormalization()(x)

helped. The result is not perfect, but it is an improvement.


Solution

  • I found a solution: adding a batch normalization layer before the last layer (Conv2DTranspose):

    x = keras.layers.BatchNormalization()(x)
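Put together with the sigmoid output from the first edit, the end of unet_model might look roughly like the sketch below. The helper name add_output_head and the stand-alone 160-channel input are hypothetical, used here only so the snippet runs on its own; in the real model, x would be the tensor that comes out of the upsampling loop.

```python
import tensorflow as tf
from tensorflow import keras

def add_output_head(x, output_channels=1):
    """Batch-normalize the decoder features, then upsample to the mask."""
    # Batch normalization placed before the last layer, as in the solution.
    x = keras.layers.BatchNormalization()(x)
    last = keras.layers.Conv2DTranspose(
        filters=output_channels, kernel_size=3, strides=2,
        padding='same', activation='sigmoid')  # 64x64 -> 128x128
    return last(x)

# Stand-in for the decoder output (assumed shape, for illustration only).
decoder_out = keras.Input(shape=[64, 64, 160])
head = keras.Model(inputs=decoder_out, outputs=add_output_head(decoder_out))

probs = head(tf.zeros([1, 64, 64, 160]))
print(probs.shape)
```

With one sigmoid channel, the model compiles with loss='binary_crossentropy' as in the question, and a 0/1 mask can be obtained by thresholding the output, for example at 0.5.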