
Why is the deeplab v3+ model confused about pixels outside image boundary?


I'm using the Google Research GitHub repository to run DeepLab v3+ on my dataset to segment parts of a car. The crop size I've used is 513,513 (the default), and the code adds a boundary to images smaller than that size (correct me if I'm wrong).

[example image]

The model seems to be performing poorly on the added boundary. Is there something I'm supposed to correct, or will the model do fine with more training?

Update: here are the TensorBoard graphs from training. Why is the regularization loss shooting up like that? The output seems to be improving; can someone help me interpret these graphs?


Solution

  • Is there something I'm supposed to correct or will the model do fine with more training?

    It's OK; don't mind the boundary. The padding only exists to bring small images up to the fixed crop size, so you can simply ignore (or crop away) the predictions on the padded region.
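
    For context, this is roughly what the repo's padding step does for images smaller than the crop size. A minimal numpy sketch, assuming the repo defaults (mean pixel 127.5, ignore label 255); the helper name pad_to_crop_size is mine, not the repo's:

    import numpy as np

    def pad_to_crop_size(image, label, crop_h=513, crop_w=513,
                         mean_pixel=127, ignore_label=255):
        """Pads an image/label pair up to the training crop size.

        Assumes the image is smaller than the crop; mean_pixel
        approximates the repo default of 127.5.
        """
        h, w = image.shape[:2]
        pad_h, pad_w = crop_h - h, crop_w - w
        # The image is padded with the mean pixel value, so the added
        # boundary looks grey rather than black...
        padded_image = np.pad(image, ((0, pad_h), (0, pad_w), (0, 0)),
                              constant_values=mean_pixel)
        # ...and the label map is padded with ignore_label, so the padded
        # pixels contribute nothing to the training loss.
        padded_label = np.pad(label, ((0, pad_h), (0, pad_w)),
                              constant_values=ignore_label)
        return padded_image, padded_label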


    For inference, you can use this code:

    import cv2
    import numpy as np
    import tensorflow as tf  # TF 1.x API; use tf.compat.v1 under TF 2.x
    from PIL import Image
    from skimage.transform import resize
    
    
    
    class DeepLabModel():
        """Class to load a DeepLab model and run inference."""

        INPUT_TENSOR_NAME = 'ImageTensor:0'
        OUTPUT_TENSOR_NAME = 'SemanticPredictions:0'
        INPUT_SIZE = 513

        def __init__(self, path):
            """Creates and loads a pretrained DeepLab model."""
            self.graph = tf.Graph()

            graph_def = None
            # Load the frozen graph from the exported .pb file.
            with tf.gfile.GFile(path, 'rb') as file_handle:
                graph_def = tf.GraphDef.FromString(file_handle.read())

            if graph_def is None:
                raise RuntimeError('Cannot find inference graph')

            with self.graph.as_default():
                tf.import_graph_def(graph_def, name='')

            self.sess = tf.Session(graph=self.graph)
    
    
        def run(self, image):
            """Runs inference on a single image.

            Args:
              image: A PIL.Image object, raw input image.

            Returns:
              seg_map: np.array whose pixel values are class ids.
            """
            width, height = image.size

            # Scale the longer side down to INPUT_SIZE, keeping the
            # aspect ratio.
            resize_ratio = 1.0 * self.INPUT_SIZE / max(width, height)
            target_size = (int(resize_ratio * width), int(resize_ratio * height))
            resized_image = image.convert('RGB').resize(target_size, Image.ANTIALIAS)

            batch_seg_map = self.sess.run(
                self.OUTPUT_TENSOR_NAME,
                feed_dict={self.INPUT_TENSOR_NAME: [np.asarray(resized_image)]})

            seg_map = batch_seg_map[0]

            # Resize back to the original resolution; order=0 (nearest
            # neighbour) keeps the class ids intact.
            seg_map = resize(seg_map.astype(np.uint8), (height, width),
                             preserve_range=True, order=0, anti_aliasing=False)

            return seg_map
    

    The code is based on the official demo notebook (https://github.com/tensorflow/models/blob/master/research/deeplab/deeplab_demo.ipynb). You can use it like this:

    
    model = DeepLabModel(your_model_pb_path)
    img = Image.open(img_path)
    seg_map = model.run(img)
    
    
    

    To get your_model_pb_path, you need to export your model to a .pb file. You can do this with the export_model.py script in the DeepLab repo: https://github.com/tensorflow/models/blob/master/research/deeplab/export_model.py

    If you trained the xception_65 variant:

    python3 <path to your deeplab folder>/export_model.py \
      --logtostderr \
      --checkpoint_path=<your ckpt> \
      --export_path="./my_model.pb" \
      --model_variant="xception_65" \
      --atrous_rates=6 \
      --atrous_rates=12 \
      --atrous_rates=18 \
      --output_stride=16 \
      --decoder_output_stride=4 \
      --num_classes=<NUMBER OF YOUR CLASSES> \
      --crop_size=513 \
      --crop_size=513 \
      --inference_scales=1.0 
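
    Note that --crop_size is passed twice on purpose: it is a multi-value flag, and the two values are the crop height and the crop width.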
    

    <your ckpt> is the path to your trained model checkpoint. You can find checkpoints in the folder you passed as --train_logdir when training.

    You need to include only the model name and the number of iterations in the path. In other words, if your training folder contains, for example, the files model-1500.meta, model-1500.index and model-1500.data-00000-of-00001, you discard everything after the ., so the ckpt path will be model-1500.
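
    If a checkpoint file is present in that folder, you can also let TensorFlow resolve the newest prefix for you; a minimal sketch:

    import tensorflow as tf

    # Reads the 'checkpoint' file in the training folder and returns the
    # newest prefix, e.g. '<train_logdir>/model-1500', which is what you
    # pass as --checkpoint_path.
    ckpt_path = tf.train.latest_checkpoint('<path to your train_logdir>')
    print(ckpt_path)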

    Please make sure the atrous_rates are the same as the ones you used to train the model.

    If you trained the mobilenet_v2 variant:

    python3 <path to your deeplab folder>/export_model.py \
      --logtostderr \
      --checkpoint_path=<your ckpt> \
      --export_path="./my_model.pb" \
      --model_variant="mobilenet_v2" \
      --num_classes=<NUMBER OF YOUR CLASSES>  \
      --crop_size=513 \
      --crop_size=513 \
      --inference_scales=1.0
    

    You can find more details in these scripts: https://github.com/tensorflow/models/blob/master/research/deeplab/local_test_mobilenetv2.sh and https://github.com/tensorflow/models/blob/master/research/deeplab/local_test.sh


    You can visualize the results with this code:

    img_arr = np.array(img)

    # as many colors as you have classes (N_CLASSES)
    colors = [(255, 0, 0), (0, 255, 0), ...]

    for c in range(N_CLASSES):
        # blend the original pixels of class c with the class color
        img_arr[seg_map == c] = (0.5 * img_arr[seg_map == c]
                                 + 0.5 * np.array(colors[c])).astype(np.uint8)

    # PIL loads images as RGB, while OpenCV expects BGR, so convert first
    cv2.imshow('segmentation', cv2.cvtColor(img_arr, cv2.COLOR_RGB2BGR))
    cv2.waitKey(0)
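
    If you are running without a display, you can write the overlay to disk instead:

    cv2.imwrite('overlay.png', cv2.cvtColor(img_arr, cv2.COLOR_RGB2BGR))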