
Why is the deeplab v3+ model confused about pixels outside image boundary?


I'm using the Google Research GitHub repository to run DeepLab v3+ on my dataset to segment parts of a car. The crop size I've used is 513,513 (the default), and the code adds a boundary to images smaller than that size (correct me if I'm wrong).

[example image]

The model seems to be performing poorly on the added boundary. Is there something I'm supposed to correct, or will the model do fine with more training?

Update: here are the TensorBoard graphs from training. Why is the regularization loss shooting up like that? The output seems to be improving; can someone help me interpret these graphs?


Solution

  • Is there something I'm supposed to correct or will the model do fine with more training?

    It's OK; don't mind the boundary. The padding only exists to bring small images up to the fixed crop size, so you can simply ignore (or crop away) the predictions on the padded region.
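
    For context, this is roughly what the repo's padding step does for images smaller than the crop size. A minimal numpy sketch, assuming the repo defaults (mean pixel 127.5, ignore label 255); the helper name pad_to_crop_size is mine, not the repo's:

    import numpy as np

    def pad_to_crop_size(image, label, crop_h=513, crop_w=513,
                         mean_pixel=127, ignore_label=255):
        """Pads an image/label pair up to the training crop size.

        Assumes the image is smaller than the crop; mean_pixel
        approximates the repo default of 127.5.
        """
        h, w = image.shape[:2]
        pad_h, pad_w = crop_h - h, crop_w - w
        # The image is padded with the mean pixel value, so the added
        # boundary looks grey rather than black...
        padded_image = np.pad(image, ((0, pad_h), (0, pad_w), (0, 0)),
                              constant_values=mean_pixel)
        # ...and the label map is padded with ignore_label, so the padded
        # pixels contribute nothing to the training loss.
        padded_label = np.pad(label, ((0, pad_h), (0, pad_w)),
                              constant_values=ignore_label)
        return padded_image, padded_label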


    For inference, you can use this code:

    import cv2
    import numpy as np
    import tensorflow as tf  # TF 1.x API; use tf.compat.v1 under TF 2.x
    from PIL import Image
    from skimage.transform import resize
    
    
    
    class DeepLabModel():
        """Class to load a DeepLab model and run inference."""

        INPUT_TENSOR_NAME = 'ImageTensor:0'
        OUTPUT_TENSOR_NAME = 'SemanticPredictions:0'
        INPUT_SIZE = 513

        def __init__(self, path):
            """Creates and loads a pretrained DeepLab model."""
            self.graph = tf.Graph()

            graph_def = None
            # Load the frozen graph from the exported .pb file.
            with tf.gfile.GFile(path, 'rb') as file_handle:
                graph_def = tf.GraphDef.FromString(file_handle.read())

            if graph_def is None:
                raise RuntimeError('Cannot find inference graph')

            with self.graph.as_default():
                tf.import_graph_def(graph_def, name='')

            self.sess = tf.Session(graph=self.graph)
    
    
        def run(self, image):
            """Runs inference on a single image.

            Args:
              image: A PIL.Image object, raw input image.

            Returns:
              seg_map: np.array whose pixel values are class ids.
            """
            width, height = image.size

            # Scale the longer side down to INPUT_SIZE, keeping the
            # aspect ratio.
            resize_ratio = 1.0 * self.INPUT_SIZE / max(width, height)
            target_size = (int(resize_ratio * width), int(resize_ratio * height))
            resized_image = image.convert('RGB').resize(target_size, Image.ANTIALIAS)

            batch_seg_map = self.sess.run(
                self.OUTPUT_TENSOR_NAME,
                feed_dict={self.INPUT_TENSOR_NAME: [np.asarray(resized_image)]})

            seg_map = batch_seg_map[0]

            # Resize back to the original resolution; order=0 (nearest
            # neighbour) keeps the class ids intact.
            seg_map = resize(seg_map.astype(np.uint8), (height, width),
                             preserve_range=True, order=0, anti_aliasing=False)

            return seg_map
    

    The code is based on the official demo notebook (https://github.com/tensorflow/models/blob/master/research/deeplab/deeplab_demo.ipynb). You can use it like this:

    
    model = DeepLabModel(your_model_pb_path)
    img = Image.open(img_path)
    seg_map = model.run(img)
    
    
    

    To get your_model_pb_path, you need to export your model to a .pb file. You can do this with the export_model.py script in the DeepLab repo: https://github.com/tensorflow/models/blob/master/research/deeplab/export_model.py

    If you trained the xception_65 variant:

    python3 <path to your deeplab folder>/export_model.py \
      --logtostderr \
      --checkpoint_path=<your ckpt> \
      --export_path="./my_model.pb" \
      --model_variant="xception_65" \
      --atrous_rates=6 \
      --atrous_rates=12 \
      --atrous_rates=18 \
      --output_stride=16 \
      --decoder_output_stride=4 \
      --num_classes=<NUMBER OF YOUR CLASSES> \
      --crop_size=513 \
      --crop_size=513 \
      --inference_scales=1.0 
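
    Note that --crop_size is passed twice on purpose: it is a multi-value flag, and the two values are the crop height and the crop width.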
    

    <your ckpt> is the path to your trained model checkpoint. You can find checkpoints in the folder you passed as --train_logdir when training.

    You need to include only the model name and the number of iterations in the path. In other words, if your training folder contains, for example, the files model-1500.meta, model-1500.index and model-1500.data-00000-of-00001, you discard everything after the ., so the ckpt path will be model-1500.
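
    If a checkpoint file is present in that folder, you can also let TensorFlow resolve the newest prefix for you; a minimal sketch:

    import tensorflow as tf

    # Reads the 'checkpoint' file in the training folder and returns the
    # newest prefix, e.g. '<train_logdir>/model-1500', which is what you
    # pass as --checkpoint_path.
    ckpt_path = tf.train.latest_checkpoint('<path to your train_logdir>')
    print(ckpt_path)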

    Please make sure the atrous_rates are the same as the ones you used to train the model.

    If you trained the mobilenet_v2 variant:

    python3 <path to your deeplab folder>/export_model.py \
      --logtostderr \
      --checkpoint_path=<your ckpt> \
      --export_path="./my_model.pb" \
      --model_variant="mobilenet_v2" \
      --num_classes=<NUMBER OF YOUR CLASSES>  \
      --crop_size=513 \
      --crop_size=513 \
      --inference_scales=1.0
    

    You can find more details in these scripts: https://github.com/tensorflow/models/blob/master/research/deeplab/local_test_mobilenetv2.sh and https://github.com/tensorflow/models/blob/master/research/deeplab/local_test.sh


    You can visualize the results with this code:

    img_arr = np.array(img)

    # as many colors as you have classes (N_CLASSES)
    colors = [(255, 0, 0), (0, 255, 0), ...]

    for c in range(N_CLASSES):
        # blend the original pixels of class c with the class color
        img_arr[seg_map == c] = (0.5 * img_arr[seg_map == c]
                                 + 0.5 * np.array(colors[c])).astype(np.uint8)

    # PIL loads images as RGB, while OpenCV expects BGR, so convert first
    cv2.imshow('segmentation', cv2.cvtColor(img_arr, cv2.COLOR_RGB2BGR))
    cv2.waitKey(0)
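
    If you are running without a display, you can write the overlay to disk instead:

    cv2.imwrite('overlay.png', cv2.cvtColor(img_arr, cv2.COLOR_RGB2BGR))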