python tensorflow keras tensorflow2.0 object-detection

How to resize ground truth boxes in Fast R-CNN


Fast R-CNN is an algorithm for object detection in images: we feed an image to a neural network and it outputs a list of objects and their categories within the image. Training relies on a list of annotated bounding boxes called "ground truth boxes". The algorithm compares the ground truth boxes with the boxes generated by Fast R-CNN and keeps only those that sufficiently overlap with the ground truth boxes. The problem is that the image must be resized before it is fed into the CNN. My question is: should we also resize the ground truth boxes before the comparison step, and how do we do that? Thanks in advance for any reply.


Solution

  • If the bounding boxes are relative (coordinates given as fractions of the image height and width), you don't need to change them, because 0.2 of the old height marks the same relative position as 0.2 of the new height, and so on.

    import tensorflow as tf
    import matplotlib.pyplot as plt
    import numpy as np
    from sklearn.datasets import load_sample_image
    
    china = load_sample_image('china.jpg')
    
    # [y_min, x_min, y_max, x_max], as fractions of the image height and width
    relative_boxes = [0.2, 0.2, 0.8, 0.8]
    
    original = tf.image.draw_bounding_boxes(
        tf.image.convert_image_dtype(tf.expand_dims(china, axis=0), tf.float32),
        np.array(relative_boxes).reshape([1, 1, 4]),
        [[1., 0., 1.], [1., 1., 1.]]
    )
    
    plt.imshow(tf.squeeze(original))
    plt.show()
    


    
    resized = tf.image.draw_bounding_boxes(
        tf.divide(
            tf.expand_dims(
                tf.image.resize(china, (200, 200)), axis=0),
            255),
        np.array(relative_boxes).reshape([1, 1, 4]),
        [[1., 0., 1.], [1., 1., 1.]]
    )
    
    plt.imshow(tf.squeeze(resized))
    plt.show()
    

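The answer above only covers relative boxes. If the ground truth boxes are stored as absolute pixel coordinates instead, they would need to be scaled by the same height and width ratios as the image. Here is a minimal sketch of that idea, not taken from the original answer: the helper names `scale_boxes` and `to_relative` and the image sizes are hypothetical, and the boxes are assumed to be in `[y_min, x_min, y_max, x_max]` order to match `tf.image.draw_bounding_boxes`.

```python
import numpy as np


def scale_boxes(boxes, old_size, new_size):
    """Rescale absolute pixel boxes after an image resize.

    boxes: array-like of shape (N, 4) in [y_min, x_min, y_max, x_max] order.
    old_size, new_size: (height, width) of the image before and after resizing.
    """
    boxes = np.asarray(boxes, dtype=np.float64)
    old_h, old_w = old_size
    new_h, new_w = new_size
    # y coordinates scale with the height ratio, x coordinates with the width ratio
    ratios = np.array([new_h / old_h, new_w / old_w,
                       new_h / old_h, new_w / old_w])
    return boxes * ratios


def to_relative(boxes, size):
    """Convert absolute pixel boxes to fractions of the image height and width."""
    boxes = np.asarray(boxes, dtype=np.float64)
    h, w = size
    return boxes / np.array([h, w, h, w], dtype=np.float64)


# Hypothetical example: a 400x800 image resized to 200x200
absolute_boxes = [[80, 160, 320, 640]]
resized_boxes = scale_boxes(absolute_boxes, old_size=(400, 800), new_size=(200, 200))
relative_boxes = to_relative(absolute_boxes, size=(400, 800))
print(resized_boxes)
print(relative_boxes)
```

Converting to relative coordinates once with `to_relative` may be the simpler option, since the boxes then stay valid for any later resize, as long as the resize does not crop or pad the image.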