matlab, computer-vision, matlab-cvst

Intersection over union but replacing the union with the minimum area in MATLAB


I'm computing the intersection over union (IOU) of two overlapping images, and it works fine.

iou = area of overlap / area of union

While reading this documentation page - https://www.mathworks.com/help/vision/ref/bboxoverlapratio.html#expand_panel_heading_input_argument_d119e109624

I saw that there are two ways to compute the ratio, which differ in the denominator:

  1. area of overlap / area of union
  2. area of overlap / minimum area between the two

When is the min version helpful?


Solution

  • The minimum is typically used when you want to know how much overlap there is relative to one bounding box, whereas the union folds the areas of both boxes into the final measure. Using the minimum assumes there is a source bounding box you want to compare against, and asks how much the estimated bounding box overlaps that source box. Think of the smaller box as giving the upper bound on the score: the overlap can never exceed the smaller area, so dividing by the smaller area yields the largest ratio the pair can produce. A high value with the smaller of the two boxes in the denominator is therefore the best overlap we can report with respect to a source. If we chose the larger bounding box as the source instead, the measure would decrease because the denominator increases.

    From another perspective, the minimum version is useful if you know that the localized bounding box should start at the same location as the source bounding box. Under that assumption, comparing against a stationary bounding box makes sense.

    To make this concrete, consider the figure on the page you linked (not reproduced here), which shows two overlapping boxes, bboxA and bboxB:

    The standard IOU divides the overlap by the union of the two boxes, which you already know. In the minimum formulation, bboxB is visibly the smaller of the two boxes in area, so we measure how much bboxA overlaps bboxB, with bboxB treated as the source box. That value is the upper bound on the score: if you chose bboxA as the source instead, the denominator would grow and the similarity would decrease.
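
To put numbers on the difference, here is a minimal sketch in plain MATLAB that computes both ratios by hand. The box coordinates are made up for illustration, and the variable names (`ratioUnion`, `ratioMin`, `ratioLarger`) are my own. Boxes are assumed to follow the `[x y width height]` convention used by `bboxOverlapRatio`.

```matlab
% Two hypothetical boxes in [x y width height] form (values invented for illustration).
bboxA = [0 0 10 10];   % larger box, area 100
bboxB = [5 5 5 10];    % smaller box, area 50

% Width and height of the overlapping rectangle (zero if the boxes are disjoint).
overlapW = max(0, min(bboxA(1) + bboxA(3), bboxB(1) + bboxB(3)) - max(bboxA(1), bboxB(1)));
overlapH = max(0, min(bboxA(2) + bboxA(4), bboxB(2) + bboxB(4)) - max(bboxA(2), bboxB(2)));
areaOverlap = overlapW * overlapH;

areaA = bboxA(3) * bboxA(4);
areaB = bboxB(3) * bboxB(4);
areaUnion = areaA + areaB - areaOverlap;

ratioUnion  = areaOverlap / areaUnion;          % standard IOU
ratioMin    = areaOverlap / min(areaA, areaB);  % overlap relative to the smaller box
ratioLarger = areaOverlap / max(areaA, areaB);  % overlap relative to the larger box

fprintf('Union: %.2f\n', ratioUnion);    % prints Union: 0.20
fprintf('Min: %.2f\n', ratioMin);        % prints Min: 0.50
fprintf('Larger: %.2f\n', ratioLarger);  % prints Larger: 0.25
```

Here the smaller box is half covered (0.50), even though the standard IOU is only 0.20, and switching the denominator to the larger box lowers the score to 0.25. If you have the Computer Vision Toolbox, `bboxOverlapRatio(bboxA, bboxB, 'Union')` and `bboxOverlapRatio(bboxA, bboxB, 'Min')` should give the same first two numbers.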