python pytorch conv-neural-network object-detection faster-rcnn

RuntimeError: The size of tensor a (10) must match the size of tensor b (3) at non-singleton dimension 0

I am using this intersection over union code to determine IOU from my predictions and targets:

import torch

def intersection_over_union(boxes_preds, boxes_labels):
    """
    Calculates intersection over union, element by element
    Parameters:
        boxes_preds (tensor): Predictions of Bounding Boxes (BATCH_SIZE, 4)
        boxes_labels (tensor): Correct labels of Bounding Boxes (BATCH_SIZE, 4)
        Both are expected in corner format (x1, y1, x2, y2).
    Returns:
        tensor: Intersection over union for all examples
    """

    box1_x1 = boxes_preds[..., 0:1]
    box1_y1 = boxes_preds[..., 1:2]
    box1_x2 = boxes_preds[..., 2:3]
    box1_y2 = boxes_preds[..., 3:4]  # (N, 1)
    box2_x1 = boxes_labels[..., 0:1]
    box2_y1 = boxes_labels[..., 1:2]
    box2_x2 = boxes_labels[..., 2:3]
    box2_y2 = boxes_labels[..., 3:4]

    x1 = torch.max(box1_x1, box2_x1)
    y1 = torch.max(box1_y1, box2_y1)
    x2 = torch.min(box1_x2, box2_x2)
    y2 = torch.min(box1_y2, box2_y2)

    # .clamp(0) is for the case when they do not intersect
    intersection = (x2 - x1).clamp(0) * (y2 - y1).clamp(0)

    box1_area = abs((box1_x2 - box1_x1) * (box1_y2 - box1_y1))
    box2_area = abs((box2_x2 - box2_x1) * (box2_y2 - box2_y1))

    return intersection / (box1_area + box2_area - intersection + 1e-6)

My target bounding boxes, printed with print(targets[0]['boxes']), look like this:

tensor([[217., 481., 249., 511.],
        [435., 191., 467., 223.],
        [471.,  86., 503., 118.]])

And my prediction bounding boxes, from predictions['boxes'], look like this:

tensor([[ 29.7859, 354.9666,  63.0900, 387.6363],
        [469.1072,  85.6840, 503.1974, 119.7137],
        [ 89.3957, 314.1584, 123.9789, 347.1621],
        [432.2971, 188.4454, 468.4712, 227.3808],
        [214.5407, 482.0136, 248.7030, 512.0000],
        [329.1979, 340.8802, 366.3720, 375.8683],
        [298.5089,  99.0098, 334.4280, 129.4205],
        [  0.0000, 347.7724,  17.3409, 384.5709],
        [485.4312, 181.3882, 512.0000, 213.2009],
        [144.5959, 356.5197, 183.4489, 387.4958]])

However, when I apply the IOU function:

iou = intersection_over_union(predictions['boxes'], targets[0]['boxes'])

I get this error:

RuntimeError: The size of tensor a (10) must match the size of tensor b (3) at non-singleton dimension 0

I'm not sure how to fix the function. I'm guessing the error means that I have more predictions (10) than targets (3), and the function expects both tensors to have the same number of rows.
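For reference, the mismatch can be reproduced in isolation. This is only a minimal sketch, assuming zero-filled stand-in tensors with the same shapes as the boxes above (the names preds and labels are illustrative, not from my actual code). It shows that the elementwise torch.max call is what fails, and that adding singleton axes lets the two shapes broadcast:

```python
import torch

# Stand-ins with the same shapes as predictions['boxes'] and targets[0]['boxes'].
preds = torch.zeros(10, 4)
labels = torch.zeros(3, 4)

# Slicing with 0:1 keeps the last dimension, so the shapes are (10, 1) and (3, 1).
a = preds[..., 0:1]
b = labels[..., 0:1]

try:
    # Elementwise max: dimension 0 is 10 vs 3 and neither is 1, so broadcasting fails.
    torch.max(a, b)
    error_message = None
except RuntimeError as exc:
    error_message = str(exc)

print(error_message)

# With singleton axes the shapes become (10, 1, 1) and (1, 3, 1), which broadcast.
pairwise = torch.max(a[:, None, :], b[None, :, :])
print(pairwise.shape)
```

So the function as written only compares box i of the predictions with box i of the targets. It has no notion of comparing every prediction with every target.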

Solution

I've instead opted for torchvision's IOU function:

iou = torchvision.ops.box_iou(predictions['boxes'], targets[0]['boxes'])

This runs without errors and returns one IOU value for every (prediction, target) pair.

I alternatively found this implementation (from https://github.com/amdegroot/ssd.pytorch/blob/master/layers/box_utils.py#L48):


def intersect(box_a, box_b):
    """ We resize both tensors to [A,B,2] without new malloc:
    [A,2] -> [A,1,2] -> [A,B,2]
    [B,2] -> [1,B,2] -> [A,B,2]
    Then we compute the area of intersect between box_a and box_b.
    Args:
      box_a: (tensor) bounding boxes, Shape: [A,4].
      box_b: (tensor) bounding boxes, Shape: [B,4].
    Return:
      (tensor) intersection area, Shape: [A,B].
    """
    A = box_a.size(0)
    B = box_b.size(0)
    max_xy = torch.min(box_a[:, 2:].unsqueeze(1).expand(A, B, 2),
                       box_b[:, 2:].unsqueeze(0).expand(A, B, 2))
    min_xy = torch.max(box_a[:, :2].unsqueeze(1).expand(A, B, 2),
                       box_b[:, :2].unsqueeze(0).expand(A, B, 2))
    inter = torch.clamp((max_xy - min_xy), min=0)
    return inter[:, :, 0] * inter[:, :, 1]


def jaccard(box_a, box_b):
    """Compute the jaccard overlap of two sets of boxes.  The jaccard overlap
    is simply the intersection over union of two boxes.  Here we operate on
    ground truth boxes and default boxes.
    E.g.:
        A ∩ B / A ∪ B = A ∩ B / (area(A) + area(B) - A ∩ B)
    Args:
        box_a: (tensor) Ground truth bounding boxes, Shape: [num_objects,4]
        box_b: (tensor) Prior boxes from priorbox layers, Shape: [num_priors,4]
    Return:
        jaccard overlap: (tensor) Shape: [box_a.size(0), box_b.size(0)]
    """
    inter = intersect(box_a, box_b)
    area_a = ((box_a[:, 2]-box_a[:, 0]) *
              (box_a[:, 3]-box_a[:, 1])).unsqueeze(1).expand_as(inter)  # [A,B]
    area_b = ((box_b[:, 2]-box_b[:, 0]) *
              (box_b[:, 3]-box_b[:, 1])).unsqueeze(0).expand_as(inter)  # [A,B]
    union = area_a + area_b - inter
    return inter / union  # [A,B]

Applying jaccard produces the same output as torchvision's own function:

iou = jaccard(predictions['boxes'], targets[0]['boxes'])

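An example of the IOU tensor when printed. This one has shape [8, 2] (8 predictions, 2 targets), so it does not correspond to the 10 predictions and 3 targets shown above: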

tensor([[0.0000, 0.0000],
        [0.0000, 0.0000],
        [0.0000, 0.0000],
        [0.0000, 0.0000],
        [0.0000, 0.9322],
        [0.8021, 0.0000],
        [0.0000, 0.0000],
        [0.0000, 0.0000]])
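The same broadcasting idea can also be applied to the original function directly. The following is a rough sketch rather than a drop-in replacement: pairwise_iou is a hypothetical name, and it assumes boxes in corner format (x1, y1, x2, y2). It keeps the 1e-6 term from the original function to avoid division by zero:

```python
import torch

def pairwise_iou(boxes_a, boxes_b, eps=1e-6):
    """IOU between every box in boxes_a [A, 4] and every box in boxes_b [B, 4].

    Boxes are assumed to be in (x1, y1, x2, y2) format. Returns a tensor of shape [A, B].
    """
    a = boxes_a[:, None, :]  # [A, 1, 4]
    b = boxes_b[None, :, :]  # [1, B, 4]

    # Corners of the intersection rectangle, broadcast to [A, B, 2].
    top_left = torch.max(a[..., :2], b[..., :2])
    bottom_right = torch.min(a[..., 2:], b[..., 2:])

    # Clamp at 0 so non-overlapping boxes get zero intersection.
    wh = (bottom_right - top_left).clamp(min=0)
    inter = wh[..., 0] * wh[..., 1]  # [A, B]

    area_a = (a[..., 2] - a[..., 0]) * (a[..., 3] - a[..., 1])  # [A, 1]
    area_b = (b[..., 2] - b[..., 0]) * (b[..., 3] - b[..., 1])  # [1, B]

    return inter / (area_a + area_b - inter + eps)

# Toy boxes: the first overlaps the reference box in a 1x1 square, the second not at all.
boxes_a = torch.tensor([[0., 0., 2., 2.],
                        [10., 10., 12., 12.]])
boxes_b = torch.tensor([[1., 1., 3., 3.]])

iou = pairwise_iou(boxes_a, boxes_b)
print(iou.shape)
print(iou)
```

For the first pair the intersection is 1 and the union is 4 + 4 - 1 = 7, so the IOU is roughly 1/7. The second pair does not overlap and gives 0.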