I'm trying to get the pixel coordinates of the bounding boxes for the person class, which is labeled in mscoco_label_map.pbtxt as:
item {
name: "/m/01g317"
id: 1
display_name: "person"
}
Currently I'm drawing the bounding boxes and labels onto the image with:
input_tensor = tf.convert_to_tensor(np.expand_dims(image_np, 0), dtype=tf.float32)
detections, predictions_dict, shapes = detect_fn(input_tensor)
label_id_offset = 1
image_np_with_detections = image_np.copy()
viz_utils.visualize_boxes_and_labels_on_image_array(
image_np_with_detections,
detections['detection_boxes'][0].numpy(),
(detections['detection_classes'][0].numpy() + label_id_offset).astype(int),
detections['detection_scores'][0].numpy(),
category_index,
use_normalized_coordinates=True,
max_boxes_to_draw=3,
min_score_thresh=.30,
agnostic_mode=False)
(All of this code is inside a while loop.)
But when I print out detections['detection_boxes'] I get a large number of normalized coordinates, and I don't know how to match those coordinates to a specific box, e.g. the one with the person label.
So how can I get the pixel coordinates of a specific bounding box in detections['detection_boxes']?
New to StackOverflow, so any tips are greatly appreciated.
So detection_boxes should be an N by 4 array of bounding box co-ordinates in the form [ymin, xmin, ymax, xmax] in normalised co-ordinates, and detection_classes should be an array of (float?) numeric class labels. I haven't used the object detection API since early last year, so I'm assuming they haven't changed the API much.
You should be able to do something like this to convert to pixel co-ordinates and then select the boxes for just one label.
detection_boxes = detections['detection_boxes'][0].numpy()
detection_classes = detections['detection_classes'][0].numpy().astype(int) + label_id_offset
detection_scores = detections['detection_scores'][0].numpy()
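# Scale to pixel co-ordinates (IMAGE_HEIGHT and IMAGE_WIDTH are the height and width of image_np, i.e. image_np.shape[0] and image_np.shape[1])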
detection_boxes[:, (0, 2)] *= IMAGE_HEIGHT
detection_boxes[:, (1, 3)] *= IMAGE_WIDTH
# Select person boxes (PERSON_CLASS_ID is 1 in the label map above; SCORE_THRESH is your score cutoff, e.g. 0.30)
cond = (detection_classes == PERSON_CLASS_ID) & (detection_scores >= SCORE_THRESH)
person_boxes = detection_boxes[cond, :]
person_boxes = np.round(person_boxes).astype(int)
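The snippet above leaves IMAGE_HEIGHT, IMAGE_WIDTH, PERSON_CLASS_ID and SCORE_THRESH undefined, so here is a minimal self-contained sketch of the same idea that runs without a model. The helper name person_boxes_in_pixels and the synthetic arrays are made up for illustration. In the real loop you would pass in the .numpy() arrays from detections (with label_id_offset already added to the classes) and take the height and width from image_np.shape[:2].

```python
import numpy as np


def person_boxes_in_pixels(boxes, classes, scores, image_height, image_width,
                           person_class_id=1, score_thresh=0.3):
    """Return [ymin, xmin, ymax, xmax] pixel boxes for a single class.

    Hypothetical helper: `boxes` is an N x 4 array of normalized
    [ymin, xmin, ymax, xmax] values, `classes` and `scores` have length N.
    """
    # Work on a float copy so the caller's normalized boxes stay untouched.
    boxes = np.array(boxes, dtype=np.float64)
    boxes[:, [0, 2]] *= image_height  # ymin, ymax
    boxes[:, [1, 3]] *= image_width   # xmin, xmax

    # Keep only rows of the wanted class that clear the score threshold.
    keep = (np.asarray(classes) == person_class_id) & (np.asarray(scores) >= score_thresh)
    return np.round(boxes[keep]).astype(int)


# Synthetic stand-ins for the detector output (assumed values, not real detections).
boxes = np.array([[0.10, 0.20, 0.50, 0.60],   # person, high score
                  [0.00, 0.00, 1.00, 1.00],   # a different class
                  [0.25, 0.25, 0.75, 0.75]])  # person, but below the threshold
classes = np.array([1, 18, 1])
scores = np.array([0.9, 0.8, 0.2])

result = person_boxes_in_pixels(boxes, classes, scores,
                                image_height=480, image_width=640)
print(result)
```

Each row of the result lines up with one kept detection, so if you also need the scores you can apply the same boolean mask to the scores array.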