python pytorch yolo yolov7

What are the keypoints in YOLOv7 pose?


I am trying to understand the keypoint output of YOLOv7, but I couldn't find enough information about it.

I have the following output:

array([          0,           0,      430.44,      476.19,      243.75,         840,     0.94348,      402.75,       128.5,     0.99902,       417.5,      114.25,     0.99658,       385.5,         115,     0.99609,      437.75,       125.5,     0.89209,      366.75,         128,     0.66406,         471,      229.62,
           0.97754,      346.75,      224.88,     0.97705,         526,      322.75,     0.95654,       388.5,      340.75,     0.95898,       424.5,      314.75,     0.94873,       483.5,       335.5,      0.9502,       465.5,      457.75,     0.99219,       381.5,      456.25,     0.99219,       451.5,         649,
           0.98584,      379.25,       649.5,     0.98633,       446.5,         818,     0.92285,         366,       829.5,      0.9248])

The paper https://arxiv.org/pdf/2204.06806.pdf says "So, in total there are 51 elements for 17 keypoints associated with an anchor. " but the length of my output is 58.

There are 18 numbers that are probably keypoint confidences:

array([     0.94348,     0.99902,     0.99658,     0.99609,     0.89209,     0.66406, 0.97754,     0.97705,     0.95654,     0.95898,     0.94873, 0.9502,     0.99219,     0.99219,
           0.98584,     0.98633,     0.92285,     0.9248])

But the paper says there are 17 keypoints.

This notebook https://github.com/retkowsky/Human_pose_estimation_with_YoloV7/blob/main/Human_pose_estimation_YoloV7.ipynb says that the keypoints are the following:

enter image description here

but that shape doesn't match the prediction:

enter image description here

Is the first image right about the keypoints?

And what are the first four values?

  0,           0,      430.44,      476.19

Thanks

EDIT

This is not a complete answer, but by editing the plot function I was able to get the following information.

Given the following output keypoint:

array([[          0,           0,      312.31,         486,      291.75,       916.5,     0.94974,       304.5,      118.75,     0.99902,      320.75,      102.25,     0.99756,      287.75,      103.25,     0.99658,         345,         112,     0.96338,      268.25,      115.25,     0.69531,         394,
             226.25,     0.98145,      228.25,      230.12,     0.98389,       428.5,       358.5,     0.95898,      192.88,      364.75,     0.96533,         407,      464.25,     0.95166,      215.75,      464.25,      0.9585,      363.75,         491,     0.99219,      257.75,       491.5,     0.99268,
              361.5,         680,      0.9834,      250.88,         679,     0.98438,         361,       861.5,     0.91064,         247,         863,     0.91504]])

Starting at index 7 (output[7:] for a single detection row), the values are the points of each keypoint, in the order shown in the image:

enter image description here

array([      304.5,      118.75,     0.99902,      320.75,      102.25,     0.99756,      287.75,      103.25,     0.99658,         345,         112,     0.96338,      268.25,      115.25,     0.69531,         394,      226.25,     0.98145,      228.25,      230.12,     0.98389,       428.5,       358.5,     0.95898,
            192.88,      364.75,     0.96533,         407,      464.25,     0.95166,      215.75,      464.25,      0.9585,      363.75,         491,     0.99219,      257.75,       491.5,     0.99268,       361.5,         680,      0.9834,      250.88,         679,     0.98438,         361,       861.5,     0.91064,
               247,         863,     0.91504])

But I am not sure what the remaining values are:

0, 0, 312.31, 486, 291.75, 916.5, 0.94974,


Solution

  • I assume you have passed your output through the output_to_keypoint function in utils.plots.

    Based on the comment left by the authors of that function, the first 7 values should be (in order):

    • batch_id
    • class_id
    • x coordinate of the center of the bounding box
    • y coordinate of the center of the bounding box
    • w - width of the bounding box
    • h - height of the bounding box
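    • conf - confidence in the bounding box

    The remaining 51 values are the 17 keypoints, each stored as an (x, y, confidence) triplet. That accounts for the length of 58 (7 + 17 × 3), and for the 18th confidence value, which belongs to the bounding box rather than to a keypoint.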
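
To make the layout concrete, here is a minimal sketch that splits one detection row into its box fields and a (17, 3) keypoint array. The row is the first output copied from the question. The helper name split_detection is hypothetical (it is not part of the YOLOv7 repo), and the keypoint names assume the standard COCO ordering, which I have not verified against the repo's plotting code.

```python
import numpy as np

NUM_KEYPOINTS = 17

# Assumption: the standard COCO keypoint order; check it against your own plots.
COCO_KEYPOINT_NAMES = [
    "nose", "left_eye", "right_eye", "left_ear", "right_ear",
    "left_shoulder", "right_shoulder", "left_elbow", "right_elbow",
    "left_wrist", "right_wrist", "left_hip", "right_hip",
    "left_knee", "right_knee", "left_ankle", "right_ankle",
]

# Detection row copied from the question (58 values).
row = np.array([
    0, 0, 430.44, 476.19, 243.75, 840, 0.94348,
    402.75, 128.5, 0.99902, 417.5, 114.25, 0.99658, 385.5, 115, 0.99609,
    437.75, 125.5, 0.89209, 366.75, 128, 0.66406, 471, 229.62, 0.97754,
    346.75, 224.88, 0.97705, 526, 322.75, 0.95654, 388.5, 340.75, 0.95898,
    424.5, 314.75, 0.94873, 483.5, 335.5, 0.9502, 465.5, 457.75, 0.99219,
    381.5, 456.25, 0.99219, 451.5, 649, 0.98584, 379.25, 649.5, 0.98633,
    446.5, 818, 0.92285, 366, 829.5, 0.9248,
])


def split_detection(row):
    """Hypothetical helper: split one 58-value row into its named parts."""
    row = np.asarray(row, dtype=float)
    if row.shape != (7 + 3 * NUM_KEYPOINTS,):
        raise ValueError(f"expected 58 values, got shape {row.shape}")
    batch_id, class_id = int(row[0]), int(row[1])
    box_xywh = row[2:6]   # center x, center y, width, height
    box_conf = row[6]     # confidence of the bounding box
    keypoints = row[7:].reshape(NUM_KEYPOINTS, 3)  # one (x, y, conf) per row
    return batch_id, class_id, box_xywh, box_conf, keypoints


batch_id, class_id, box_xywh, box_conf, keypoints = split_detection(row)
for name, (x, y, conf) in zip(COCO_KEYPOINT_NAMES, keypoints):
    print(f"{name:15s} x={x:7.2f} y={y:7.2f} conf={conf:.3f}")
```

If the output holds several detections as a 2-D array, the same function can be applied to each row in turn.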