
Can we get the orientation of the hand from mediapipe's palm detector?


Is there a method to get the orientation of the hand from mediapipe's palm detector? Is something like this possible?

The model outputs the 3D coordinates of 21 landmarks per hand, so there must be a way to do this using the third (z) axis, but I have no idea how to do it.


Solution

    1. choose three landmarks that are not collinear and lie roughly in one plane of the hand, ideally the palm (I chose 0, 5 and 17, i.e. the wrist, the index finger MCP and the pinky MCP) - this way you'll get the orientation of the palm

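    2. convert them to a NumPy array of shape (3, 3), one row of x, y, z per landmark (the line below assumes each entry of world_landmarks is already an [x, y, z] triple):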

      points = np.asarray([world_landmarks[0], world_landmarks[5], world_landmarks[17]])

    3. define two vectors based on those three points (e.g. from points[0] to points[2] and from points[2] to points[1]) - both vectors then lie in the plane of the hand

    4. to get the orientation of the hand you want a vector that is perpendicular to both of those, i.e. the normal of that plane. This vector points in the direction in which the hand (or rather the palm) is facing (note: for the left hand it points in the opposite direction to the right hand, because the hands are mirrored). To get this vector, calculate the cross product (also called the vector product) of the two vectors

    This line of code does steps 3 & 4:

    normal_vector = np.cross(points[2] - points[0], points[1] - points[2])
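
As a quick sanity check, here is a minimal sketch that applies the same cross product to three made-up points in the z = 0 plane (hypothetical values, not real MediaPipe output) and confirms that the result is perpendicular to both in-plane vectors. The names `v1`, `v2`, `dot_v1` and `dot_v2` are introduced only for illustration.

```python
import numpy as np

# Hypothetical points in the z = 0 plane, standing in for landmarks 0, 5 and 17.
points = np.asarray([
    [0.0, 0.0, 0.0],   # stands in for landmark 0 (wrist)
    [1.0, 2.0, 0.0],   # stands in for landmark 5 (index finger MCP)
    [-1.0, 2.0, 0.0],  # stands in for landmark 17 (pinky MCP)
])

v1 = points[2] - points[0]  # from point 0 to point 2
v2 = points[1] - points[2]  # from point 2 to point 1
normal_vector = np.cross(v1, v2)

# A vector perpendicular to both in-plane vectors has a zero dot product with each.
dot_v1 = float(np.dot(normal_vector, v1))
dot_v2 = float(np.dot(normal_vector, v2))
```

Because all three points lie in the z = 0 plane, the resulting vector has only a z component, and both dot products come out as zero.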
    

    Finally, you can normalise this vector so that it always has length 1, like so:

    normal_vector /= np.linalg.norm(normal_vector)
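
Putting the steps above together, one possible helper might look like the sketch below. It assumes each landmark is an object with `x`, `y` and `z` attributes; the function name `palm_normal`, its `indices` parameter, and the `fake` landmark list built from `SimpleNamespace` are hypothetical and exist only to keep the example self-contained.

```python
import numpy as np
from types import SimpleNamespace


def palm_normal(landmarks, indices=(0, 5, 17)):
    """Return the unit normal of the plane through three landmarks.

    Assumption: `landmarks` is a sequence of objects exposing x, y, z attributes.
    """
    # Step 2: build a (3, 3) array, one row of x, y, z per chosen landmark.
    points = np.asarray(
        [[landmarks[i].x, landmarks[i].y, landmarks[i].z] for i in indices],
        dtype=float,
    )
    # Steps 3 and 4: two in-plane vectors and their cross product.
    normal = np.cross(points[2] - points[0], points[1] - points[2])
    length = np.linalg.norm(normal)
    if length == 0.0:
        # Collinear (or identical) points do not define a plane.
        raise ValueError("landmarks are collinear; the palm plane is undefined")
    # Final step: normalise to length 1.
    return normal / length


# Hypothetical stand-in for 21 landmarks; only indices 0, 5 and 17 matter here.
fake = [SimpleNamespace(x=0.0, y=0.0, z=0.0) for _ in range(21)]
fake[5] = SimpleNamespace(x=1.0, y=2.0, z=0.0)
fake[17] = SimpleNamespace(x=-1.0, y=2.0, z=0.0)

unit_normal = palm_normal(fake)
```

Since the left and right hands are mirrored, a caller who wants a consistent "out of the palm" direction for both hands would need to negate the result for one of them.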