Tags: android, android-camerax, firebase-mlkit, google-mlkit, pose-detection

ML Kit - pose graphic overlay Compose implementation not working as intended


I'm trying to implement a Jetpack Compose version of ML Kit pose detection based on the ML Kit Vision Quickstart Sample App.

My issue is that the pose graphic overlay lines are flipped when the camera is set to LENS_FACING_FRONT.

When the camera is set to LENS_FACING_BACK, the graphic overlay works fine.

[Screenshot: overlay aligned with the body when using the back camera]

But when the camera is set to LENS_FACING_FRONT, the graphic overlay is horizontally flipped: [Screenshot: overlay mirrored relative to the body when using the front camera]

The ML Kit Vision Quickstart Sample App handles the reflection using transformationMatrix: https://github.com/googlesamples/mlkit/blob/master/android/vision-quickstart/app/src/main/java/com/google/mlkit/vision/demo/GraphicOverlay.java#L293:

transformationMatrix.reset();
transformationMatrix.setScale(scaleFactor, scaleFactor);
transformationMatrix.postTranslate(-postScaleWidthOffset, -postScaleHeightOffset);

if (isImageFlipped) {
    transformationMatrix.postScale(-1f, 1f, getWidth() / 2f, getHeight() / 2f);
}

How can I do something similar in Jetpack Compose?

This is my translateX function, which maps an x-coordinate from the image's coordinate system to the view's coordinate system. I think I need to modify it to handle the flipped case, but I don't know how:

fun translateX(
    x: Float,
    scaleFactor: Float,
    postScaleWidthOffset: Float
): Float {
    // Adjusts the supplied value from the image scale to the view scale
    val scale = x * scaleFactor
    return scale - postScaleWidthOffset
}
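
For reference, the quickstart's GraphicOverlay.Graphic handles the flipped case by mirroring the x-coordinate around the view's width. A Kotlin sketch of that idea (the viewWidth and isImageFlipped parameters are my additions, not part of my current code):

fun translateX(
    x: Float,
    scaleFactor: Float,
    postScaleWidthOffset: Float,
    viewWidth: Float,        // width of the overlay view in pixels
    isImageFlipped: Boolean  // true when using LENS_FACING_FRONT
): Float {
    // Adjusts the supplied value from the image scale to the view scale
    val translated = x * scaleFactor - postScaleWidthOffset
    // Mirror around the view when the front-camera image is flipped
    return if (isImageFlipped) viewWidth - translated else translated
}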

This is the Compose project I made, if you want to take a look: ComposeMLKit


Solution

  • I asked this question because:

    1. I was confused about why the matrix transformation is needed
    2. I was confused about why the translation is needed when the overlay is drawn on the canvas

    Here's the solution:

    #1. Matrix transformation

    The transformation matrix is needed to flip the camera image for the front camera. If you don't implement this, the preview won't mirror you: in selfie mode, if you raise your right hand, the preview will show your hand on the opposite side.

    The easiest way I found to solve this is to use Google's MlKitAnalyzer (a sketch of the wiring is included below). As the documentation states, it handles the camera preview matrix transformation for you:

    This class handles the coordinate transformation between ML Kit output and the target coordinate system. Using the targetCoordinateSystem set in the constructor, it calculates the Matrix with the value provided by CameraX via updateTransform and forwards it to the ML Kit Detector. The coordinates returned by MLKit will be in the specified coordinate system.

    You can implement your own solution, similar to GraphicOverlay.java, if you don't want to use Google's MlKitAnalyzer.
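
    Here is a minimal sketch of the MlKitAnalyzer wiring, assuming CameraX's LifecycleCameraController / PreviewView and the androidx.camera:camera-mlkit-vision artifact (the buildPoseController name and onPose callback are mine):

    import android.content.Context
    import androidx.camera.core.CameraSelector
    import androidx.camera.mlkit.vision.MlKitAnalyzer
    import androidx.camera.view.CameraController
    import androidx.camera.view.LifecycleCameraController
    import androidx.core.content.ContextCompat
    import com.google.mlkit.vision.pose.Pose
    import com.google.mlkit.vision.pose.PoseDetection
    import com.google.mlkit.vision.pose.defaults.PoseDetectorOptions

    fun buildPoseController(context: Context, onPose: (Pose?) -> Unit): LifecycleCameraController {
        val poseDetector = PoseDetection.getClient(
            PoseDetectorOptions.Builder()
                .setDetectorMode(PoseDetectorOptions.STREAM_MODE)
                .build()
        )
        return LifecycleCameraController(context).apply {
            cameraSelector = CameraSelector.DEFAULT_FRONT_CAMERA
            setImageAnalysisAnalyzer(
                ContextCompat.getMainExecutor(context),
                MlKitAnalyzer(
                    listOf(poseDetector),
                    // CameraX maps ML Kit output into the PreviewView coordinate
                    // system, which already accounts for front-camera mirroring.
                    CameraController.COORDINATE_SYSTEM_VIEW_REFERENCED,
                    ContextCompat.getMainExecutor(context)
                ) { result ->
                    // Landmarks arrive here already in view coordinates.
                    onPose(result.getValue(poseDetector))
                }
            )
        }
    }

    You still need to bind the controller to a lifecycle (bindToLifecycle) and attach it to the PreviewView (previewView.controller = ...) inside your AndroidView factory.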

    #2. Canvas translation

    This is needed to map the camera preview image's coordinate system to the view's coordinate system.

    You do this because you probably use UI modifiers to make the PreviewView fill the phone's screen (Modifier.fillMaxSize() in Jetpack Compose), and you want the graphic overlay to line up with the resulting width and height.

    You implement this similarly to the translateX function in GraphicOverlay.java; a sketch of how it can look in a Compose Canvas follows below.
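
    A rough sketch of that translation in a Compose Canvas, assuming the landmarks are still in image coordinates (i.e. you are not using the view-referenced coordinate system from #1) and PreviewView's default FILL_CENTER scale type; the PoseOverlay name and parameters are mine:

    import androidx.compose.foundation.Canvas
    import androidx.compose.foundation.layout.fillMaxSize
    import androidx.compose.runtime.Composable
    import androidx.compose.ui.Modifier
    import androidx.compose.ui.geometry.Offset
    import androidx.compose.ui.graphics.Color
    import com.google.mlkit.vision.pose.Pose

    @Composable
    fun PoseOverlay(
        pose: Pose?,
        imageWidth: Int,    // width of the analyzed image
        imageHeight: Int,   // height of the analyzed image
        modifier: Modifier = Modifier
    ) {
        Canvas(modifier = modifier.fillMaxSize()) {
            val landmarks = pose?.allPoseLandmarks ?: return@Canvas

            // FILL_CENTER: scale the image so it covers the view; the overflow
            // is cropped equally on both sides.
            val scaleFactor = maxOf(size.width / imageWidth, size.height / imageHeight)
            val postScaleWidthOffset = (imageWidth * scaleFactor - size.width) / 2f
            val postScaleHeightOffset = (imageHeight * scaleFactor - size.height) / 2f

            // Same idea as translateX/translateY in GraphicOverlay.java
            fun toViewCoordinates(x: Float, y: Float) = Offset(
                x * scaleFactor - postScaleWidthOffset,
                y * scaleFactor - postScaleHeightOffset
            )

            landmarks.forEach { landmark ->
                drawCircle(
                    color = Color.White,
                    radius = 8f,
                    center = toViewCoordinates(landmark.position.x, landmark.position.y)
                )
            }
        }
    }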