Tags: opencv, unity-game-engine, calibration

OpenCV for Unity : 4-point calibration/reprojection


This is my first post on Stack Overflow, so I apologize in advance for any clumsiness. Please let me know how I can improve my question.

► What I want to achieve (in the long term):

I am trying to control my Unity3D presentation with a laser pointer, using OpenCV for Unity.

I believe one picture is worth more than a thousand words, so this should explain it best:

[Image not included]

► What is the problem:

I am trying to do a simple 4-point calibration (projection) from the camera view (some kind of trapezium) into plane space.

I thought it would be something very basic and easy, but I have no experience with OpenCV and I can't make it work.

► Sample:

I made a much simpler example, without any laser detection or anything else. It contains only a 4-point trapezium that I am trying to reproject into plane space.

Link to the whole sample project: https://1drv.ms/u/s!AiDsGecSyzmuujXGQUapcYrIvP7b

The core script from my example:

using OpenCVForUnity;
using System.Collections;
using System.Collections.Generic;
using UnityEngine;
using UnityEngine.UI;
using System;

public class TestCalib : MonoBehaviour
{
    public RawImage displayDummy;
    public RectTransform[] handlers;
    public RectTransform dummyCross;
    public RectTransform dummyResult;

    public Vector2 webcamSize = new Vector2(640, 480);
    public Vector2 objectSize = new Vector2(1024, 768);

    private Texture2D texture;

    Mat cameraMatrix;
    MatOfDouble distCoeffs;

    MatOfPoint3f objectPoints;
    MatOfPoint2f imagePoints;

    Mat rvec;
    Mat tvec;
    Mat rotationMatrix;
    Mat imgMat;


    void Start()
    {
        texture = new Texture2D((int)webcamSize.x, (int)webcamSize.y, TextureFormat.RGB24, false);
        if (displayDummy) displayDummy.texture = texture;
        imgMat = new Mat(texture.height, texture.width, CvType.CV_8UC3);
    }


    void Update()
    {
        imgMat = new Mat(texture.height, texture.width, CvType.CV_8UC3);
        Test();
        DrawImagePoints();
        Utils.matToTexture2D(imgMat, texture);
    }

    void DrawImagePoints()
    {
        Point[] pointsArray = imagePoints.toArray();
        for (int i = 0; i < pointsArray.Length; i++)
        {
            Point p0 = pointsArray[i];
            int j = (i < pointsArray.Length - 1) ? i + 1 : 0;
            Point p1 = pointsArray[j];

            Imgproc.circle(imgMat, p0, 5, new Scalar(0, 255, 0, 150), 1);
            Imgproc.line(imgMat, p0, p1, new Scalar(255, 255, 0, 150), 1);
        }
    }


    private void DrawResults(MatOfPoint2f resultPoints)
    {
        Point[] pointsArray = resultPoints.toArray();
        for (int i = 0; i < pointsArray.Length; i++)
        {
            Point p = pointsArray[i];
            Imgproc.circle(imgMat, p, 5, new Scalar(255, 155, 0, 150), 1);
        }
    }

    public void Test()
    {
        float w2 = objectSize.x / 2F;
        float h2 = objectSize.y / 2F;

        /*
        objectPoints = new MatOfPoint3f(
            new Point3(-w2, -h2, 0),
            new Point3(w2, -h2, 0),
            new Point3(-w2, h2, 0),
            new Point3(w2, h2, 0)
        );
        */

        objectPoints = new MatOfPoint3f(
            new Point3(0, 0, 0),
            new Point3(objectSize.x, 0, 0),
            new Point3(objectSize.x, objectSize.y, 0),
            new Point3(0, objectSize.y, 0)
        );

        imagePoints = GetImagePointsFromHandlers();

        rvec = new Mat(1, 3, CvType.CV_64FC1);
        tvec = new Mat(1, 3, CvType.CV_64FC1);
        rotationMatrix = new Mat(3, 3, CvType.CV_64FC1);


        double fx = webcamSize.x / objectSize.x;
        double fy = webcamSize.y / objectSize.y;
        double cx = 0; // webcamSize.x / 2.0f;
        double cy = 0; // webcamSize.y / 2.0f;
        cameraMatrix = new Mat(3, 3, CvType.CV_64FC1);
        cameraMatrix.put(0, 0, fx);
        cameraMatrix.put(0, 1, 0);
        cameraMatrix.put(0, 2, cx);
        cameraMatrix.put(1, 0, 0);
        cameraMatrix.put(1, 1, fy);
        cameraMatrix.put(1, 2, cy);
        cameraMatrix.put(2, 0, 0);
        cameraMatrix.put(2, 1, 0);
        cameraMatrix.put(2, 2, 1.0f);

        distCoeffs = new MatOfDouble(0, 0, 0, 0);
        Calib3d.solvePnP(objectPoints, imagePoints, cameraMatrix, distCoeffs, rvec, tvec);

        Mat uv = new Mat(3, 1, CvType.CV_64FC1);
        uv.put(0, 0, dummyCross.anchoredPosition.x);
        uv.put(1, 0, dummyCross.anchoredPosition.y);
        uv.put(2, 0, 0);

        Calib3d.Rodrigues(rvec, rotationMatrix);
        Mat P = rotationMatrix.inv() * (cameraMatrix.inv() * uv - tvec);

        Vector2 v = new Vector2((float)P.get(0, 0)[0], (float)P.get(1, 0)[0]);
        dummyResult.anchoredPosition = v;
    }

    private MatOfPoint2f GetImagePointsFromHandlers()
    {
        MatOfPoint2f m = new MatOfPoint2f();
        List<Point> points = new List<Point>();
        foreach (RectTransform handler in handlers)
        {
            Point p = new Point(handler.anchoredPosition.x, handler.anchoredPosition.y);
            points.Add(p);
        }

        m.fromList(points);
        return m;
    }
}

Thanks in advance for any help.


Solution

  • This question is not specific to OpenCV. It is mostly a math problem, and one more often seen in the realm of computer graphics. What you are looking for is called a projective transformation (also known as a homography).

    A projective transformation takes a set of coordinates and projects them from one plane onto another. In your case you want to project a 2D point in the camera view to a 2D point on a flat plane.

    So we want a projection transform for 2D-Space. To perform a projection transform we need to find the projection matrix for the transformation we want to apply. In this case we need a matrix that expresses the projective deformation of the camera in relation to a flat plane.

    To work with projections we first need to convert our points into homogeneous coordinates. To do so we simply add a new component with value 1 to our vectors, so (x,y) becomes (x,y,1). We do that with all five of our available points: the four corners and the detected laser point.

    Now we start with the actual math. First some definitions: The camera's point of view and respective coordinates shall be the camera space, coordinates in relation to a flat plane are in flat space. Let c₁ to c₄ be the corner points of the plane in relation to camera space as homogeneous vectors. Let p be the point that we have found in camera space and p' the point we want to find in flat space, both as homogeneous vectors again.

    Mathematically speaking, we are looking for a Matrix C that will allow us to calculate p' by giving it p.

    p' = C * p
    

    Now we obviously need to find C. To find a projection matrix for two-dimensional space, we need four points, no three of which are collinear (how convenient...). I will assume that c₁ will go to (0,0), c₂ will go to (0,1), c₃ to (1,0) and c₄ to (1,1). You need to solve two linear systems using e.g. Gaussian elimination or an LU decomposition (also called LR decomposition). OpenCV should contain functions to do those tasks for you, but be aware of matrix conditioning and its impact on a usable solution.

    Now back to the matrices. You need to calculate two basis change matrices, as they are called. They are used to change the frame of reference of your coordinates (exactly what we want to do). The first matrix relates the camera-space corners to the three-dimensional basis vectors, and the second one relates the corners of our 2D plane to the same basis vectors.

    For the camera-space one, you need to calculate λ, μ and r in the following equation:

    ⌈ c₁.x   c₂.x   c₃.x ⌉     ⌈ λ ⌉    ⌈ c₄.x ⌉
      c₁.y   c₂.y   c₃.y   *    μ   =   c₄.y
    ⌊   1      1      1  ⌋     ⌊ r ⌋    ⌊  1   ⌋
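
    The system above is small enough to solve directly. Here is a minimal sketch using Cramer's rule in plain C#, with no OpenCV involved. The class and method names (ProjectiveBasis, SolveCoefficients) are made up for illustration and are not part of any library; points are passed as (x, y) arrays.

    ```csharp
    using System;

    // Hypothetical helper for illustration only; not part of OpenCV for Unity.
    public static class ProjectiveBasis
    {
        // Determinant of a 3x3 matrix stored as m[row, column].
        public static double Det3(double[,] m)
        {
            return m[0, 0] * (m[1, 1] * m[2, 2] - m[1, 2] * m[2, 1])
                 - m[0, 1] * (m[1, 0] * m[2, 2] - m[1, 2] * m[2, 0])
                 + m[0, 2] * (m[1, 0] * m[2, 1] - m[1, 1] * m[2, 0]);
        }

        // Solves [c1 c2 c3] * (lambda, mu, r)^T = c4 with Cramer's rule.
        // Each point is given as (x, y); the homogeneous 1 is appended here.
        // Returns { lambda, mu, r }.
        public static double[] SolveCoefficients(double[] c1, double[] c2, double[] c3, double[] c4)
        {
            double[,] m =
            {
                { c1[0], c2[0], c3[0] },
                { c1[1], c2[1], c3[1] },
                { 1, 1, 1 }
            };
            double det = Det3(m);
            if (Math.Abs(det) < 1e-12)
                throw new ArgumentException("c1, c2 and c3 are collinear; there is no unique solution.");

            double[] rhs = { c4[0], c4[1], 1 };
            double[] result = new double[3];
            for (int col = 0; col < 3; col++)
            {
                // Replace one column by the right-hand side and take the determinant ratio.
                double[,] replaced = (double[,])m.Clone();
                for (int row = 0; row < 3; row++) replaced[row, col] = rhs[row];
                result[col] = Det3(replaced) / det;
            }
            return result;
        }
    }
    ```

    Cramer's rule is chosen here only because it is short for a 3x3 system. For badly conditioned corner layouts, a pivoting solver is the safer choice.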
    

    This gives you the first matrix, A:

        ⌈ λ*c₁.x   μ*c₂.x   r*c₃.x ⌉
    A =   λ*c₁.y   μ*c₂.y   r*c₃.y 
        ⌊   λ         μ        r   ⌋
    

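    A now maps the basis coordinates (1,0,0), (0,1,0), (0,0,1) and (1,1,1) to the points c₁ to c₄, each up to a scale factor, which does not matter in homogeneous coordinates. The three columns of A are λc₁, μc₂ and rc₃, and by the equation above their sum is c₄. We do the same thing for our plane now. First solve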

    ⌈ 0 0 1 ⌉     ⌈ λ ⌉    ⌈ 1 ⌉
      0 1 0   *    μ   =   1
    ⌊ 1 1 1 ⌋     ⌊ r ⌋    ⌊ 1 ⌋
    

    and get B

        ⌈ 0 0 r ⌉
    B =   0 μ 0 
        ⌊ λ μ r ⌋
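
    For the target corners chosen above, (0,0), (0,1), (1,0) and (1,1), this second system can be solved by reading off the rows one at a time:

    ```latex
    \begin{aligned}
    r &= 1, \qquad \mu = 1, \qquad \lambda + \mu + r = 1 \;\Rightarrow\; \lambda = -1, \\
    B &= \begin{pmatrix} 0 & 0 & 1 \\ 0 & 1 & 0 \\ -1 & 1 & 1 \end{pmatrix}
    \end{aligned}
    ```

    As a check, B sends (1,1,1) to (1,1,1), which is the corner (1,1), and sends (1,0,0) to (0,0,-1), which dehomogenizes to the corner (0,0).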
    

    A and B now map from those three-dimensional basis vectors into their respective spaces. But that is not quite what we want. We want camera space -> basis -> flat space, so only matrix B maps in the right direction. That is easily fixed by inverting A, which gives us the matrix C = B * A⁻¹ (watch the order of B and A⁻¹; they are not interchangeable). This leaves us with a formula to calculate p' from p.

    p' = C * p
    p' = B * A⁻¹ * p
    

    Read it from right to left: take p, transform it from camera space into basis vectors with A⁻¹, then transform those into flat space with B.

    Remember that p' still has three components, so we need to dehomogenize it before we can use it. This yields

    x' = p'.x / p'.z
    y' = p'.y / p'.z
    

    and voilà, we have successfully transformed a laser point from the camera view onto a flat piece of paper. Totally not overly complicated or anything...
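
    Putting the steps together, here is a minimal sketch of the whole pipeline in plain C#: build A from the camera-space corners, build B from the flat-space corners, form C = B * A⁻¹, then map and dehomogenize a point. PlaneMapper and all of its methods are hypothetical names chosen for this example, and it uses plain double arrays rather than OpenCV's Mat type.

    ```csharp
    using System;

    // Hypothetical helper for illustration only; not part of OpenCV for Unity.
    public static class PlaneMapper
    {
        static double Det3(double[,] m)
        {
            return m[0, 0] * (m[1, 1] * m[2, 2] - m[1, 2] * m[2, 1])
                 - m[0, 1] * (m[1, 0] * m[2, 2] - m[1, 2] * m[2, 0])
                 + m[0, 2] * (m[1, 0] * m[2, 1] - m[1, 1] * m[2, 0]);
        }

        // Builds A (or B): the columns are the first three points, each scaled by
        // its coefficient (lambda, mu, r), so that (1,1,1) is sent to the fourth point.
        static double[,] BasisToPoints(double[][] p)
        {
            double[,] m = { { p[0][0], p[1][0], p[2][0] }, { p[0][1], p[1][1], p[2][1] }, { 1, 1, 1 } };
            double det = Det3(m);
            if (Math.Abs(det) < 1e-12) throw new ArgumentException("The first three points are collinear.");
            double[] rhs = { p[3][0], p[3][1], 1 };
            double[,] result = new double[3, 3];
            for (int col = 0; col < 3; col++)
            {
                // Cramer's rule: replace one column by the right-hand side.
                double[,] replaced = (double[,])m.Clone();
                for (int row = 0; row < 3; row++) replaced[row, col] = rhs[row];
                double coefficient = Det3(replaced) / det;
                for (int row = 0; row < 3; row++) result[row, col] = coefficient * m[row, col];
            }
            return result;
        }

        // Inverse of a 3x3 matrix via the adjugate (cyclic cofactor formula).
        static double[,] Inverse(double[,] m)
        {
            double det = Det3(m);
            if (Math.Abs(det) < 1e-12) throw new ArgumentException("Matrix is singular.");
            double[,] inv = new double[3, 3];
            for (int r = 0; r < 3; r++)
                for (int c = 0; c < 3; c++)
                {
                    int r1 = (c + 1) % 3, r2 = (c + 2) % 3;
                    int c1 = (r + 1) % 3, c2 = (r + 2) % 3;
                    inv[r, c] = (m[r1, c1] * m[r2, c2] - m[r1, c2] * m[r2, c1]) / det;
                }
            return inv;
        }

        static double[,] Multiply(double[,] a, double[,] b)
        {
            double[,] result = new double[3, 3];
            for (int i = 0; i < 3; i++)
                for (int j = 0; j < 3; j++)
                    for (int k = 0; k < 3; k++)
                        result[i, j] += a[i, k] * b[k, j];
            return result;
        }

        // C = B * A^-1: camera space -> basis -> flat space.
        public static double[,] BuildTransform(double[][] cameraCorners, double[][] planeCorners)
        {
            return Multiply(BasisToPoints(planeCorners), Inverse(BasisToPoints(cameraCorners)));
        }

        // Applies C to (x, y, 1) and dehomogenizes the result.
        // No guard against pz == 0 (points on the vanishing line) in this sketch.
        public static double[] Map(double[,] c, double x, double y)
        {
            double px = c[0, 0] * x + c[0, 1] * y + c[0, 2];
            double py = c[1, 0] * x + c[1, 1] * y + c[1, 2];
            double pz = c[2, 0] * x + c[2, 1] * y + c[2, 2];
            return new[] { px / pz, py / pz };
        }
    }
    ```

    C only needs to be rebuilt when the corners change, so in a setup like the one in the question it could be computed once after calibration and reused for every detected laser point. With the flat-space corners set to the unit square as above, multiplying x' and y' by the target resolution gives pixel coordinates.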