Search code examples
pythonnumpyopencvmatrixcamera

How to solve a matrix equation for an unknown matrix?


Camera Calibration and 3D reconstruction using opencv

I am trying solving the equation , s m' = A[R|t]M'

i.e

m = K . T . M where m, K, M and last column of T [ R | t ] are known.

I want to obtain the values for each element of the 3*3 rotation matrix. I have.

This question was also answered here

But I could not understand how to get values for 3*3 rotation matrix after making the new set of equations each time, when we take new values for m and M.

m contains the coordinates of the projection point in pixels, I have 16 different points on the image for the pattern captured by the camera and have 16 set of values for each u and v.

m=np.array([u,v,1])

K is my intrinsic matrix/camera matrix/matrix of intrinsic parameters for the camera, I have the value for fx, fy (focal lengths) and cx, cy (principal point) as camera intrinsic matrix

K=np.matrix([ [fx, 0, cx, 0], 
              [ 0, fy, cy, 0], 
              [ 0, 0, 1, 0]])

T is the transformation to pass to the "world" coordinate system to the camera coordinate system ( extrinsic matrix,[ R | t ] ), I also have the values for Tx, Ty and Tz.

T= np.matrix([[x00, x01, x02, Tx], 
              [x10, x11, x12, Ty], 
              [x20, x21, x22, Tz], 
              [0 , 0 , 0 , 1 ]])

M is the homogeneous coordinate of a point in the Cartesian coordinate system "world" i.e the coordinates of a 3D point in the world coordinate space. I have the 16 points from the pattern therefore i have 16 different values for each X, Y, Z.

M=np.array([X,Y,Z,1])

My goal is to get the values for elements x00, x01, x02, x10, x11, x12, x20, x21, x22 for matrix T. could someone please help??

For more clarification:

Suppose for m matrix (the coordinates of the projection point in pixels) the value of u and v are:

u = [ 337, 337, 316, 317, 302, 302, 291, 292, 338, ...]

and

v =[ 487, 572, 477, 547, 470, 528, 465, 516, 598, ...]

i.e the coordinates of the first projection point in pixels are 337 (row number) and 487 (column number)

therefore,

for first set of equation, matrix, m will have values,

import sympy as sy            
import numpy as np


# m = sy.Matrix([u, v, 1]
m = sy.Matrix([337, 487, 1])

,

for second set of equation, matrix, m will have values,

# m = sy.Matrix([u, v, 1]
m = sy.Matrix([337, 572, 1])

and soon...

for K matrix (matrix of intrinsic parameters) the values:

K = sy.Matrix([[711.629,  0, 496.220, 0],
               [0,  712.682, 350.535, 0],
               [0,   0,  0, 1]])

for M matrix ( the coordinates of a 3D points in the world coordinate space) the value for X,Y and Z are:

X = [4.25, 4.25, 5.32, 5.32, 6.27, 6.27, 7.28, 7.28, 4.20, ...] 
Y = 0
Z =  [0.63, 1.63, 0.63, 1.63, 0.59, 1.59, 0.60, 1.92, 2.92, ...]  

for first set of equation, matrix M will be

# M=np.array([X,Y,Z,1])
M = sy.Matrix([0.63, 0, 4.25, 1])

,

for second set of equation, matrix, M will have values,

# M=np.array([X,Y,Z,1])
M = sy.Matrix([1.63, 0, 4.25, 1])

and soon...

for T matrix ( extrinsic matrix, [ R | t ]) we have value for Tx, Ty, Tz as 0, -1.35, 0 .Therefore, T matrix will be:

T = sy.Matrix([[x11, x12, x13, 0],
               [x21, x22, x23, -1.32],
               [x31, x32, x33, 0],
               [0,     0,   0,  1]])

I need to make nine set of these matrix equations: m = K * T * M using different value for m and M so that I could calculate the values for 9 unknowns in T matrix out of these set of equations.


Solution

  • Essentially, you have the matrix equation (using the notation of the OpenCV documentation):

    A @ (R @ w + t) == m
    

    Where A.shape == (3, 3), R.shape == (3, 3), w.shape == (3, n), t.shape == (3, 1), m.shape == (3, n), representing n points in world coordinates w and image coordinates m.

    This equation can be rearranged as

    w.T @ R.T == (inv(A) @ m - t).T
    

    where inv(A) is the inverse of A. The shape of the left- and right-hand sides is (n, 3). This has the format of a matrix equation, with 9 unknowns (in R.T) and n equations. In this form, you can feed into np.linalg.lstsq for a least-squares solution - assuming that you have n >= 3 with sufficiently independent points.

    Here is a demonstration with random numbers:

    import numpy as np
    
    # Setup test case
    np.random.seed(1)
    R = np.random.randint(-9, 9, size=(3, 3)).astype(np.float64)
    t = np.array([1, 1.5, 2]).reshape(3, 1) # column vector
    Rt = np.hstack([R, t]) # shape (3, 4)
    A = np.diag([0.5, 0.5, 1.0]) # shape (3, 3)
    
    n = 20 # number of points
    # M: shape (4, n)
    M = np.vstack([np.random.uniform(size=(3, n)), np.ones((1, n))])
    m = A @ Rt @ M # m.shape == (3, n)
    
    # Now try to reconstruct R, given A, M, t, and m.
    
    w = M[:3, :] # world XYZ coordinates, shape (3, n)
    
    # Matrix equation: A @ (R @ w + t) == m
    # Equivalent to w.T @ R.T == (inv(A) @ m - t).T
    RTfit, _, _, _ = np.linalg.lstsq(w.T, (np.linalg.inv(A) @ m - t).T, rcond=None)
    
    Rfit = np.around(RTfit.T, 6)
    print(f'Original R:\n{R}\nReconstructed R:\n{Rfit}')
    

    Output:

    Original R:
    [[-4.  2.  3.]
     [-1.  0.  2.]
     [-4.  6. -9.]]
    Reconstructed R:
    [[-4.  2.  3.]
     [-1. -0.  2.]
     [-4.  6. -9.]]
    

    Note that you could also use an exact solve using three data points (n=3):

    Rsolve = np.linalg.solve(w.T[:3], (np.linalg.inv(A) @ m[:, :3] - t).T).T
    

    but in this case, you need to pick your three points carefully or it won't work.

    Here is an attempt with your data:

    t = np.array([[0, -1.32, 0]]).T
    w = np.array([
        [4.25, 4.25, 5.32, 5.32, 6.27, 6.27, 7.28, 7.28, 4.20],
        np.zeros(9),
        [0.63, 1.63, 0.63, 1.63, 0.59, 1.59, 0.60, 1.92, 2.92]
        ])
    m = np.array([
        [337, 337, 316, 317, 302, 302, 291, 292, 338],
        [487, 572, 477, 547, 470, 528, 465, 516, 598],
        np.ones(9)
        ])
    A = np.array([
        [711.629,  0, 496.220],
        [712.682, 350.535, 0],
        [0, 0, 1]
        ])
    RTfit, _, _, _ = np.linalg.lstsq(w.T, (np.linalg.inv(A) @ m - t).T, rcond=None)
    Rfit = np.around(RTfit.T, 6)
    print(Rfit)
    

    Output:

    array([[-0.040938,  0.      , -0.016044],
           [ 0.448038,  0.      ,  0.52933 ],
           [ 0.14251 ,  0.      ,  0.127464]])
    

    It cannot meaningfully solve the middle column of the R matrix because all Y values of the input were zero. (If you try this with np.linalg.solve, you'll get a singular-matrix error.)

    The fit is not particularly good, as evidenced by plotting m and A @ (R @ w + t):

    xy coordinates of m: input versus fit

    The mismatch implies that there is no R matrix possible that is consistent with the data. In your comment, you ask whether the R matrix is the most optimal solution. It is the optimal solution in matching the LHS and the RHS of the equation (w.T @ Rfit.T versus (inv(A) @ m - t).T).

    Given the large mismatch in the above plot, it doesn't make much sense to speculate on the accuracy of the resulting R matrix. It is likely that there is a problem with the input data; the points (m, w), the t-vector, or the A matrix.