Tags: camera, computer-vision, 3d-reconstruction, stereoscopy

Independent rotation of stereoscopic cameras


I have two cameras pointing at the same scene. When they are parallel to each other, I can convert from a real location to each screen coordinate and from two screen coordinates to a real location.


From a real location to each screen coordinate (the focal length f is known):

xl = XL / Z * f
yl = YL / Z * f
xr = XR / Z * f
yr = YR / Z * f
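For concreteness, a minimal sketch of that forward projection (the function name and the use of cv::Point2d are mine, not part of the setup above):

    #include <opencv2/core.hpp>
    
    // Pinhole projection: a point at lateral offset X, height Y and depth Z
    // (all expressed in the camera's own frame) lands at (X/Z*f, Y/Z*f).
    cv::Point2d project(double X, double Y, double Z, double f) {
        return cv::Point2d(X / Z * f, Y / Z * f);
    }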

From two screen coordinates to a real location:

XL + XR = D
xl / f = XL / Z
xr / f = XR / Z

Z = f * D / (xl + xr)
XL = xl / f * Z
YL = yl / f * Z
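And a sketch of the reverse direction, under the same conventions (the equation XL + XR = D means the lateral offsets of the two cameras sum to the baseline D; triangulateParallel is a name I made up):

    #include <opencv2/core.hpp>
    
    // Parallel-camera triangulation: recover the 3D point (relative to the
    // left camera) from the image coordinates of the same scene point.
    cv::Point3d triangulateParallel(double xl, double yl, double xr,
                                    double f, double D) {
        double Z  = f * D / (xl + xr);  // depth from the disparity xl + xr
        double XL = xl / f * Z;
        double YL = yl / f * Z;
        return cv::Point3d(XL, YL, Z);
    }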

The cameras now have two independent three-axis rotations (α, β, ζ) and (α', β', ζ'). These are really their yaw, pitch and roll: each camera first rotates by α about the y axis, then by β about its new x axis, and finally by ζ about its new z axis.

I can still convert from a real location to each screen coordinate by rotating the real position and applying the same formulas as in the parallel case above:

(AL, BL, CL) = rot33_axis3(ζ) * rot33_axis1(β) * rot33_axis2(α) * (XL, YL, Z)
(AR, BR, CR) = rot33_axis3(ζ') * rot33_axis1(β') * rot33_axis2(α') * (XR, YR, Z)
xl = AL / CL * f
yl = BL / CL * f
xr = AR / CR * f
yr = BR / CR * f
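In code, that rotated projection can be sketched as below. The helpers rotY, rotX and rotZ correspond to rot33_axis2, rot33_axis1 and rot33_axis3, assuming the usual right-handed rotation matrices; the sign conventions may differ from your rot33_axis functions.

    #include <opencv2/core.hpp>
    #include <cmath>
    
    // Elementary rotations about the y, x and z axes.
    cv::Matx33d rotY(double a) {
        return cv::Matx33d( std::cos(a), 0, std::sin(a),
                            0,           1, 0,
                           -std::sin(a), 0, std::cos(a));
    }
    cv::Matx33d rotX(double b) {
        return cv::Matx33d(1, 0,            0,
                           0, std::cos(b), -std::sin(b),
                           0, std::sin(b),  std::cos(b));
    }
    cv::Matx33d rotZ(double c) {
        return cv::Matx33d(std::cos(c), -std::sin(c), 0,
                           std::sin(c),  std::cos(c), 0,
                           0,            0,           1);
    }
    
    // Rotate the point into the camera frame (yaw, then pitch, then roll),
    // then apply the pinhole model.
    cv::Point2d projectRotated(const cv::Matx31d& P, double alpha,
                               double beta, double zeta, double f) {
        cv::Matx31d q = rotZ(zeta) * rotX(beta) * rotY(alpha) * P;
        return cv::Point2d(q(0) / q(2) * f, q(1) / q(2) * f);
    }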

I have tested this, and the calculated coordinates match the screen.

My problem now is to calculate the real location from the two screen coordinates. I'm doing:

(al, bl, cl) = rot33_axis2(-α) * rot33_axis1(-β) * rot33_axis3(-ζ) * (xl, yl, f)
(ar, br, cr) = rot33_axis2(-α') * rot33_axis1(-β') * rot33_axis3(-ζ') * (xr, yr, f)

XL + XR = D
al / f = XL / Z
ar / f = XR / Z

Z = f * D / (al + ar)
XL = al / f * Z
YL = bl / f * Z

Unfortunately, that doesn't work.

My idea is to take the screen coordinate, assign the focal length as its z value, apply the rotation matrices in reverse order with negated angles (at that point, the image planes have been rotated "back" so they are parallel to the line joining the two cameras) and then apply the same formulas as in the first case.

What am I doing wrong? Is it wrong to start with (xl, yl, f)?

EDIT 1:

Based on aledalgrande's answer, here is some OpenCV code:

//Image is 640x360, focal is 0.42
Matx33d camMat = Matx33d(
0.42f * 640.0f, 0.0f, 320.0f,
0.0f, 0.42f * 360.0f, 180.0f,
0.0f, 0.0f, 1.0f);
Matx41d distCoeffs = Matx41d(0.0f, 0.0f, 0.0f, 0.0f);

Matx31d rvec0, tvec0, rvec1, tvec1;

solvePnP(objPoints, imgPoints0, camMat, distCoeffs, rvec0, tvec0);
solvePnP(objPoints, imgPoints1, camMat, distCoeffs, rvec1, tvec1);
//Results make sense if I use projectPoints

Matx33d rot0;
Rodrigues(rvec0, rot0);
Matx34d P0 = Matx34d(
rot0(0, 0), rot0(0, 1), rot0(0, 2), tvec0(0),
rot0(1, 0), rot0(1, 1), rot0(1, 2), tvec0(1),
rot0(2, 0), rot0(2, 1), rot0(2, 2), tvec0(2));

Matx33d rot1;
Rodrigues(rvec1, rot1);
Matx34d P1 = Matx34d(
rot1(0, 0), rot1(0, 1), rot1(0, 2), tvec1(0),
rot1(1, 0), rot1(1, 1), rot1(1, 2), tvec1(1),
rot1(2, 0), rot1(2, 1), rot1(2, 2), tvec1(2));

Point u0_(353, 156);
Point u1_(331, 94);
Point2d u0, u1; //normalized image coordinates

Matx33d camMatInv = camMat.inv();
//Back-project the pixels to normalized image coordinates
u0.x = u0_.x * camMatInv(0, 0) + u0_.y * camMatInv(0, 1) + 1.0f * camMatInv(0, 2);
u0.y = u0_.x * camMatInv(1, 0) + u0_.y * camMatInv(1, 1) + 1.0f * camMatInv(1, 2);
u1.x = u1_.x * camMatInv(0, 0) + u1_.y * camMatInv(0, 1) + 1.0f * camMatInv(0, 2);
u1.y = u1_.x * camMatInv(1, 0) + u1_.y * camMatInv(1, 1) + 1.0f * camMatInv(1, 2);

Matx14d A1(u0.x * P0(2, 0) - P0(0, 0), u0.x * P0(2, 1) - P0(0, 1), u0.x * P0(2, 2) - P0(0, 2), u0.x * P0(2, 3) - P0(0, 3));
Matx14d A2(u0.y * P0(2, 0) - P0(1, 0), u0.y * P0(2, 1) - P0(1, 1), u0.y * P0(2, 2) - P0(1, 2), u0.y * P0(2, 3) - P0(1, 3));
Matx14d A3(u1.x * P1(2, 0) - P1(0, 0), u1.x * P1(2, 1) - P1(0, 1), u1.x * P1(2, 2) - P1(0, 2), u1.x * P1(2, 3) - P1(0, 3));
Matx14d A4(u1.y * P1(2, 0) - P1(1, 0), u1.y * P1(2, 1) - P1(1, 1), u1.y * P1(2, 2) - P1(1, 2), u1.y * P1(2, 3) - P1(1, 3));

double normA1 = norm(A1), normA2 = norm(A2), normA3 = norm(A3), normA4 = norm(A4);

Matx44d A(
A1(0) / normA1, A1(1) / normA1, A1(2) / normA1, A1(3) / normA1,
A2(0) / normA2, A2(1) / normA2, A2(2) / normA2, A2(3) / normA2,
A3(0) / normA3, A3(1) / normA3, A3(2) / normA3, A3(3) / normA3,
A4(0) / normA4, A4(1) / normA4, A4(2) / normA4, A4(3) / normA4);

SVD svd;
Matx41d u; //homogeneous 3D point
svd.solveZ(A, u);

Solution

  • You cannot use the simplified formula for triangulation if the cameras are rotated (general case). You will have to resort to linear triangulation (or other methods if you want a more accurate result).

    // points u0 and u1, projection matrices firstP and secondP
    
    // "Multiple View Geometry in Computer Vision" 12.2 and 4.1.1
    cv::Matx14d A1 = u0(0) * firstP.row(2) - firstP.row(0);
    cv::Matx14d A2 = u0(1) * firstP.row(2) - firstP.row(1);
    cv::Matx14d A3 = u1(0) * secondP.row(2) - secondP.row(0);
    cv::Matx14d A4 = u1(1) * secondP.row(2) - secondP.row(1);
    
    double normA1 = cv::norm(A1), normA2 = cv::norm(A2), normA3 = cv::norm(A3), normA4 = cv::norm(A4);
    
    cv::Matx44d A(A1(0) / normA1, A1(1) / normA1, A1(2) / normA1, A1(3) / normA1,
                  A2(0) / normA2, A2(1) / normA2, A2(2) / normA2, A2(3) / normA2,
                  A3(0) / normA3, A3(1) / normA3, A3(2) / normA3, A3(3) / normA3,
                  A4(0) / normA4, A4(1) / normA4, A4(2) / normA4, A4(3) / normA4);
    
    cv::SVD svd;
    cv::Matx41d pointHomogeneous;
    svd.solveZ(A, pointHomogeneous);
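Note that solveZ returns the point in homogeneous coordinates; to get the Euclidean point you still have to divide by the last component:

    // Dehomogenize: (x, y, z, w) -> (x/w, y/w, z/w)
    cv::Point3d point(pointHomogeneous(0) / pointHomogeneous(3),
                      pointHomogeneous(1) / pointHomogeneous(3),
                      pointHomogeneous(2) / pointHomogeneous(3));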