3d rotation computer-vision rotational-matrices

Tiny change in pitch results in 180 degree rotation when calculating orientation from vanishing points?


I must be making a mistake somewhere in my math, but I can't find the flaw in my logic. We can calculate a camera's rotational orientation relative to some 3D world coordinate system using the locations of 2 (of the 3) world-system vanishing points. If the camera is nearly aligned with the world system, we might find the X-axis vanishing point at position (10000000, 0) on the image plane and the Y-axis vanishing point at position (0, 10000000). This would mean that lines along each of these axes are nearly parallel in the image. With K being the camera's intrinsic parameter matrix, we can use (appending a 1 to each vanishing point to make it a 3D homogeneous vector):

r1 = (K.inverse * vp_x) / norm(K.inverse * vp_x)
r2 = (K.inverse * vp_y) / norm(K.inverse * vp_y)
r3 = cross_product(r1, r2)

To get the 3 columns of the rotation matrix, resulting in something close to:

1  0  0
0  1  0
0  0  1

Which is what we expect, since the system is nearly aligned. However, if we pitch the camera ever so slightly, the lines for the Y-axis converge at negative values of y on the image plane. For example, the vanishing point might now show up at (0, -10000000). But using the same math as before, we end up with a rotation matrix of:

1  0  0
0 -1  0
0  0 -1

Which suggests a 180 degree rotation around the X-axis. Of course, we're facing in nearly the same direction as before though, so a 180 degree rotation doesn't make sense. I must be missing something, but I don't know what. Any suggestions? Thanks!
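
The computation described above can be reproduced with a short sketch. It is written in plain Python and assumes a simple pinhole K with hypothetical values (focal length 500, principal point (200, 200)); the question does not give K, and the helper names are illustrative only. It shows that flipping the sign of the Y-axis vanishing point flips the signs of r2 and r3.

```python
import math

# Assumed pinhole intrinsics (hypothetical values, not from the question):
# K = [[F, 0, CX], [0, F, CY], [0, 0, 1]]
F, CX, CY = 500.0, 200.0, 200.0

def backproject(u, v):
    """Return K^-1 * (u, v, 1), normalized to unit length."""
    d = ((u - CX) / F, (v - CY) / F, 1.0)
    n = math.sqrt(sum(c * c for c in d))
    return tuple(c / n for c in d)

def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def rotation_columns(vp_x, vp_y):
    """Columns r1, r2, r3 computed from the X and Y vanishing points."""
    r1 = backproject(*vp_x)
    r2 = backproject(*vp_y)
    r3 = cross(r1, r2)
    return r1, r2, r3

# Y vanishing point far away at positive y, then at negative y.
before = rotation_columns((1e7, 0.0), (0.0, 1e7))
after = rotation_columns((1e7, 0.0), (0.0, -1e7))

for name, cols in (("before", before), ("after", after)):
    print(name, [tuple(round(c, 3) for c in col) for col in cols])
```

The two cases differ only in which side of the image the very distant vanishing point falls on, yet the normalized columns r2 and r3 change sign.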


Solution

  • You start from a wrong assumption. (A note on convention: here the rotation around the Y axis is called pitch, but in the camera convention the Y axis is vertical, while in the aerospace convention the vertical axis is Z. In computer vision it is sometimes clearer to call this rotation "panning"; roll then becomes "tilting".)

    1) In a pitch move, what varies is the convergence of the X-axis lines. So moving vy from (0,1e6) to (0,-1e6) is not a pitch move but a rotation around the X axis (roll).

    2) Remember that a rotation matrix can have singularities when a sine or a cosine goes to zero, so deriving it in that condition can be a problem.

    3) From point 2, remember you said:

    resulting in something close to

    and that makes all the difference! (I will show it with some Matlab code.) I will reverse the equation and project the 3 world unit vectors (versors). We are interested in the pitch movement, so I will project the X-axis unit vector (1,0,0). Note that its Z is 0 (this unit vector lies in the image plane)*. Let's build a pitch matrix (see this reference):

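    a = 0.02;   % pitch angle in radians (use -0.02 for the second case)
    % Assumed intrinsics: fx, cx, cy are consistent with the values printed below; fy is assumed equal to fx
    K = [500, 0, 200; 0, 500, 200; 0, 0, 1];
    R = [cos(a), 0, sin(a); 0, 1, 0; -sin(a), 0, cos(a)];   % rotation around the Y axis
    disp(R)

    M1 = [1; 0; 0];   % world X-axis unit vector
    u = K*R*M1;
    disp(u);
    M2 = [0; 1; 0];   % world Y-axis unit vector
    v = K*R*M2;
    disp(v);
    M3 = [0; 0; 1];   % world Z-axis unit vector
    w = K*R*M3;
    disp(w);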
    

    If I have a small positive pitch (a = 0.02) I have u equal to

      495.9003
       -3.9997
       -0.0200
    

    then with a small negative pitch (a = -0.02) u is equal to

      503.8997
        3.9997
        0.0200
    

    If you normalize both vectors (dividing by the norm, which is about 500), in both cases you obtain approximately (1, 0, 0), but you definitely lose the "direction" information, which is different in the two cases: the small second and third components have opposite signs. Yes, this example is about a projected segment, but the idea is the same.

    4) You can estimate the angles (in theory) from r3 without neglecting the near-zero components.

    ANGLE OF PITCH (rotation around Y) = arctan(r3(1)/r3(3))

    ANGLE OF ROLL (rotation around X) = arcsin(r3(2))

    5) If a rotation angle is positive when it is counterclockwise, should the vanishing point be positive or negative?
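
As a rough check of the angle formulas in point 4, the sketch below builds r3 for a known rotation and recovers the angles from it. It assumes the composition R = Ry(a) * Rx(b) with right-handed axis rotations, which the answer does not specify, and it uses atan2 in place of the plain arctan so that the quadrant is kept. Indices are 0-based here, so r3(1), r3(2), r3(3) become r3[0], r3[1], r3[2]. The function names are hypothetical.

```python
import math

def third_column(a, b):
    """Third column of R = Ry(a) * Rx(b), assuming right-handed rotations."""
    return (math.sin(a) * math.cos(b),
            -math.sin(b),
            math.cos(a) * math.cos(b))

def angles_from_r3(r3):
    """Recover the angles from r3 without dropping its small components."""
    pitch = math.atan2(r3[0], r3[2])   # arctan(r3(1) / r3(3)), quadrant-aware
    roll = math.asin(r3[1])            # arcsin(r3(2))
    return pitch, roll

pitch_pos, roll_pos = angles_from_r3(third_column(0.02, 0.01))
pitch_neg, roll_neg = angles_from_r3(third_column(-0.02, 0.01))
print(pitch_pos, pitch_neg, roll_pos)
```

With these assumptions the sign of the small pitch angle survives, and arcsin(r3(2)) returns the roll angle with a negative sign; which sign is "right" depends on the axis convention, which is the question raised in point 5.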


    *Because you are projecting a point which already lies in the image plane, i.e. a point at infinity as seen from the center of the camera, it can be projected either on the right or on the left. Always handle "infinity" with care.
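
The left/right ambiguity can be seen numerically. The following plain-Python sketch repeats the projection of the X-axis unit vector for a = 0.02 and a = -0.02 and then divides by the third homogeneous component. The matrix K is an assumption, chosen to be consistent with the u values quoted above; the helper names are illustrative only.

```python
import math

# Assumed intrinsics, consistent with the u values quoted in the answer.
K = [[500.0, 0.0, 200.0],
     [0.0, 500.0, 200.0],
     [0.0, 0.0, 1.0]]

def matvec(m, v):
    """Multiply a 3x3 matrix by a 3-vector."""
    return [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]

def project_x_axis(a):
    """Homogeneous image of the world direction (1, 0, 0) under R = Ry(a)."""
    r = [[math.cos(a), 0.0, math.sin(a)],
         [0.0, 1.0, 0.0],
         [-math.sin(a), 0.0, math.cos(a)]]
    return matvec(K, matvec(r, [1.0, 0.0, 0.0]))

def to_pixels(u):
    """Divide by the third homogeneous component to get image coordinates."""
    return (u[0] / u[2], u[1] / u[2])

u_pos = project_x_axis(0.02)
u_neg = project_x_axis(-0.02)
px_pos = to_pixels(u_pos)
px_neg = to_pixels(u_neg)
print(u_pos, px_pos)
print(u_neg, px_neg)
```

Under these assumptions the two homogeneous vectors are nearly identical, but their third components have opposite signs, so the vanishing point jumps from far to the left of the image to far to the right. Normalizing by the vector norm instead of keeping track of that sign discards exactly this information.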