
What is the exact difference between the camera transformation matrix and the view matrix (using OpenGL and LWJGL)?


I've read a lot about this before asking, and my understanding is that the view matrix is simply the inverse of the camera's transformation matrix. For clarity: if we treat the camera as an entity that is transformed like any other 3D object in the scene (with a transformation matrix that translates, then rotates, then scales), we obtain the camera transformation matrix, which contains the camera's position. Inverting that matrix should yield the view matrix, but that is not what happens in my code.

I have two static methods. The first creates a transformation matrix from a position, a rotation around each of the three axes, and a single scale value applied uniformly to all axes (translate, then rotate, then scale). The second creates the view matrix from a Camera, which has a yaw (rotation around the y axis), a pitch (rotation around the x axis), and a Vec3 position; here we first rotate and then translate by the negated position, because moving the camera is equivalent to moving the world in the opposite direction. Here is the code for the transformation matrix:

import org.lwjgl.util.vector.Matrix4f;
import org.lwjgl.util.vector.Vector3f;

public static Matrix4f createTransformationMatrix(Vector3f translation, float rx, float ry,
        float rz, float scale) {

    Matrix4f matrix = new Matrix4f();
    matrix.setIdentity();

    // Each call right-multiplies onto 'matrix', so the call order
    // translate -> rotate -> scale builds M = T * Rx * Ry * Rz * S.
    Matrix4f.translate(translation, matrix, matrix);

    Matrix4f.rotate((float) Math.toRadians(rx), new Vector3f(1, 0, 0), matrix, matrix);
    Matrix4f.rotate((float) Math.toRadians(ry), new Vector3f(0, 1, 0), matrix, matrix);
    Matrix4f.rotate((float) Math.toRadians(rz), new Vector3f(0, 0, 1), matrix, matrix);

    Matrix4f.scale(new Vector3f(scale, scale, scale), matrix, matrix);

    return matrix;
}

Here is the code for the view matrix:

public static Matrix4f createViewMatrix(Camera camera) {
    Matrix4f viewMatrix = new Matrix4f();
    viewMatrix.setIdentity();

    // Rotation part: V starts with Rx(pitch) * Ry(yaw).
    Matrix4f.rotate((float) Math.toRadians(camera.getPitch()), new Vector3f(1, 0, 0), viewMatrix, viewMatrix);
    Matrix4f.rotate((float) Math.toRadians(camera.getYaw()), new Vector3f(0, 1, 0), viewMatrix, viewMatrix);

    // Translation part: move the world by the camera's negated position.
    Vector3f cameraPos = camera.getPosition();
    Vector3f negativeCameraPos = new Vector3f(-cameraPos.x, -cameraPos.y, -cameraPos.z);
    Matrix4f.translate(negativeCameraPos, viewMatrix, viewMatrix);

    return viewMatrix;
}

Here comes the problem: I followed a YouTube tutorial to build these two matrices, so I didn't write this code myself, and I don't understand how the viewMatrix is the inverse of the camera transformation matrix. I've noticed that in createViewMatrix() we first rotate and then translate (by the negated position), while in createTransformationMatrix() we first translate, then rotate, then scale.

If I understand correctly, I should be able to build a transformation matrix from the Camera's data and then invert it to obtain the view matrix, but that doesn't work. I also tried, in createViewMatrix(), first translating by the positive position (without computing negativeCameraPos), then rotating, then inverting the matrix. Same result: it doesn't work, and the rendered scene comes out visibly wrong in ways I can't easily describe. I've tried many other variations, but only the code above works.

Can you explain to me how first rotating and then translating by the negative camera position produces the inverse of the camera transformation matrix? Sorry for the long question, but I wanted to make my problem clear on the first attempt. Thank you.


Solution

  • Your basic understanding of the camera and view matrix is correct: the camera matrix describes the position and orientation of the viewer/camera in the world, while the view matrix transforms from world space to view space, so it should be the inverse of the camera matrix.
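
    As a concrete illustration, here is a minimal sketch (using LWJGL 2's org.lwjgl.util.vector API; the camera values are made up) that builds a camera matrix and inverts it to obtain the view matrix:

    import org.lwjgl.util.vector.Matrix4f;
    import org.lwjgl.util.vector.Vector3f;

    public class ViewFromCameraDemo {
        public static void main(String[] args) {
            // A camera sitting at (4, 3, 2), turned 45 degrees around the y axis.
            // Call order translate -> rotate builds C = T * R, i.e. the camera is
            // oriented at the origin first and then moved to its world position.
            Matrix4f cameraMatrix = new Matrix4f(); // constructed as identity
            Matrix4f.translate(new Vector3f(4, 3, 2), cameraMatrix, cameraMatrix);
            Matrix4f.rotate((float) Math.toRadians(45), new Vector3f(0, 1, 0),
                    cameraMatrix, cameraMatrix);

            // The view matrix is the inverse of the camera's world transform
            // (passing null makes invert() allocate the destination matrix).
            Matrix4f viewMatrix = Matrix4f.invert(cameraMatrix, null);
            System.out.println(viewMatrix);
        }
    }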

    Note that in matrix math the order in which transformations are applied matters: rotating and then translating is different from translating and then rotating. (We'll leave scaling out of the equation here, since you normally don't scale a camera; zooming is done via the projection matrix.)
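
    You can see this concretely with a made-up example (a fragment assuming the imports above, plus org.lwjgl.util.vector.Vector4f): applying the two orders to the same point lands it in different places.

    // T * R versus R * T applied to the same point.
    Matrix4f tThenR = new Matrix4f();
    Matrix4f.translate(new Vector3f(10, 0, 0), tThenR, tThenR);
    Matrix4f.rotate((float) Math.toRadians(90), new Vector3f(0, 1, 0), tThenR, tThenR);

    Matrix4f rThenT = new Matrix4f();
    Matrix4f.rotate((float) Math.toRadians(90), new Vector3f(0, 1, 0), rThenT, rThenT);
    Matrix4f.translate(new Vector3f(10, 0, 0), rThenT, rThenT);

    Vector4f p = new Vector4f(0, 0, 1, 1);
    System.out.println(Matrix4f.transform(tThenR, p, null)); // ~(11, 0, 0, 1)
    System.out.println(Matrix4f.transform(rThenT, p, null)); // ~(1, 0, -10, 1)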

    When building your camera matrix you first rotate to set the camera's orientation and then translate to set its position. That is, you treat the camera as sitting at 0/0/0, looking down the negative z axis (so the view vector is 0/0/-1 in OpenGL's convention). After rotating you get a different normalized view vector, but the camera still "sits" at 0/0/0. Then you translate to the actual camera position. (You might need additional matrix operations to calculate that position, but I'd do that in a separate step for starters, until you get things right.)
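
    Written in the same style as your createTransformationMatrix, such a camera matrix could look like the following sketch (createCameraMatrix is a hypothetical helper, not part of your code; the negated angles are explained below):

    public static Matrix4f createCameraMatrix(Camera camera) {
        Matrix4f cameraMatrix = new Matrix4f();
        cameraMatrix.setIdentity();

        // Each call right-multiplies, so translate -> rotate builds C = T * R.
        Matrix4f.translate(camera.getPosition(), cameraMatrix, cameraMatrix);

        // Opposite order and sign of the rotations in createViewMatrix,
        // so that inverting this matrix reproduces the view matrix exactly.
        Matrix4f.rotate((float) Math.toRadians(-camera.getYaw()), new Vector3f(0, 1, 0),
                cameraMatrix, cameraMatrix);
        Matrix4f.rotate((float) Math.toRadians(-camera.getPitch()), new Vector3f(1, 0, 0),
                cameraMatrix, cameraMatrix);

        return cameraMatrix;
    }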

    Can you explain to me how first rotating and then translating by the negative camera position produces the inverse of the camera transformation matrix?

    On its own, rotating by the camera's angles is not the inverse of the camera's rotation: the inverse of a rotation by some angle is a rotation by the negated angle about the same axis, and when several rotations are combined, the inverse applies the negated angles in reverse order. Inverting a camera matrix of the form C = T(position) * Ry(-yaw) * Rx(-pitch) therefore gives exactly your view matrix, V = Rx(pitch) * Ry(yaw) * T(-position): negated angles, reversed order, negated translation. Your own experiment failed because you built the camera matrix with the same positive angles and the same x-then-y rotation order that createViewMatrix uses, so its inverse could not match your view matrix.
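
    You can check the relationship directly with LWJGL's matrix inversion (a hedged snippet: createCameraMatrix is the sketch above, and camera is assumed to be an existing Camera instance):

    Matrix4f cameraMatrix = createCameraMatrix(camera);
    Matrix4f inverted = Matrix4f.invert(cameraMatrix, null);
    Matrix4f viewMatrix = createViewMatrix(camera);

    // Up to floating-point error, both matrices hold the same 16 values.
    System.out.println(inverted);
    System.out.println(viewMatrix);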

    Negative translation is fine: if you move the camera to 4/3/2 in world space, a translation by -4/-3/-2 undoes the camera's movement, mapping the camera's own position to the view-space origin (the rotation part of the view matrix then takes care of orientation).
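
    For the translation part in isolation (a tiny fragment, same imports as above):

    // A camera at (4, 3, 2): translating by (-4, -3, -2) sends the camera's
    // world position to the view-space origin.
    Matrix4f view = new Matrix4f();
    Matrix4f.translate(new Vector3f(-4, -3, -2), view, view);

    Vector4f cameraWorldPos = new Vector4f(4, 3, 2, 1);
    System.out.println(Matrix4f.transform(view, cameraWorldPos, null)); // (0, 0, 0, 1)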