c++ graphics 3d rendering pixel

Math behind finding screen coordinates in a path tracer


I have been given a framework that implements a simple path tracer. Right now I am trying to understand all of the code, because I will need to modify it. Unfortunately I have reached a step where I don't understand what is happening, and since I am new to advanced graphics I can't "decrypt" this part. According to the comments, the developer is computing the coordinates of the screen corners. What I need to understand is the math behind it, and with it some of the variables that are used. Here is the code:

// setup virtual screen plane
vec3 E( 2, 8, -26 ), V( 0, 0, 1 );
static float r = 1.85f;
mat4 M = rotate( mat4( 1 ), r, vec3( 0, 1, 0 ) );
float d = 0.5f, ratio = SCRWIDTH / SCRHEIGHT, focal = 18.0f;
vec3 p1( E + V * focal + vec3( -d * ratio * focal,  d * focal, 0 ) ); // top-left screen corner
vec3 p2( E + V * focal + vec3(  d * ratio * focal,  d * focal, 0 ) ); // top-right screen corner
vec3 p3( E + V * focal + vec3( -d * ratio * focal, -d * focal, 0 ) ); // bottom-left screen corner
p1 = vec3( M * vec4( p1, 1.0f ) );
p2 = vec3( M * vec4( p2, 1.0f ) );
p3 = vec3( M * vec4( p3, 1.0f ) );

For example:

  • What is the "d" variable, and why are both "d" and "focal" fixed?
  • Is "focal" the focal length?
  • What are the "E" and "V" vectors?
  • Is the matrix "M" the CameraToWorldCoordinates matrix?

I need to understand every step of those formulas if possible, the variables, and the math used in those few lines of code. Thanks in advance.


Solution

  • My guesses:

    E: eye position—position of the eye/camera in world space

    V: view direction—the direction the camera is looking, in world coordinates

    d: named constant for one half—corners are half the screen size away from the centre (where the camera is looking)

    focal: distance of the image plane from the camera. Because the vertical corner offsets are ±d * focal and d is 0.5, it also works out to be the height of the image plane in world coordinates.

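    M: I'd say this is the WorldToCamera matrix. It's used to transform a point which is based on E. As written, it is built from the identity as a rotation by r about the Y axis, so it contains no translation.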
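To make the roles of d and focal more concrete, here is a minimal sketch in plain C++ (no math library) that derives the image plane size and the vertical field of view implied by those constants. PlaneInfo and describePlane are names made up for this illustration and are not part of the framework in the question; the aspect ratio of 1.6 used in the note below is likewise an assumed value.

```cpp
#include <cmath>

// Hypothetical helper type for illustration; not part of the framework.
struct PlaneInfo {
    float width;        // world-space width of the image plane
    float height;       // world-space height of the image plane
    float vfovDegrees;  // vertical field of view, in degrees
};

// d:     half-extent of the plane per unit of distance from the eye
// focal: distance from the eye to the image plane
// ratio: width / height of the screen
PlaneInfo describePlane(float d, float focal, float ratio) {
    PlaneInfo info;
    info.height = 2.0f * d * focal;          // corners sit at +/- d * focal in Y
    info.width  = 2.0f * d * ratio * focal;  // and at +/- d * ratio * focal in X
    // Half the height divided by the distance is (d * focal) / focal = d,
    // so the field of view depends on d only, not on focal.
    const float radToDeg = 180.0f / 3.14159265358979f;
    info.vfovDegrees = 2.0f * std::atan(d) * radToDeg;
    return info;
}
```

With d = 0.5, focal = 18 and an assumed ratio of 1.6, this gives a plane 28.8 wide and 18 high, with a vertical field of view of about 53.13 degrees. Changing focal alone scales the plane and its distance together and leaves the angle unchanged, which is one plausible reason both values can simply be hard-coded. Separately, if SCRWIDTH and SCRHEIGHT are integer constants (the snippet does not show their definitions), `SCRWIDTH / SCRHEIGHT` would be an integer division and would truncate the ratio.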

    How the points are computed:

    1. Start at the camera: E

    2. Move focal distance along the view direction, effectively moving to the centre of the image plane: + V * focal

    3. Add offsets on X & Y which will move half a screen distance: + vec3( ::: )

      Given that V does not figure in the vec3() arguments (nor does any up or right vector), this seems to hard-code the idea that V is collinear with the Z axis.

    4. Finally, the points are transformed as points (as opposed to directions, since their homogeneous coordinate is 1) by M.
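The four steps above can be sketched without any math library, which may make it easier to follow the arithmetic by hand. Vec3, corner and rotateY are hypothetical names written for this example. rotateY assumes that the framework's `rotate` follows the usual right-handed convention (as GLM's does) and takes its angle in radians; whether r = 1.85 is meant as radians or degrees depends on the library and is not visible in the snippet.

```cpp
#include <cmath>

// Minimal 3D vector type for illustration; the framework has its own vec3.
struct Vec3 { float x, y, z; };

Vec3 operator+(Vec3 a, Vec3 b) { return {a.x + b.x, a.y + b.y, a.z + b.z}; }
Vec3 operator*(Vec3 a, float s) { return {a.x * s, a.y * s, a.z * s}; }

// Steps 1 to 3: start at the eye E, move focal units along V to the centre
// of the image plane, then add an offset inside the plane.
// sx and sy are -1 or +1 and select the corner (left/right, bottom/top).
// Like the original code, this assumes V points along the Z axis, so the
// plane's right and up directions are the world X and Y axes.
Vec3 corner(Vec3 E, Vec3 V, float d, float ratio, float focal,
            float sx, float sy) {
    Vec3 centre = E + V * focal;
    Vec3 offset = { sx * d * ratio * focal, sy * d * focal, 0.0f };
    return centre + offset;
}

// Step 4: rotate a point about the world Y axis by `angle` radians.
// For a matrix that holds only a rotation, this is the same result as
// multiplying the matrix by (p, 1) and dropping the fourth component.
Vec3 rotateY(Vec3 p, float angle) {
    float c = std::cos(angle);
    float s = std::sin(angle);
    return { c * p.x + s * p.z, p.y, -s * p.x + c * p.z };
}
```

With the question's E = (2, 8, -26), V = (0, 0, 1), d = 0.5, focal = 18 and an assumed ratio of 1.6, `corner(E, V, 0.5f, 1.6f, 18.0f, -1.0f, 1.0f)` gives the top-left corner (-12.4, 17, -8) before rotation. Because the rotation is about the world origin rather than about E, applying it moves the whole screen plane around the origin instead of pivoting it around the eye.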