Tags: opengl, 3d, glsl, shader, raycasting

World coordinates wrong when Raycasting and Rasterizing in the same scene


I am rendering two cubes, both side length 1, centered at the origin (min coords. [-.5, -.5, -.5], max coords. [.5, .5, .5]), one with rasterization (vertices set by vertex shader), another with ray-casting (each pixel casts a ray through a cube). The problem is that the two do not seem to share world-space coordinates when transformed by the projection, view, model matrices and their inverses.

One is drawn by passing vertex coordinates to GL vertex shader, placing gl_Position via the canonical:

gl_Position = proj * view * model * vec4(in_position, 1);
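As a sanity check, this rasterization chain can be reproduced off-line. The sketch below (NumPy, with a stand-in perspective and view matrix; all names are ours, not from the question) pushes one cube corner through proj * view * model and applies the perspective divide that GL performs after the vertex shader:

```python
import numpy as np

# Stand-in conventional OpenGL perspective matrix; depth maps to [-1, 1]
def perspective(fov_y, aspect, near, far):
    f = 1.0 / np.tan(fov_y / 2.0)
    return np.array([
        [f / aspect, 0, 0, 0],
        [0, f, 0, 0],
        [0, 0, (far + near) / (near - far), 2 * far * near / (near - far)],
        [0, 0, -1, 0],
    ])

proj = perspective(np.deg2rad(60), 1.0, 0.1, 100.0)
view = np.eye(4); view[2, 3] = -3.0       # camera at (0, 0, 3) looking down -z
model = np.eye(4)

# gl_Position = proj * view * model * vec4(in_position, 1);
p = proj @ view @ model @ np.array([0.5, 0.5, 0.5, 1.0])   # a cube corner
ndc = p[:3] / p[3]   # the perspective divide GL performs after the vertex shader
print(ndc)           # x and y ≈ 0.3464; z lands inside [-1, 1]
```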

The second is generated by ray-casting, using the inverses of the projection and view matrices; the model matrix is set to identity.

To generate the ray start position for ray-casting, we call:

vec4 ray_origin = inverse(view) * vec4(0, 0, 0, 1);
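This works because the inverse view matrix maps the eye-space origin back to the camera's world position. A small NumPy check, assuming a conventional look-at view matrix (the helper below is ours, not from the question):

```python
import numpy as np

# A conventional look-at view matrix (helper and names are ours):
def look_at(eye, target, up):
    f = target - eye; f = f / np.linalg.norm(f)        # forward
    s = np.cross(f, up); s = s / np.linalg.norm(s)     # right
    u = np.cross(s, f)                                 # true up
    m = np.eye(4)
    m[0, :3], m[1, :3], m[2, :3] = s, u, -f
    m[:3, 3] = -m[:3, :3] @ eye
    return m

eye = np.array([1.0, 2.0, 3.0])
view = look_at(eye, np.zeros(3), np.array([0.0, 1.0, 0.0]))

# inverse(view) * vec4(0, 0, 0, 1) recovers the camera's world position:
ray_origin = np.linalg.inv(view) @ np.array([0.0, 0.0, 0.0, 1.0])
print(ray_origin[:3])   # ≈ [1, 2, 3]
```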

To generate the ray direction for raycasting, we call:

uniform vec2 resolution; // width, height of viewport in pixels

vec3 ray_dir(){

    // Normalized device coordinates, state after perspective divide
    vec2 ndc = (gl_FragCoord.xy / resolution) * 2.0 - 1.0;

    // From normalized screen space, compute the clip-space coordinates
    vec4 clip = vec4(ndc.xy, -1, 0);

    // Transform to eye space, using the inverse of the projection matrix
    vec4 eye = inverse(proj) * clip;
    eye = vec4(eye.xy, -1, 0);

    // Transform to world space, using the inverse of the view matrix
    vec4 world = inverse(view) * eye;

    return normalize(world.xyz);

}
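For reference, here is a NumPy transliteration of ray_dir() (projection and view matrices are stand-ins), checking that the centre pixel's ray points straight down the camera's forward axis:

```python
import numpy as np

# Stand-in conventional OpenGL perspective matrix
def perspective(fov_y, aspect, near, far):
    f = 1.0 / np.tan(fov_y / 2.0)
    return np.array([
        [f / aspect, 0, 0, 0],
        [0, f, 0, 0],
        [0, 0, (far + near) / (near - far), 2 * far * near / (near - far)],
        [0, 0, -1, 0],
    ])

proj = perspective(np.deg2rad(60), 800.0 / 600.0, 0.1, 100.0)
view = np.eye(4); view[2, 3] = -3.0   # camera at (0, 0, 3), forward is -z

def ray_dir(frag_xy, resolution):
    ndc = (frag_xy / resolution) * 2.0 - 1.0
    clip = np.array([ndc[0], ndc[1], -1.0, 0.0])
    eye = np.linalg.inv(proj) @ clip
    eye = np.array([eye[0], eye[1], -1.0, 0.0])   # w = 0: a direction, not a point
    world = np.linalg.inv(view) @ eye
    return world[:3] / np.linalg.norm(world[:3])

res = np.array([800.0, 600.0])
d = ray_dir(res / 2.0, res)   # the centre pixel
print(d)                      # ≈ [0, 0, -1]
```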

Finally, to ray-cast the cube we use a simple ray-box intersection function:

uniform vec3 start; // camera position in world space
vec2 cube_intersection() {
    vec3 dir = ray_dir();
    // no intersection means vec.x > vec.y
    float size = 0.5;
    vec3 t_min = (-vec3(size) - start) / dir;
    vec3 t_max = (+vec3(size) - start) / dir;
    vec3 t1 = min(t_min, t_max);
    vec3 t2 = max(t_min, t_max);    
    float t_near = max(max(t1.x, t1.y), t1.z);
    float t_far = min(min(t2.x, t2.y), t2.z);
    return vec2(max(t_near, 0), t_far);
}

Here start is a uniform holding the camera's world-space position, and any fragment where t_near > t_far is discarded.
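The slab method itself is sound and easy to verify numerically outside the shader. A NumPy transliteration (with start and dir passed in as arguments), cast along the cube's diagonal:

```python
import numpy as np

# The slab method above, transliterated from the GLSL:
def cube_intersection(start, dir):
    size = 0.5
    t_min = (-size - start) / dir
    t_max = (+size - start) / dir
    t1 = np.minimum(t_min, t_max)
    t2 = np.maximum(t_min, t_max)
    t_near = max(t1.max(), 0.0)
    t_far = t2.min()
    return t_near, t_far   # no intersection when t_near > t_far

# A ray from (2, 2, 2) straight toward the origin, along the cube's diagonal:
start = np.array([2.0, 2.0, 2.0])
dir = -np.ones(3) / np.sqrt(3.0)
t_near, t_far = cube_intersection(start, dir)
print(t_near, t_far)   # ≈ 2.598 (1.5·√3) and ≈ 4.330 (2.5·√3)
```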

It should be noted that there is nothing unusual about the view and projection matrices: the view matrix holds rotation and translation in all the familiar places, and the projection matrix is a simple perspective matrix.

The result should be two cubes identically rendered in world space, but this is not what happens!

Here the camera is set a distance of 1 from the origin: [Image @ distance 1]

From a distance of 0.5, the ray-cast cube is exactly 2.0x larger than it should be: [Image @ distance 0.5]

A camera distance of 10 brings it closer to its correct size: [Image @ distance 10]

A distance of 100 makes it fit nearly perfectly within the rasterized box: [Image @ distance 100]

Note that I'm adjusting the FOV significantly in the projection matrix for each of these.

As you can see, the ray-cast cube grows the closer the camera gets to the origin, and shrinks the farther the camera gets from the origin, asymptotically approaching the correct world-space size.

If I multiply the start (camera) position by (1.0 / length(start) + 1.0), then all of a sudden we get the correct world-space size for the cube!

Here it is at a distance of 1 again: [Image @ distance 1 again] Huzzah!

uniform vec3 start; // camera position in world space
vec2 cube_intersection() {
    vec3 dir = ray_dir();
    // no intersection means vec.x > vec.y
    float size = 0.5;
    vec3 t_min = (-vec3(size) - start*(1.0 / length(start) + 1.0)) / dir; // **here**
    vec3 t_max = (+vec3(size) - start*(1.0 / length(start) + 1.0)) / dir; // **here**
    vec3 t1 = min(t_min, t_max);
    vec3 t2 = max(t_min, t_max);    
    float t_near = max(max(t1.x, t1.y), t1.z);
    float t_far = min(min(t2.x, t2.y), t2.z);
    return vec2(max(t_near, 0), t_far);
}

The problem is that if the camera points away from the origin by any amount at all, the ray-cast cube translates by 2x as much as it would if it were rasterized by the vertex shader instead!

Here we are looking at the point (0, 0, 0.5) from the camera location (sqrt(3), sqrt(3), sqrt(3)): [Image: toward [0,0,0.5] from [0.707, 0.707, 0.707]]

I have a sense that this has something to do with the w component, which divides vertex positions after the vertex shader runs (the perspective divide), but I am not sure exactly what the issue is or how to remedy it during ray-casting.

Surely it is not a fundamental limitation that ray-casting and rasterization cannot mix in world space?


Solution

  • Solved this one with the help of this Khronos post from 2014.

    Ultimately, the idea is to calculate the near and far points of the clip-space frustum, which are (X,Y,-1,1) and (X,Y,1,1) respectively. The ray origin is the near point and the ray direction is the normalized vector between the far point and near point.

    The key to this working is to calculate those variables in the vertex shader, for all the vertices of the fullscreen quad:

    in vec2 in_texcoord_0; // the uvs of the fullscreen quad!
    uniform mat4 proj, view, model;
    out vec4 v_nearpos, v_farpos;
    
    void main(){ // VERTEX SHADER
        v_nearpos = inverse(proj * view * model) * vec4(in_texcoord_0.xy * 2 - 1, -1, 1);
        v_farpos  = inverse(proj * view * model) * vec4(in_texcoord_0.xy * 2 - 1, +1, 1);
        // etc...
    }
    

    and then, in the fragment shader (after interpolation from the vertex shader!), divide each by its w component to complete the perspective divide:

        vec3 nearpos = v_nearpos.xyz / v_nearpos.w;
        vec3 farpos = v_farpos.xyz / v_farpos.w;
    
        vec3 ray_pos = nearpos;
        vec3 ray_dir = normalize(farpos - nearpos);
    

    yielding a correct world space cube using the cube intersection function:

    vec2 cube_intersection(vec3 start, vec3 dir) {
        // no intersection means vec.x > vec.y
        float size = 0.5;
    
        vec3 t_min = (-vec3(size) - start) / dir;
        vec3 t_max = (+vec3(size) - start) / dir;
    
        vec3 t1 = min(t_min, t_max);
        vec3 t2 = max(t_min, t_max);
        
        float t_near = max(max(t1.x, t1.y), t1.z);
        float t_far = min(min(t2.x, t2.y), t2.z);
    
        return vec2(t_near, t_far);
    }
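The whole near/far unprojection can also be checked numerically: project a world-space point to NDC exactly as the rasterizer would, unproject the near and far frustum points at that pixel, and confirm the reconstructed ray passes through the original point. A NumPy sketch (the matrices are stand-ins, not from the question):

```python
import numpy as np

# Stand-in conventional OpenGL perspective matrix
def perspective(fov_y, aspect, near, far):
    f = 1.0 / np.tan(fov_y / 2.0)
    return np.array([
        [f / aspect, 0, 0, 0],
        [0, f, 0, 0],
        [0, 0, (far + near) / (near - far), 2 * far * near / (near - far)],
        [0, 0, -1, 0],
    ])

proj = perspective(np.deg2rad(60), 1.0, 0.1, 100.0)
view = np.eye(4); view[2, 3] = -3.0      # camera at (0, 0, 3) looking down -z
inv_pvm = np.linalg.inv(proj @ view)     # model = identity

# Project a world-space point to NDC, as the rasterizer would:
p_world = np.array([0.3, -0.2, 0.4, 1.0])
clip = proj @ view @ p_world
ndc = clip[:3] / clip[3]

# Unproject the near (z = -1) and far (z = +1) frustum points at that pixel:
near = inv_pvm @ np.array([ndc[0], ndc[1], -1.0, 1.0])
far  = inv_pvm @ np.array([ndc[0], ndc[1], +1.0, 1.0])
nearpos = near[:3] / near[3]             # perspective divide, as in the
farpos  = far[:3] / far[3]               # fragment shader above

ray_pos = nearpos
ray_dir = (farpos - nearpos) / np.linalg.norm(farpos - nearpos)

# The reconstructed ray passes through the original world-space point:
t = np.dot(p_world[:3] - ray_pos, ray_dir)
error = np.linalg.norm(ray_pos + t * ray_dir - p_world[:3])
print(error)   # ≈ 0
```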
    

    huzzah

    Hope this helps someone else! Upvote if it helped you!

I think the reason this took me all week to figure out is that unless you get really close to the origin, the effect is not obvious! Plus, every ray-casting ShaderToy example has no need for interop with vertex-shader geometry, so they all use equivalents of the code I was using before!