I'm very confused when it is necessary for a homogeneous coordinate (x, y, z, w)
get divided by w
(ie. converting to (x/w, y/w, z/w, 1)
).
According to this page, when a homogeneous vertex coordinate is passed through a orthographical or projective matrix, the coordinate will be in clip coordinates. Perspective division is necessary for clip coordinates to get NDC. It will produce x, y, and z values in range (-1, 1).
I was following WebGL tutorial pages on orthographical and perspective projection. These tutorials didn't mention a single word about perspective division. I'm not sure if the division is still necessary after multiplying with projective matrix. Perhaps, the division is performed automatically during matrix multiplication?
The perspective divide (the conversion from clip space to NDC space) is neither necessary nor unnecessary; it is a part of the graphics pipeline. It happens automatically to every vertex that passes through the system.
You can make it into a no-op by setting a vertex's W component to 1.0 (which is what the orthographic projection matrix does, assuming the input position had a W of 1.0). But the division always happens.