I'm trying to project a few 3D points to screen coordinates to determine whether a touch occurs in roughly the same area. It should be noted that I'm doing this in Kivy, which uses Python and OpenGL. I've seen questions like this before, but I still don't have a solution. I've tried the following, but the numbers it produces are not close to screen coordinates.
def to2D(self, pos, width, height, modelview, projection):
    p = modelview*[pos[0], pos[1], pos[2], 0]
    p = projection*p
    a = p[0]
    b = p[1]
    c = p[2]
    a /= c
    b /= c
    a = (a+1)*width/2.
    b = (b+1)*height/2.
    return (a, b)
To illustrate that this doesn't produce good results, take the following parameters:
modelview = [[-0.831470, 0.553001, 0.053372, 0.000000],
             [0.000000, 0.096068, -0.995375, 0.000000],
             [-0.555570, -0.827624, -0.079878, 0.000000],
             [-0.000000, -0.772988, -2.898705, 1.000000]]
projection = [[15.763722, 0.000000, 0.000000, 0.000000],
              [0.000000, 15.257052, 0.000000, 0.000000],
              [0.000000, 0.000000, -1.002002, -2.002002],
              [0.000000, 0.000000, -1.000000, 0.000000]]
pos = [0.523355213060808, -0.528964010275341, -0.668054187020413] #I'm working on a unit sphere, so these are more meaningful in spherical coordinates
width = 800
height = 600
With these parameters, to2D gives screen coordinates of (1383, -274).
I don't think the problem is related to OpenGL or Python, but rather to the operations involved in getting from 3D to screen coordinates. What I'm trying to do: when a touch occurs, project a 3D point to 2D screen coordinates.
My idea: take the camera's modelview and projection matrices, the point I'm interested in, and the touch position, and then write a method that gets from the point to the touch position. I get that method by converting this source code for gluProject into Python.
How I've done it:
1. Take all of the mathematical objects into Sage for computational simplicity.
2. My touch position is (150, 114.1).
3. modelview = matrix([[ -0.862734, 0.503319, 0.048577, 0.000000 ],
                       [ 0.000000, 0.096068, -0.995375, 0.000000 ],
                       [ -0.505657, -0.858744, -0.082881, 0.000000 ],
                       [ 0.000000, -0.772988, -2.898705, 1.000000 ]])
4. projection = matrix([[ 15.763722, 0.000000, 0.000000, 0.000000 ],
                        [ 0.000000, 15.257052, 0.000000, 0.000000 ],
                        [ 0.000000, 0.000000, -1.002002, -2.002002 ],
                        [ 0.000000, 0.000000, -1.000000, 0.000000 ]])
5. width = 800.
6. height = 600.
7. v4 = vector(QQ, [0.52324, -0.65021, -0.55086, 1.])
8. p = modelview*v4
9. p = projection*p
10. x = p[0]
11. y = p[1]
12. z = p[2]
13. w = p[3]
14. x /= w
15. y /= w
16. z /= w
17. x = x*0.5 + 0.5
18. y = y*0.5 + 0.5
19. z = z*0.5 + 0.5
20. x = x*width
21. y = y*height #There's no term added because the widget is located at (0, 0)
The result:
x = 15362.18
y = -6251.43
z = 10.14
The revision: Since this is not even close, I went back to steps 8 and 9 and switched the order of multiplication to see what would happen. So now step 8 is p = v4*modelview and step 9 is p = p*projection; in this case, the vectors are row vectors. Another way of looking at this would be p = modelviewTranspose*v4 and p = projectionTranspose*p, where the vectors are column vectors.
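As a quick sanity check of that equivalence (scratch work with numpy, using the numbers from above; it isn't part of the Kivy code):

import numpy as np

modelview = np.array([[-0.862734, 0.503319, 0.048577, 0.000000],
                      [ 0.000000, 0.096068, -0.995375, 0.000000],
                      [-0.505657, -0.858744, -0.082881, 0.000000],
                      [ 0.000000, -0.772988, -2.898705, 1.000000]])
v4 = np.array([0.52324, -0.65021, -0.55086, 1.0])

# A row vector times the matrix equals the transposed matrix times a column vector.
assert np.allclose(v4.dot(modelview), modelview.T.dot(v4))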
The result, part 2:
x = 150.29
y = 196.15
z = 0.6357
Recall that the goal is (150, 114.1). The x coordinate is very good, but the y coordinate is not. So I looked at y*z, which is 124.69. I could live with that answer, although I'm not sure whether looking at y*z is what I should actually be doing.
The first problem is here:

p = modelview*[pos[0], pos[1], pos[2], 0]

When you multiply a vector with a matrix as a 4-component vector, the last component (w) must be 1.0.
Another one is here:

c = p[2]
a /= c
b /= c

Instead of dividing x and y by z, you should divide x, y AND z by w. In your code, w is p[3].
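In Python indexing that divide step would look roughly like this (the numbers here are made up, purely for illustration):

p = [1.2, -0.4, 2.5, 2.0]   # clip-space [x, y, z, w] after both matrix multiplies

w = p[3]          # w is the fourth component, index 3 in Python
a = p[0] / w      # normalized device x
b = p[1] / w      # normalized device y
c = p[2] / w      # normalized device z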
In addition to that: when in doubt, find the source code of gluProject and gluUnProject, tear it apart, and convert it to Python.
As far as I know, when projecting a vector to the screen manually, you're supposed to do the following:
Convert the "position" to a 4-component vector, with the .w component set to one:
v4.x = v3.x
v4.y = v3.y
v4.z = v3.z
v4.w = 1.0
Multiply the 4-component vector by the matrices.
Then divide all components by w:
v4.x /= v4.w
v4.y /= v4.w
v4.z /= v4.w
THEN you'll get coordinates within the -1.0..+1.0 range for x and y (in OpenGL, z also ends up within -1.0..1.0 after the divide; it only gets mapped to the 0.0..1.0 depth range later).
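Putting the whole recipe together, here is a rough sketch in plain Python with numpy (the function name and the use of numpy are my own choices, not anything Kivy provides; note the comment about multiplication order, which depends on how your matrices are laid out, exactly as you found):

import numpy as np

def project_to_screen(pos, width, height, modelview, projection):
    modelview = np.asarray(modelview, dtype=float)
    projection = np.asarray(projection, dtype=float)

    # 4-component position vector; w = 1.0 so the translation part applies.
    v = np.array([pos[0], pos[1], pos[2], 1.0])

    # With matrices laid out like the ones printed in the question (translation
    # in the bottom row), the vector multiplies from the left; with the
    # transposed layout you would use modelview.dot(v) and projection.dot(p).
    p = v.dot(modelview).dot(projection)

    # Perspective divide: x, y and z are all divided by w (index 3).
    x, y, z = p[:3] / p[3]

    # Map normalized device coordinates (-1..1) to window coordinates.
    sx = (x * 0.5 + 0.5) * width
    sy = (y * 0.5 + 0.5) * height
    return sx, sy

Depending on where your windowing/touch system puts its origin (top-left versus OpenGL's bottom-left), you may also need to flip the result, i.e. use height - sy.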
The reason w comes into play is that you can't divide via matrix multiplication, so when you need to divide x/y/z by something, you put that something into the w component, and the division is performed after all the matrix multiplications. w is also what makes translation matrices possible: any vector with w == 0 cannot be translated by a translation matrix; it can only be rotated around the origin and deformed by affine transforms ("origin" meaning the zero point of the coordinate space, (0.0, 0.0, 0.0)).
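To see the translation point concretely, here is a tiny illustration (numpy again, column-vector convention, arbitrary values):

import numpy as np

# Translation by (5, 0, 0) in the column-vector convention.
T = np.array([[1., 0., 0., 5.],
              [0., 1., 0., 0.],
              [0., 0., 1., 0.],
              [0., 0., 0., 1.]])

position  = np.array([1., 2., 3., 1.])   # w == 1: a point, so it gets translated
direction = np.array([1., 2., 3., 0.])   # w == 0: a direction, so it does not

print(T.dot(position))    # [6. 2. 3. 1.]
print(T.dot(direction))   # [1. 2. 3. 0.]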
P.S. Also, I don't know how Python handles integer-to-float conversions, but I'd replace a = (a+1)*width/2 with a = (a+1.0)*width/2.0 to explicitly specify that you're using floating-point numbers here.
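For example, the difference only bites on Python 2, where / between two ints floors the result:

width = 800
print(width / 3)     # Python 2: 266, Python 3: 266.666...
print(width / 3.0)   # 266.666... on both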