I'm sorry if this has been asked before, but I couldn't find a proper answer to my question.
For a better understanding, let me briefly explain the context of my problem.
Context
I have two images (A and B) of a non-planar object. I would like to take the coordinates of a pixel pA in A and project it into B. Since my scene is not planar, I can't use a homography. What I want to do is first project my pixel pA into the 3D world and then project the result into image B to get pB: pA (2D) -> pWorld (3D) -> pB (2D). Fortunately, I know the Z coordinate of pWorld. My question concerns the first step, pA (2D) -> pWorld (3D).
Question
How do I project my 2D point pA = (u,v) into the world (pWorld = (X,Y,Z)), Z being given? I also have the extrinsic matrix Rt (3x4) and the intrinsic matrix K (3x3) of my camera.
What I tried
I know that :
s*(u v 1)' = K * Rt * (X Y Z 1)' [1]
s is the scale. But I would like to have the inverse process, Z being given. Something like:
(X Y) = SOMETHING * (u v)
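For concreteness, here is a minimal NumPy sketch of the forward model [1]. The camera values (focal length 800 px, principal point (320, 240), no rotation, translation of 5 along the optical axis) and the function name project are made up for illustration; they are not my real calibration.

```python
import numpy as np

def project(K, Rt, p_world):
    """Apply s*(u v 1)' = K*Rt*(X Y Z 1)' and return the pixel (u, v) and the scale s."""
    p_h = np.append(np.asarray(p_world, dtype=float), 1.0)  # homogeneous (X Y Z 1)'
    su, sv, s = K @ Rt @ p_h                                # the vector s*(u v 1)'
    return np.array([su / s, sv / s]), s

# Hypothetical camera, for illustration only.
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
Rt = np.hstack([np.eye(3), [[0.0], [0.0], [5.0]]])  # 3x4 matrix [R | t]

uv, s = project(K, Rt, [1.0, -0.5, 3.0])
print(uv, s)  # pixel (420, 190) with scale s = 8
```

The scale s is simply the third component of K*Rt*(X Y Z 1)', so it depends on the unknown point itself and is not a free constant.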
I can rewrite [1] to get
s*(u v 1/s 1/s)' = G * (X Y Z 1)'
with G the 4x4 matrix whose rows are l1, l2, l3 and l4:
l1 = first row of (K*Rt)
l2 = second row of (K*Rt)
l3 = 0 0 1/Z 0
l4 = 0 0 0 1
G is invertible and I can then have
(X Y Z 1)' = inv(G) * (us vs 1 1)'
But I can't use that since I don't know the scale s. I think I'm a bit confused about this scale. I know we usually normalize to get rid of it, but here I can't.
Maybe that's not the right way to proceed. If someone can explain the right way to me, I would be really glad to hear about it.
Thank you in advance.
I found a solution but it is damn ugly.
Let's consider the 3x4 matrix M:
M = K*Rt = (mij), 1<=i<=3, 1<=j<=4
For simplification, let's also consider the coefficients A and B:
A = (m12-m32*u)/(m22-m32*v)
B = (m31*u-m11)/(m31*v-m21)
The notation explained, let's move on to the system. As I said, the system is:
s*(u v 1)' = M*(X Y Z 1)'
We have 3 equations and 3 unknowns : s, X and Y. We can notice that:
s = m31*X + m32*Y + m33*Z + m34
Note that if you want to project into the camera coordinate system rather than the world coordinate system (similar to a case where there is no rotation and no translation), you have s = Z, which is a much easier system to solve (see for example the question "To calculate world coordinates from screen coordinates with OpenCV").
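With this expression for s, we can substitute it into the first two equations and reduce the original system to a system of 2 equations with 2 unknowns (X and Y):
(m11-m31*u)*X + (m12-m32*u)*Y = (m33*u-m13)*Z + m34*u-m14
(m21-m31*v)*X + (m22-m32*v)*Y = (m33*v-m23)*Z + m34*v-m24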
Then, after some calculations, we finally get:
X = [Z*((m23-m33*v)*A-m13+m33*u) + (m24-m34*v)*A-m14+m34*u ] / [A*(m31*v-m21)-m31*u+m11]
Y = [Z*((m13-m33*u)-B*(m23-m33*v)) + m14-m34*u-B*(m24-m34*v)] / [B*(m22-m32*v)-m12+m32*u]
These are the expressions of X and Y as functions of u, v and Z. I tested them in my project and they worked.
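As a sanity check, here is a small pure-Python sketch of these expressions with a round trip through the forward model. The matrix M below is made up for the test, and the function names are only illustrative.

```python
def project(M, X, Y, Z):
    """Forward model: pixel (u, v) of the world point (X, Y, Z) under the 3x4 matrix M."""
    su, sv, s = (r[0] * X + r[1] * Y + r[2] * Z + r[3] for r in M)
    return su / s, sv / s


def backproject_closed_form(M, u, v, Z):
    """X and Y from pixel (u, v) and known Z, using the closed-form expressions.

    M is K*Rt given as three rows of four numbers.
    """
    (m11, m12, m13, m14), (m21, m22, m23, m24), (m31, m32, m33, m34) = M
    A = (m12 - m32 * u) / (m22 - m32 * v)
    B = (m31 * u - m11) / (m31 * v - m21)
    X = (Z * ((m23 - m33 * v) * A - m13 + m33 * u)
         + (m24 - m34 * v) * A - m14 + m34 * u) / (A * (m31 * v - m21) - m31 * u + m11)
    Y = (Z * ((m13 - m33 * u) - B * (m23 - m33 * v))
         + m14 - m34 * u - B * (m24 - m34 * v)) / (B * (m22 - m32 * v) - m12 + m32 * u)
    return X, Y


# Made-up projection matrix, chosen so that no denominator vanishes.
M = [[800.0, 50.0, 300.0, 100.0],
     [-30.0, 790.0, 250.0, 200.0],
     [0.1, 0.2, 1.0, 5.0]]

u, v = project(M, 1.0, 2.0, 3.0)
X, Y = backproject_closed_form(M, u, v, 3.0)
print(X, Y)  # close to 1.0 2.0, up to floating-point rounding
```

One caveat: A and B divide by (m22-m32*v) and (m31*v-m21). The second one is zero whenever m21 = m31 = 0, for example with a camera that has no rotation and no skew, so these expressions cannot be used as-is in that case.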
Don't know if there is a cleaner way to compute that (with Matrix and stuff), but that's all I could come up with for now.
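One possibly cleaner route, as a sketch: treat s as a third unknown next to X and Y, and move the known terms to the right-hand side. Writing c1..c4 for the columns of M, the system s*(u v 1)' = M*(X Y Z 1)' becomes the 3x3 linear system (c1 c2 -(u v 1)') * (X Y s)' = -(Z*c3 + c4), which NumPy can solve directly without the coefficients A and B. The cameras and the name backproject_world below are made up for illustration.

```python
import numpy as np

def backproject_world(M, u, v, Z):
    """Solve s*(u v 1)' = M*(X Y Z 1)' for X, Y and s, with Z known."""
    M = np.asarray(M, dtype=float)
    # Unknowns are (X, Y, s): columns c1 and c2 of M, and -(u v 1)' for the scale.
    lhs = np.column_stack([M[:, 0], M[:, 1], [-u, -v, -1.0]])
    # Known terms Z*c3 + c4, moved to the right-hand side.
    rhs = -(Z * M[:, 2] + M[:, 3])
    X, Y, s = np.linalg.solve(lhs, rhs)
    return np.array([X, Y, Z]), s

# Hypothetical cameras A and B sharing the same intrinsics (illustration only).
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
Rt_A = np.hstack([np.eye(3), [[0.0], [0.0], [5.0]]])
Rt_B = np.hstack([np.eye(3), [[-1.0], [0.0], [5.0]]])

p_world, s = backproject_world(K @ Rt_A, 420.0, 190.0, 3.0)  # pA -> pWorld
su, sv, s_B = K @ Rt_B @ np.append(p_world, 1.0)             # pWorld -> pB
p_B = np.array([su / s_B, sv / s_B])
print(p_world, p_B)  # pWorld = (1, -0.5, 3), pB = (320, 190)
```

This version also works for the unrotated camera used here, where m21 = m31 = 0. np.linalg.solve raises LinAlgError when the 3x3 matrix is singular, which I would expect only for degenerate configurations.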