I am currently working on a camera pose estimation project using only one marker with ARUCO.
I used Aruco's Marker Detector to detect markers and get the marker's Rvec and Tvec. I understand these two vectors represent the transform from the marker to the camera, which is the marker's pose w.r.t camera. I form a 4 by 4 matrix called T_marker_camera using these two vectors.
Then, I set up a world frame (left handed) and get the marker's world pose, which is a 4 by 4 transform matrix.
I want to calculate the pose of the camera w.r.t the world frame, and I use the following formula to calculate it: T_camera_world = T_marker_world * T_marker_camera_inv
Before I perform the above formula, I convert the OpenCV coordinates to the left handed one (flip the sign of x axis).
However, I didn't get the correct x, y, z of the camera w.r.t the world frame.
What did I miss to get the correct answer?
Thanks
The one equation you gave looks right, so the issue is probably somewhere that you didn't show/describe.
A fix in your notation will help clarify.
Write the pose/source frame on the right (input), the reference/destination frame on the left (output). Then your matrices "match up" like dominos.
rvec and tvec yield a matrix that should be called T_cam_marker.
If you want the pose of your camera in the world frame, that is
T_world_cam = T_world_marker * T_marker_cam
T_world_cam = T_world_marker * inv(T_cam_marker)
(equivalent to what you wrote, but domino)
Be sure that you do matrix multiplication, not element-wise multiplication.
To move between left-handed and right-handed coordinate systems, insert a matrix that maps coordinates accordingly. Frames:
The hand-change matrix for typical frames on screens is an identity but with the entry for Y being a -1
. I don't know why you would flip the X but that's "equivalent", ignoring any rotations.