I have two sets of data associated with a video sequence. One contains relative rotation and translation data generated by an algorithm. The other consists of ground-truth extrinsic matrices, one for each frame.
I would like to compare the two datasets to determine the disparity between them. My question is: how can I derive the relative translation and rotation from two extrinsic camera matrices?
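If you have camera1 pose P1 = [R1|T1] and camera2 pose P2 = [R2|T2], then P1to2 = P2 * P1^-1, where each pose is treated as a 4x4 homogeneous matrix (append the row [0 0 0 1] to the 3x4 matrix) so that it can be inverted.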
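As a rough sketch of the formula above, the following NumPy snippet computes P1to2 from two extrinsic matrices. It assumes both matrices use the same convention and are given as 3x4 [R|T] or 4x4 arrays. The helper names to_homogeneous and relative_pose are made up for illustration and are not part of any library.

```python
import numpy as np

def to_homogeneous(P):
    """Promote a 3x4 [R|T] matrix to 4x4 by appending the row [0, 0, 0, 1]."""
    P = np.asarray(P, dtype=float)
    if P.shape == (4, 4):
        return P
    return np.vstack([P, [0.0, 0.0, 0.0, 1.0]])

def relative_pose(P1, P2):
    """Return (R, t) taken from P1to2 = P2 * P1^-1 (hypothetical helper)."""
    P1to2 = to_homogeneous(P2) @ np.linalg.inv(to_homogeneous(P1))
    # Upper-left 3x3 block is the relative rotation,
    # the first three entries of the last column are the relative translation.
    return P1to2[:3, :3], P1to2[:3, 3]
```

Written out, this product gives R = R2 * R1^-1 and t = T2 - R2 * R1^-1 * T1, which are the quantities to compare against the relative rotation and translation produced by the other algorithm, provided it uses the same convention.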
Intuitively, imagine a simple case in which both cameras have zero translation, camera1 has a rotation of +30 degrees about the X axis, and camera2 has a rotation of +60 degrees about the X axis.
P1 = [R1|0], P2 = [R2|0]
So they differ by a rotation of +30 degrees about the X axis:
P1to2 = P2 * P1^-1 = [R2|0] * [R1|0]^-1 = [R2|0] * [R1^-1|0]
R2 * R1^-1 = rotation of (60 - 30) degrees about X = rotation of +30 degrees about X
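The example above can be checked numerically. This is only a sketch: rot_x is a hypothetical helper that builds a rotation about the X axis, and the angle is recovered from the trace of the relative rotation matrix.

```python
import numpy as np

def rot_x(deg):
    """Rotation matrix for a rotation of `deg` degrees about the X axis."""
    a = np.radians(deg)
    c, s = np.cos(a), np.sin(a)
    return np.array([[1.0, 0.0, 0.0],
                     [0.0, c, -s],
                     [0.0, s, c]])

R1 = rot_x(30.0)
R2 = rot_x(60.0)

# For a rotation matrix the inverse equals the transpose.
R_rel = R2 @ R1.T

# Rotation angle from the trace: trace(R) = 1 + 2*cos(angle).
cos_angle = np.clip((np.trace(R_rel) - 1.0) / 2.0, -1.0, 1.0)
angle_deg = np.degrees(np.arccos(cos_angle))
print(round(angle_deg, 6))
```

The same trace-based angle, applied to the rotation that maps the estimated relative rotation onto the ground-truth one, is one possible way to express the rotational disparity between the two datasets as a single number.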