Given two consecutive 3D point clouds 1 and 2 (not the whole clouds; say, 100 points selected from each with OpenCV's GoodFeaturesToTrack), obtained from a Kinect depth map, I want to compute the camera's homography from 1 to 2. I understand that this is a projective transform, and it has already been done by many people: here (slide 12), here (slide 30) and here in what seems to be the classic paper. My problem is that whilst I'm a competent programmer, I haven't got the math or trig skills to turn one of those methods into code. As this is not an easy problem, I'm offering a large bounty for code that solves the following problem:
The camera is at the origin, looking in the Z direction, at irregular pentahedron [A,B,C,D,E,F]:
The camera moves -90mm to the left (X), +60mm up (Y), +50mm forwards (Z) and rotates 5° down, 10° right and -3° anticlockwise:
Rotating the entire scene so that the camera is back at its original position allows me to determine the vertices' locations at position 2:
The 3DS Max files used to prepare this are max 1, max 2 and max 3.
Here are the vertices' positions before and after, the intrinsics, etc.:
Note that camera 2's vertices are not 100% accurate; there's a bit of deliberate noise.
Here are the numbers in an Excel file.
The code I need, which must be readily translatable into VB.Net or C# (using EmguCV and OpenCV where necessary), takes the two sets of vertices and the intrinsics and produces this output:
Camera 2 is at -90 X, +60 Y, +50 Z rotated -5 Y, 10 X, -3 Z.
The homography matrix to transform points from 1 to 2 is:
a1, a2, a3
b1, b2, b3
c1, c2, c3
I don't know whether the homography is 3×3 or 3×4 for homogeneous coordinates, but it must allow me to map the vertices from 1 to 2.
I also don't know the values a1, a2, etc.; that's what you have to find >;-)
The 500 bounty offer 'replaces' the bounty I offered on this very similar question; I've added a comment there pointing to this one.
EDIT2: I'm wondering if the way I'm asking this question is misleading. It seems to me that the problem is more one of point-cloud fitting than of camera geometry (if you know how to translate and rotate cloud 1 onto cloud 2, you know the camera transform, and vice versa). If so, then perhaps the solution could be obtained with Kabsch's algorithm or something similar.
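For illustration, here's a minimal sketch of the Kabsch approach in Python/NumPy rather than the VB.Net/C# I'm after (the steps should translate directly); the demo rotation, translation and point cloud are made-up stand-ins, not the pentahedron data above:

```python
import numpy as np

def kabsch(P, Q):
    """Best-fit rigid transform (R, t) with R @ p + t ~ q, via SVD (Kabsch)."""
    cP, cQ = P.mean(axis=0), Q.mean(axis=0)          # centroids
    U, _, Vt = np.linalg.svd((P - cP).T @ (Q - cQ))  # 3x3 covariance -> SVD
    d = np.sign(np.linalg.det(Vt.T @ U.T))           # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    return R, cQ - R @ cP

# Demo with made-up stand-ins for the six vertices (not the real data):
rng = np.random.default_rng(0)
P = rng.normal(scale=100.0, size=(6, 3))             # cloud at position 1
a = np.deg2rad(10.0)                                 # made-up 10 deg rotation
R_true = np.array([[ np.cos(a), 0.0, np.sin(a)],
                   [ 0.0,       1.0, 0.0      ],
                   [-np.sin(a), 0.0, np.cos(a)]])
t_true = np.array([-90.0, 60.0, 50.0])               # mm, as in the question
Q = P @ R_true.T + t_true + rng.normal(scale=0.1, size=P.shape)  # noisy cloud 2
R, t = kabsch(P, Q)
M = np.hstack([R, t.reshape(3, 1)])                  # 3x4 [R|t] mapping 1 -> 2
print(np.round(M, 3))
```

The printed 3×4 [R|t] maps cloud 1 onto cloud 2; note that the camera's own motion is the inverse of this point transform.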
"The correct" algorithm to use for computing difference between two snapshots of 2D or 3D point clouds is called ICP (Iterative Closest Point). The algorithm solves
In human-readable-format: For given point sets P1 and P2 find the rotation matrix R and translation T that transforms P1 to P2. Just make sure they are normalized around their origin.
The algorithm is conceptually simple and is commonly used in real time. It iteratively revises the transformation (translation and rotation) needed to minimize the distance between the points of the two raw scans.
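Here's a minimal sketch of that loop in Python/NumPy, not production code: it uses brute-force nearest-neighbour search where a real implementation would use a k-d tree plus outlier rejection, and the inner best_fit is the same closed-form Kabsch step mentioned in the question:

```python
import numpy as np

def best_fit(P, Q):
    """Closed-form best-fit (R, T) for already-matched points (Kabsch step)."""
    cP, cQ = P.mean(axis=0), Q.mean(axis=0)
    U, _, Vt = np.linalg.svd((P - cP).T @ (Q - cQ))
    d = np.sign(np.linalg.det(Vt.T @ U.T))       # avoid reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    return R, cQ - R @ cP

def icp(P, Q, iters=50, tol=1e-8):
    """Alternate nearest-neighbour matching with a closed-form re-fit
    until the mean squared distance stops improving."""
    R, T = np.eye(3), np.zeros(3)
    prev = np.inf
    for _ in range(iters):
        moved = P @ R.T + T
        # Brute-force nearest neighbour in Q for every moved P point.
        d2 = ((moved[:, None, :] - Q[None, :, :]) ** 2).sum(axis=2)
        nn = d2.argmin(axis=1)
        err = d2[np.arange(len(P)), nn].mean()
        if prev - err < tol:
            break
        prev = err
        R, T = best_fit(P, Q[nn])                # re-solve on the matched pairs
    return R, T
```

Note that in your example the A..F correspondences are already known, so the single closed-form solve is enough; the iterative matching only earns its keep when the two clouds are unlabelled, as with GoodFeaturesToTrack output.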
For those interested, this is a topic within Computational Geometry Processing.