
Accurate (and fast) angle matching


For a hobby project I'm attempting to align photos and create 3D pictures. I basically have two cameras on a rig that I use to take pictures. I then automatically attempt to align the images in such a way that you get a 3D SBS (side-by-side) image.

They are high-resolution images, which means a lot of pixels to process. Because I'm not really patient with computers, I want things to go fast.

Originally I worked with code based on image stitching and feature extraction. In practice I found these algorithms too inaccurate and too slow. The main reason is that the scene has different levels of depth, so you cannot do a 1-on-1 match of features. Most of the code already works fine, including vertical alignment.

For this question, you can assume that differences in ISO exposure levels / color correction and vertical alignment of the images are both taken care of.

What is still missing is a good algorithm for correcting the angle of the pictures. I noticed that the left and right pictures usually differ in angle by a small amount (think +/- 1.2 degrees), which is enough to give you a slight headache. As a human you can easily spot this by looking at sharp color differences and lining them up.

The irony here is that as a human you immediately spot whether it's correct or not, but somehow I'm not able to teach this to a machine. :-)

I've experimented with edge detectors, the Hough transform and a large variety of home-brew algorithms, but so far I've found all of them to be both too slow and too inaccurate for my purposes. I've also attempted to iteratively align vertically while changing the angle slightly, so far without any luck.

Please note: Accuracy is perhaps more important than speed here.


I've added an example image here. It's actually both the left and right eye views, alpha-blended. If you look closely, you can see the lamp at the top having two ellipses, and you can see how the chairs don't exactly line up at the top. It might seem negligible, but at full screen resolution on a projector you will easily see the difference. This also shows the level of accuracy that is required; it's quite a lot.

[Example image - blended]

The shift in the 'x' direction gives the 3D effect. Basically, if the shift is 0, the object appears on the screen; if it's <0 it's behind the screen, and if it's >0 it's in front of the screen. This also makes matching harder, since you're not looking for a 'stitch'.

Basically the two cameras 'look' in the same direction (perpendicular, as in the second picture here: http://www.triplespark.net/render/stereo/create.html ).

The difference originates from the cameras being at slightly different angles. This means the rotation is uniform throughout the picture.


Solution

  • I once used the following amateur approach.

    Assume that the second image has a rotation + vertical shift mismatch. This means that we need to apply some transform to the second image, which can be expressed in matrix form as

    x' = a*x + b*y + c
    y' = d*x + e*y + f
    

    that is, every pixel that has coordinates (x,y) on the second image should be moved to position (x',y') to compensate for the rotation and vertical shift.

    We have the strict requirements a=e, b=-d and d*d+e*e=1 so that it is indeed a rotation + shift, with no zoom, slanting etc. This notation also allows for a horizontal shift, but that is easy to fix after the angle + vertical shift correction.
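
    For intuition, here is a tiny Python sketch (my own illustration, using the +/- 1.2 degree mismatch from the question as the example angle) of how these coefficients relate to the rotation angle:

    import math

    theta = math.radians(1.2)           # example mismatch angle from the question
    d, e = math.sin(theta), math.cos(theta)
    a, b = e, -d                        # a=e, b=-d; d*d + e*e == 1 by construction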

    Now select several common features on both images (I did the selection by hand, as just 5-10 seemed enough; you could try to apply some automatic feature detection mechanism). Assume the i-th feature has coordinates (x1[i], y1[i]) on the first image and (x2[i], y2[i]) on the second. We expect that after our transformation the features have y-coordinates that are as equal as possible, that is, we want (ideally)

    y1[i]=y2'[i]=d*x2[i]+e*y2[i]+f
    

    Having enough (>=3) features, we can determine d, e and f from this requirement. In fact, if you have more than 3 features, you will most probably not be able to find a common d, e and f that fits them all exactly, but you can apply the least-squares method to find the d, e and f that make y2' as close to y1 as possible. You can also enforce the constraint d*d+e*e=1 while finding d, e and f, though as far as I remember, I got acceptable results even without it.
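
    A minimal NumPy sketch of this least-squares step (the function name and the final renormalization of (d, e) onto the unit circle are my own choices, not a fixed recipe):

    import numpy as np

    def fit_rotation_vshift(x2, y2, y1):
        """Least-squares fit of d, e, f in y1 ~ d*x2 + e*y2 + f."""
        A = np.column_stack([x2, y2, np.ones_like(x2)])
        (d, e, f), *_ = np.linalg.lstsq(A, y1, rcond=None)
        # Optionally project (d, e) back onto the unit circle so that
        # d*d + e*e == 1 and the transform is a pure rotation.
        n = np.hypot(d, e)
        return d / n, e / n, f

    With only 5-10 features this solve is instantaneous, so speed is not a concern in this step.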

    After you have determined d, e and f, the requirements a=e and b=-d give you a and b. This leaves only c unknown, which is the horizontal shift. If you know what the horizontal shift should be, you can find c from there. I used the background (clouds on a landscape, for example) to get c.

    When you know all the parameters, you can correct the image in one pass. You might also want to apply some anti-aliasing, but that's a different question.
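
    As a sketch of that single pass, assuming OpenCV is available (cv2.warpAffine takes the forward 2x3 matrix and resamples with bilinear interpolation, which already smooths the result somewhat):

    import cv2
    import numpy as np

    def correct_second_image(img2, d, e, f, c):
        """Apply x' = e*x - d*y + c, y' = d*x + e*y + f in one pass."""
        M = np.float32([[e, -d, c],
                        [d,  e, f]])
        h, w = img2.shape[:2]
        return cv2.warpAffine(img2, M, (w, h), flags=cv2.INTER_LINEAR)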

    Note also that you can similarly introduce quadratic terms into the formulas to account for additional distortions the camera usually has.


    However, that's just a simple algorithm I came up with when I faced the same problem some time ago. I did not do much research, so I'll be glad to hear if there is a better or well-established approach, or even ready-made software.
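
    For completeness, a hypothetical end-to-end use of the two sketches above; the filename and feature coordinates are made-up placeholders, and c is estimated here from all features as a stand-in for the background features mentioned above:

    import cv2
    import numpy as np

    img2 = cv2.imread("right.jpg")                  # placeholder filename
    x1 = np.array([120.0, 840.0, 400.0, 1500.0])    # features, first image
    y1 = np.array([ 80.0, 410.0, 220.0,  615.0])
    x2 = np.array([118.5, 842.0, 398.0, 1503.0])    # same features, second image
    y2 = np.array([ 95.0, 398.0, 228.0,  601.0])

    d, e, f = fit_rotation_vshift(x2, y2, y1)
    c = np.mean(x1 - (e * x2 - d * y2))             # horizontal shift estimate
    corrected = correct_second_image(img2, d, e, f, c)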