I built a stereoscopic camera mobile app that performs automatic alignment using findTransformECC, and the app works pretty well with it. I know I should probably be using stereoRectifyUncalibrated preceded by keypoint detection and descriptor matching, but I get bad results from that despite trying many different approaches, and I'm super frustrated. So I'm sticking with findTransformECC, at least for now. At the moment I'm using MotionType.Euclidean (restricted to translation and rotation), but I would like to change that.
So far, the app has worked by having the user take one picture and then move to the side to capture the next (the cha-cha method). Now I'm adding the ability to have two phones capture simultaneously. The problem is that the focal length and sensor size (and therefore the angular field of view) may differ between the two cameras, so to align the two pictures I need to allow scaling/zooming. With findTransformECC, however, the only step up from Euclidean is Affine; there seems to be nothing in between. That is, it seems I cannot allow scaling without also allowing shearing, and I don't want shearing.
To put it another way, I'd like to get the type of transform that estimateRigidTransform(array, array, FALSE) returns (partial affine: translation, rotation and uniform scaling only). But rather than using keypoints as that function does, I want to use findTransformECC, because in my experiments it has been more reliable.
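To make "partial affine" concrete: it is a 2x3 matrix whose left 2x2 block has the form [[a, -b], [b, a]], which allows rotation and uniform scale but no shear. The Python sketch below shows that structure by projecting a full affine matrix onto the nearest matrix of that form (in the least-squares sense). The function name `project_to_similarity` is hypothetical, and this is only a post-hoc approximation applied to an affine result. It is not the same as constraining the ECC optimization itself.

```python
import numpy as np

def project_to_similarity(affine):
    """Project a 2x3 affine matrix onto the closest rotation + uniform-scale +
    translation matrix (least-squares on the 2x2 block). Hypothetical helper."""
    affine = np.asarray(affine, dtype=float)
    # A shear-free 2x2 block looks like [[a, -b], [b, a]].
    a = 0.5 * (affine[0, 0] + affine[1, 1])
    b = 0.5 * (affine[1, 0] - affine[0, 1])
    similarity = np.array([
        [a, -b, affine[0, 2]],   # translation is carried over unchanged
        [b,  a, affine[1, 2]],
    ])
    scale = float(np.hypot(a, b))                    # uniform zoom factor
    angle_deg = float(np.degrees(np.arctan2(b, a)))  # rotation angle
    return similarity, scale, angle_deg
```

A matrix that is already shear-free passes through unchanged; one with shear loses only the shear component of its 2x2 block.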
(https://github.com/KRA2008/crosscam/blob/develop/AutoAlignment/OpenCV.cs is the auto-alignment code if that helps at all)
Take a look at the Fourier-Mellin transform based approach: https://github.com/Smorodov/LogPolarFFTTemplateMatcher
It will give you offset, scale and rotation parameters, nothing more.
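For a rough idea of why this yields only those parameters: when the FFT magnitude spectra of the two images are resampled onto a log-polar grid, rotation and uniform scale turn into plain shifts along the two axes, and phase correlation can then recover those shifts. The NumPy-only sketch below covers the phase-correlation step and the conversion from shift to rotation and scale. It is not the linked repository's code. The function names are hypothetical, the log-polar resampling itself is left out, and the conversion assumes a grid laid out the way OpenCV's warpPolar does in log mode (rows span 360 degrees, columns span log radius up to `max_radius`). The sign of the result depends on which image is treated as the reference.

```python
import numpy as np

def phase_correlate(a, b):
    """Return (dy, dx) such that b is roughly a circularly shifted by (dy, dx)."""
    fa = np.fft.fft2(a)
    fb = np.fft.fft2(b)
    cross = np.conj(fa) * fb
    cross /= np.abs(cross) + 1e-12          # keep phase only; avoid divide-by-zero
    corr = np.fft.ifft2(cross).real         # ideally a single sharp peak
    peak = np.unravel_index(np.argmax(corr), corr.shape)
    # Peaks past the midpoint correspond to negative shifts (wrap-around).
    dy, dx = (int(p) - n if p > n // 2 else int(p)
              for p, n in zip(peak, corr.shape))
    return dy, dx

def logpolar_shift_to_rotation_scale(dy, dx, height, width, max_radius):
    """Convert a shift measured on a log-polar grid into rotation and scale.
    Assumes rows cover 360 degrees and columns cover log(r) up to max_radius."""
    rotation_deg = 360.0 * dy / height
    scale = float(np.exp(dx * np.log(max_radius) / width))
    return rotation_deg, scale
```

Once rotation and scale are known, one image can be warped to undo them, and a second phase correlation on the images themselves gives the remaining offset.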