I have a set of points that are effectively 3 vertices of a 45-45-90 right triangle and some other points, a
, that should map to them.
import numpy as np
points = np.array([
( 90, 416),
(398, 390),
(374, 84)
])
a = np.array([
(0, 1), # maps to (90, 416)
(1, 1), # maps to (398, 390)
(1, 0) # maps to (374, 84)
])
I want to find the similarity transformation that properly maps a
to points
.
from skimage import transform
# transformation that makes sense to me
T1 = transform.estimate_transform(
ttype="similarity",
src=a,
dst=points
)
# invert the rotation for no reason
# other than to show that it works
T2 = transform.SimilarityTransform(
scale=T1.scale,
rotation=-T1.rotation,
translation=T1.translation
)
# apply transformations via matrix multiplication
a_T1 = a @ T1.params[:2, :2] + T1.params[:2, 2]
a_T2 = a @ T2.params[:2, :2] + T2.params[:2, 2]
Why is it that T2
(where I just inverted the rotation for no real reason other than I eventually found out that it works) yields the better mapping? Or am I making a dumb mistake in my implementation?
It's not you. There's lots of different conventions around transformation matrices and it just so happens that
(M @ x.T).T
as the conventionT2.params[:2, :2]
is equal to T1.params[:2, :2].T
(This is not generally true, if you have skew.)But, I personally am always confused by the conventions here myself. To figure it out I had to know that skimage transforms have a __call__
method, so you can just apply them with T1(a)
, which gives the correct results. So now it was just a matter of looking at the source code and seeing that it calls _apply_mat
, which looks like this:
def _apply_mat(self, coords, matrix):
ndim = matrix.shape[0] - 1
coords = np.array(coords, copy=False, ndmin=2)
src = np.concatenate([coords, np.ones((coords.shape[0], 1))], axis=1)
dst = src @ matrix.T
# below, we will divide by the last dimension of the homogeneous
# coordinate matrix. In order to avoid division by zero,
# we replace exact zeros in this column with a very small number.
dst[dst[:, ndim] == 0, ndim] = np.finfo(float).eps
# rescale to homogeneous coordinates
dst[:, :ndim] /= dst[:, ndim : ndim + 1]
return dst[:, :ndim]
For your transform (which is affine), this is essentially equivalent to (T1.params[:2, :2] @ a.T).T + T1.params[:2, 2]
, which is equal to a @ T1.params[:2, :2].T + T1.params[:2, 2]
, again, the transpose of what you were computing and equal to a @ T2.params[:2, :2] + T2.params[:2, 2]
.
Even though it's confusing to have multiple conventions out there, I hope this clarifies why things were looking wonky!