Linear Algebra – Decomposing an Affine Transformation

geometrylinear algebra

An affine transformation is composed of rotations, translations, scaling and shearing.

In 2D, such a transformation can be represented using an augmented matrix by
$$
\begin{bmatrix} \vec{y} \\ 1 \end{bmatrix} =\begin{bmatrix} A & \vec{b} \ \\ 0, \ldots, 0 & 1 \end{bmatrix} \begin{bmatrix} \vec{x} \\ 1 \end{bmatrix}
$$

vector b represents the translation. Bu how can I decompose A into rotation, scaling and shearing?
I am trying to "interpolate" an affine transform so that, if I have points in a frame 1 and in a frame 3 such that a transform $T$ takes points from frame 1 to points in frame 3, then the interpolated transform would take my points from frame 1 to frame 2. If, for example, an 60° rotation is needed to go from frame 1 to frame 3, then a 30° one would be used to go from frame 1 to frame 2.

Thank you for your help.

Best Answer

I am trying to "interpolate" an affine transform […]

I'm suggesting a different approach, which should be well suited to interpolation, although it does not depend on splitting the transformation into separate elementary operations the way your question suggests. Instead, I'd use fractional powers of a matrix, as I'll describe now.

Suppose $A$ is diagonalizable (which will be the case for most transformations), then there exists an orthogonal matrix $P$ and a diagonal matrix $D$ such that $A=P\,D\,P^{-1}$. The entries of $D$ are the eigenvalues of $A$, which I'll call $\lambda_1$ and $\lambda_2$. Now you can define $A$ raised to the $t$-th power like this:

$$ A^t = P\,D^t\,P^{-1} = P\,\begin{pmatrix}\lambda_1^t&0\\0&\lambda_2^t\end{pmatrix}\,P^{-1} $$

Now if you change $t$ continuously from $0$ to $1$, the matrix $A^t$ will change from identity to the matrix $A$. So this is your interpolation.

One thing you have to be careful about is the fact that the $\lambda_k$ will very likely be a conjugate pair of complex number. You can express them as $\lambda_k=e^{z_k}$, where $z_k=\log\lambda_k$, but the logarithm of a number is only defined up to multiples of $2\pi i$. So in this case, you should make sure that the imaginary part doesn't become too big, namely you want $-\pi\le\operatorname{Im}(z_k)\le\pi$. This ensures that the interpolation will not take more turns for a rotation than actually required. Furthermore, you should maintain $z_2=\bar{z_1}$ to make sure that the interpolating matrices will be real as well. With this choice, $\lambda_k^t=e^{t\cdot z_k}$ is well defined and behaves as you described for the case of rotation.

I've created a proof of concept implementation which you can use to experiment, in order to decide whether this is what you want. Here is a snapshot of the kind of interpolation this will create:

Interpolation snapshot

With that experiment, I realized that one should include the translative part into the matrix as well, in just the way you composed a matrix including $A$ and $b$ in your question. Otherwise, the positions of the interpolated frames will depend on the location of the defining triples in the plane, which I consider undesirable.

Related Question