[Math] Degrees of Freedom in Affine Transformation and Homogeneous Transformation

affine-geometry geometry linear-transformations projective-geometry transformation

I understand that a 2D Affine Transformation has 6 DOF and a 2D Homogeneous Transformation has 8 DOF. However, how can I identify what those independent parameters are?

If we consider Euclidean Transformation, it has 3 DOF: rotation, translation in x and translation in y.
$$ \begin{bmatrix} C_\theta & -S_\theta & t_x \\ S_\theta & C_\theta & t_y \\ 0 & 0 & 1 \\ \end{bmatrix} $$
If we consider Similarity transform, it has 4 DOF: rotation, translation in x, translation in y and scaling.
$$ \begin{bmatrix} sC_\theta & -sS_\theta & t_x \\ sS_\theta & sC_\theta & t_y \\ 0 & 0 & 1 \\ \end{bmatrix} $$

1) Similarly, what makes up the 6 DOF of Affine matrix and 8 DOF of Homogeneous matrix?

2) Unlike the Euclidean and Similarity Transformation, is there no fixed set of DOF?

3) Can they be any six (if we take Affine as example) of rotation, translation (in x, y), scaling (in x, y), shearing, reflection etc. depending on the application?

4) If so, given an Affine matrix, can we know what the DOF are without knowledge of application?

Link1 says Affine transformation is a combination of translation, rotation, scale, aspect ratio and shear.
Link2 says it consists of 2 rotations, 2 scalings and translations (in x, y).
Link3 indicates that it can be a combination of various different transformations.

I am a little confused about the whole idea. Thanks in advance.

Best Answer

I am not an expert and have just started thinking about this myself. I am intrigued by how many different ways there are to think about transforms / degrees of freedom.

I think the simplest way to see that an Affine transform has 6 degrees of freedom is that there are 6 variables in the matrix:

$$ \begin{bmatrix} m_{00} & m_{01} & m_{02} \\ m_{10} & m_{11} & m_{12} \\ 0 & 0 & 1 \\ \end{bmatrix} $$

No matter what value we choose for any of those variables, it is a valid Affine transform. Although the Similarity transform can also be written as a matrix with these 6 variables, it is more constrained: if we picked 4 of the variables at random, we would have to choose the other 2 carefully for it to be a valid Similarity transform. So it has fewer degrees of freedom even though it can still be written as a matrix with 6 variables. Similarly, we can use an Affine transform to describe a simple translation, as long as we set the four left numbers to be the identity matrix, and only change the two translation variables.
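Here is a minimal sketch of that constraint, assuming the parameterisation from the question (scale `s`, angle `theta`, translation `tx`, `ty`). The helper names `similarity_matrix` and `is_similarity` are my own, made up for illustration. The check only covers similarities that do not mirror the plane.

```python
import math

def similarity_matrix(s, theta, tx, ty):
    """Build the 3x3 similarity matrix from its 4 parameters."""
    c, sn = math.cos(theta), math.sin(theta)
    return [[s * c, -s * sn, tx],
            [s * sn, s * c, ty],
            [0.0, 0.0, 1.0]]

def is_similarity(m, tol=1e-9):
    """Two constraints tie the six entries together:
    m00 == m11 and m10 == -m01 (no mirroring assumed)."""
    return abs(m[0][0] - m[1][1]) < tol and abs(m[1][0] + m[0][1]) < tol

# Four free numbers in, six matrix entries out.
sim = similarity_matrix(2.0, math.pi / 6, 3.0, -1.0)

# Six entries picked freely: a valid affine matrix, but not a similarity.
generic = [[1.0, 0.5, 3.0],
           [0.2, 2.0, -1.0],
           [0.0, 0.0, 1.0]]

print(is_similarity(sim))      # True
print(is_similarity(generic))  # False
```

Six entries minus two constraints leaves the 4 DOF of the Similarity transform.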

The purest mathematical idea of an Affine transform is these 6 numbers and the way you multiply them with a vector to get a new vector. What this transform actually does can be described in a variety of ways - as 6 operations that you are doing one after the other (translate x, translate y, scale x, scale y, rotate, shear), or one thing you are doing all at once. If you think of them in terms of these operations, you might be confused by this matrix:

$$ \begin{bmatrix} -1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 1 \\ \end{bmatrix} $$

This matrix can be thought of as a rotation by 180 degrees about the origin, as a scaling of x by -1 and y by -1, or as a reflection of x and y through the origin. All of these descriptions are equivalent, and this one matrix is the only matrix that describes them.
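To make the "6 operations" reading concrete, here is a rough sketch of one possible way to split an Affine matrix into translation, rotation, two scales and a shear. The convention (the 2x2 part equals a rotation times an upper triangular matrix) and the function names `decompose_affine` and `compose_affine` are my own choices for illustration. Other conventions would give different but equally valid sets of 6 numbers.

```python
import math

def decompose_affine(m):
    """Split a 3x3 affine matrix into (tx, ty, theta, sx, sy, shear)
    so that its 2x2 part equals R(theta) times [[sx, shear], [0, sy]].
    This is just one convention among several."""
    a, b = m[0][0], m[0][1]
    c, d = m[1][0], m[1][1]
    sx = math.hypot(a, c)             # length of the first column
    if sx == 0.0:
        raise ValueError("first column is zero; this split is undefined")
    theta = math.atan2(c, a)          # direction of the first column
    shear = (a * b + c * d) / sx      # second column projected onto the first
    sy = (a * d - b * c) / sx         # determinant divided by sx
    return m[0][2], m[1][2], theta, sx, sy, shear

def compose_affine(tx, ty, theta, sx, sy, shear):
    """Rebuild the matrix from the six numbers (inverse of decompose_affine)."""
    cs, sn = math.cos(theta), math.sin(theta)
    return [[cs * sx, cs * shear - sn * sy, tx],
            [sn * sx, sn * shear + cs * sy, ty],
            [0.0, 0.0, 1.0]]

half_turn = [[-1.0, 0.0, 0.0],
             [0.0, -1.0, 0.0],
             [0.0, 0.0, 1.0]]
print(decompose_affine(half_turn))
```

With this convention the matrix above comes back as a rotation by pi with both scales equal to 1. A convention that allowed negative scales could just as well report it as scaling by -1 in x and y. The numbers you get out depend on the convention you pick, not on the matrix alone.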

Another way we could think about degrees of freedom is with how many fingers you would need to describe this transform by dragging points. A translation I can describe with one finger - by dragging a single point to its new location. Open Google Maps on your phone and try it. Each finger counts for two degrees of freedom since you can move it horizontally, and vertically.

A Euclidean transform has 3 DOF - you need one finger to translate the shape, then the second finger you can use to rotate it, but this finger only has one degree of freedom. This one is better illustrated not in Google Maps, but with a credit card on a desk - one finger moves the card, the other rotates it, but the second finger is less free since it always has to follow the first finger around somewhat. Moving the second finger arbitrarily would try to stretch the card, which is impossible. So, the first finger has 2 DOF, the second finger has one more.
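A small sketch of the credit card picture, treating 2D points as complex numbers. The function name `euclidean_from_two_fingers` and the sample points are made up for illustration. The first finger pins down the translation, and only the direction of the second finger is used.

```python
import cmath

def euclidean_from_two_fingers(p0, p1, q0, q1):
    """p0, p1: where the fingers start; q0, q1: where they are dragged to.
    Finger 0 fixes the translation (2 DOF).
    Finger 1 only contributes an angle (1 DOF); its distance is ignored."""
    theta = cmath.phase(q1 - q0) - cmath.phase(p1 - p0)
    rot = cmath.exp(1j * theta)
    shift = q0 - rot * p0
    return theta, shift

# Finger 1 tries to move three times further away from finger 0.
theta, shift = euclidean_from_two_fingers(0j, 1 + 0j, 2 + 1j, 2 + 4j)
moved_p1 = cmath.exp(1j * theta) * (1 + 0j) + shift

print(theta)                     # quarter turn, in radians
print(abs(moved_p1 - (2 + 1j)))  # still distance 1 from finger 0
```

The card point under the second finger ends up in the direction the finger asked for, but at its original distance from the first finger. The card cannot stretch.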

A similarity transform has four degrees of freedom - Google Maps works for this one again. Drag two fingers on your phone on Google Maps at the same time. No matter where you drag your two fingers, the app is able to find a similarity transform for you - one that keeps the map the same shape, but translates, rotates, and scales it.
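I don't know how Google Maps does it internally, but one way to sketch the two finger case is again with complex numbers, where a similarity is `q = a*p + b`. The function name `similarity_from_two_points` and the sample points are my own assumptions for illustration.

```python
import cmath

def similarity_from_two_points(p0, p1, q0, q1):
    """Find complex a, b with q = a*p + b for both finger positions.
    abs(a) is the scale, cmath.phase(a) the rotation, b the translation."""
    if p0 == p1:
        raise ValueError("the two fingers must start at different points")
    a = (q1 - q0) / (p1 - p0)
    b = q0 - a * p0
    return a, b

# Two fingers, each dragged freely: 4 numbers in, 4 DOF out.
a, b = similarity_from_two_points(0j, 1 + 0j, 2 + 1j, 2 + 3j)

print(abs(a))          # scale
print(cmath.phase(a))  # rotation in radians
print(b)               # translation
```

Any two distinct start points and any two end points give a solution. That is the sense in which the two fingers are completely free here.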

You would need to drag three fingers to do an Affine transform. Google Maps doesn't support this, since it would skew the map and you wouldn't be able to navigate with it anymore, but you can kind of pretend using a handkerchief: two of the fingers can translate and rotate it (pretend they can scale it too), and then the third finger can skew it this way and that. Almost any drag
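As a sketch of the three finger case: three point pairs give 6 equations, which is exactly enough to solve for the 6 matrix entries. The helper names `det3`, `solve3` and `affine_from_three_points` are made up for illustration, and I use Cramer's rule only to keep the example short.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def solve3(a, rhs):
    """Solve the 3x3 system a * x = rhs with Cramer's rule."""
    d = det3(a)
    if abs(d) < 1e-12:
        raise ValueError("the three start points lie on one line")
    out = []
    for col in range(3):
        replaced = [row[:] for row in a]
        for r in range(3):
            replaced[r][col] = rhs[r]
        out.append(det3(replaced) / d)
    return out

def affine_from_three_points(src, dst):
    """src, dst: three (x, y) pairs each. Returns the 3x3 affine matrix."""
    a = [[x, y, 1.0] for x, y in src]
    row_x = solve3(a, [x for x, _ in dst])  # m00, m01, m02
    row_y = solve3(a, [y for _, y in dst])  # m10, m11, m12
    return [row_x, row_y, [0.0, 0.0, 1.0]]

m = affine_from_three_points([(0, 0), (1, 0), (0, 1)],
                             [(1, 1), (3, 1), (2, 4)])
for row in m:
    print(row)
```

The only start positions that fail are three fingers on one straight line. The end positions can be anything.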

And, dragging four fingers would let you do a 2D homogeneous transformation. You can try this in a photo editing program called GIMP. It's under tools > transform tools > perspective, and it lets you drag four different points around - so it counts for 8 degrees of freedom.

Note that not every possible position of the 4 points is necessarily a valid transform, but it still counts as 8 degrees of freedom since the points can still freely move in 2 dimensions - there's just certain values they can't take.
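Here is a rough sketch of the four point case. I am not claiming this is what GIMP does. It assumes the bottom right entry of the matrix can be fixed to 1, which leaves 8 unknowns, and four point pairs give 8 equations. The names `solve_linear`, `homography_from_four_points` and `apply_homography` are hypothetical.

```python
def solve_linear(a, b):
    """Solve a * x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("these point positions give no unique transform")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                f = aug[r][col] / aug[col][col]
                for k in range(col, n + 1):
                    aug[r][k] -= f * aug[col][k]
    return [aug[i][n] / aug[i][i] for i in range(n)]

def homography_from_four_points(src, dst):
    """src, dst: four (x, y) pairs each. Assumes the entry m22 can be set to 1."""
    rows, rhs = [], []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([x, y, 1.0, 0.0, 0.0, 0.0, -u * x, -u * y])
        rhs.append(u)
        rows.append([0.0, 0.0, 0.0, x, y, 1.0, -v * x, -v * y])
        rhs.append(v)
    h = solve_linear(rows, rhs)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]

def apply_homography(m, p):
    """Map the point p = (x, y), including the division by the third coordinate."""
    x, y = p
    w = m[2][0] * x + m[2][1] * y + m[2][2]
    return ((m[0][0] * x + m[0][1] * y + m[0][2]) / w,
            (m[1][0] * x + m[1][1] * y + m[1][2]) / w)

# Drag the corners of a unit square onto a trapezoid.
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
trapezoid = [(0, 0), (2, 0), (1.5, 1), (0.5, 1)]
m = homography_from_four_points(square, trapezoid)
print([apply_homography(m, p) for p in square])
```

The bottom row of the result is no longer `0 0 1`, which is what separates this from an Affine matrix. Some positions, such as three of the points on one line, make the solve fail, which matches the note above about positions the points can't take.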

Who knows, I hope this helps!