I'm going to copy my answer from Stack Overflow, which also shows why 4-component vectors (and hence 4×4 matrices) are used instead of 3-component ones.
In most 3D graphics a point is represented by a 4-component vector (x, y, z, w), where w = 1. The usual operations applied to a point include translation, scaling, rotation, reflection, skewing and combinations of these.
These transformations can be represented by a mathematical object called a "matrix". A matrix acts on a vector like this:
[ a b c tx ] [ x ] [ a*x + b*y + c*z + tx*w ]
| d e f ty | | y | = | d*x + e*y + f*z + ty*w |
| g h i tz | | z | | g*x + h*y + i*z + tz*w |
[ p q r s ] [ w ] [ p*x + q*y + r*z + s*w ]
For example, scaling is represented as
[ 2 . . . ] [ x ] [ 2x ]
| . 2 . . | | y | = | 2y |
| . . 2 . | | z | | 2z |
[ . . . 1 ] [ 1 ] [ 1 ]
and translation as
[ 1 . . dx ] [ x ] [ x + dx ]
| . 1 . dy | | y | = | y + dy |
| . . 1 dz | | z | | z + dz |
[ . . . 1 ] [ 1 ] [ 1 ]
One of the reasons for the 4th component is to make translation representable by a matrix.
The advantage of using a matrix is that multiple transformations can be combined into one via matrix multiplication.
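To make the composition concrete, here is a small numpy sketch (an illustration, not part of the original answer) that builds the scaling and translation matrices above and combines them with a single matrix multiplication:

```python
import numpy as np

# Scale by 2 about the origin, then translate by (dx, dy, dz) = (5, 0, 0).
S = np.diag([2.0, 2.0, 2.0, 1.0])
T = np.eye(4)
T[:3, 3] = [5.0, 0.0, 0.0]

# The combined matrix applies S first, then T (note the order of the product).
M = T @ S

p = np.array([1.0, 2.0, 3.0, 1.0])  # point (1, 2, 3) with w = 1
print(M @ p)  # [7. 4. 6. 1.] -> scaled to (2, 4, 6), then shifted to (7, 4, 6)
```

Once `M` is precomputed, every point costs one matrix-vector product, no matter how many transformations it encodes.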
Now, if the purpose were simply to make translation expressible, then I'd use (x, y, z, 1) instead of (x, y, z, w) and make the last row of the matrix always [0 0 0 1], as is usually done for 2D graphics. In fact, the 4-component vector is mapped back to an ordinary 3-vector via this formula:
[ x(3D) ] [ x / w ]
| y(3D) | = | y / w |
[ z(3D) ] [ z / w ]
These are called homogeneous coordinates. Allowing w ≠ 1 makes the perspective projection expressible with a matrix too, which can again be combined with all the other transformations.
For example, since objects farther away should appear smaller on screen, we transform the 3D coordinates into 2D using the formula
x(2D) = x(3D) / (10 * z(3D))
y(2D) = y(3D) / (10 * z(3D))
Now if we apply the projection matrix
[ 1 . . . ] [ x ] [ x ]
| . 1 . . | | y | = | y |
| . . 1 . | | z | | z |
[ . . 10 . ] [ 1 ] [ 10*z ]
then the real 3D coordinates would become
x(3D) := x/w = x/(10z)
y(3D) := y/w = y/(10z)
z(3D) := z/w = z/(10z) = 0.1
so we just need to chop the z-coordinate out to project to 2D.
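A short numpy sketch (my illustration, with an arbitrary sample point) of the projection matrix above, including the homogeneous divide:

```python
import numpy as np

# Perspective matrix from the text: the last row [0 0 10 0] makes w pick up 10*z.
P = np.array([
    [1., 0., 0., 0.],
    [0., 1., 0., 0.],
    [0., 0., 1., 0.],
    [0., 0., 10., 0.],
])

p = np.array([4.0, 8.0, 2.0, 1.0])   # point (4, 8, 2) with w = 1
q = P @ p                             # [4, 8, 2, 20]
x2d, y2d = q[0] / q[3], q[1] / q[3]   # homogeneous divide: x/(10z), y/(10z)
print(x2d, y2d)  # 0.2 0.4
```

The divide by `q[3]` is exactly the mapping back from homogeneous coordinates shown earlier; the z-coordinate after the divide is a constant and can be dropped.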
After some googling, I think I see what you need. Every row is a 2D point, so the "rotation" and "translation" are applied to EVERY row. So given two sets of points written in the matrices $A$ and $B$, you want to find a rotation matrix $R = \left(\begin{matrix} \cos{T} & -\sin{T} \\ \sin{T} & \cos{T} \end{matrix}\right)$ and a translation vector $tr = (t_x, t_y)^t$ such that for every row $B_i$ of $B$ and every row $A_i$ of $A$ you have $B_i^t = R.A_i^t + tr$.
Now, if $R$ and $tr$ exist and if you have two distinct rows from $B$ and $A$ you will have that
$$ B_1^t = R.A_1^t+tr, \qquad B_2^t = R.A_2^t+tr$$
from where, after subtraction, you obtain $B_2^t-B_1^t = R.(A_2^t-A_1^t)$, which is a $2\times 2$ linear system for the "variables" $\cos{T}$ and $\sin{T}$. If your points are distinct, you will obtain a unique solution for $R$ (and an angle $T$ is determined uniquely by its $\sin$ and $\cos$). Once you have $R$, you obtain $tr$ from the first equation: $tr = B_1^t-R.A_1^t$.
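The derivation above can be turned into a few lines of numpy (a sketch of my own, not code from the answer; the function name is made up):

```python
import numpy as np

def rigid_from_two_points(A1, A2, B1, B2):
    """Recover R and tr such that B_i^t = R @ A_i^t + tr, given two
    point correspondences, following the subtraction trick above."""
    dA = np.subtract(A2, A1)  # A2 - A1
    dB = np.subtract(B2, B1)  # B2 - B1 = R @ (A2 - A1)
    # R @ dA = dB written as a 2x2 linear system for (cos T, sin T):
    #   [ dA_x  -dA_y ] [ cos T ]   [ dB_x ]
    #   [ dA_y   dA_x ] [ sin T ] = [ dB_y ]
    c, s = np.linalg.solve([[dA[0], -dA[1]], [dA[1], dA[0]]], dB)
    R = np.array([[c, -s], [s, c]])
    tr = np.asarray(B1, dtype=float) - R @ np.asarray(A1, dtype=float)
    return R, tr

# Example: rotation by 90 degrees plus translation (1, 2).
R, tr = rigid_from_two_points((1, 0), (0, 1), (1, 3), (0, 2))
print(R)   # [[ 0. -1.]
           #  [ 1.  0.]]
print(tr)  # [1. 2.]
```

Note that `np.linalg.solve` fails (singular matrix) exactly when $A_1 = A_2$, matching the requirement that the points be distinct.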
Best Answer
"Gets them close enough" is the tricky part. How do you measure how far apart they are? If the answer is "I use the usual Euclidean metric," so that the distance from $(a, b, c)$ to $(x, y, z)$ is $\sqrt{(a-x)^2 + (b-y)^2 + (c-z)^2}$, then you can minimize the sum of squared errors by using the Moore-Penrose pseudo-inverse.
Here's (roughly) the trick. You want $$ Mv_i = u_i $$ for some given list of "input" vectors $v_1, v_2, \ldots, v_k$ and "target vectors" $u_1, u_2, \ldots, u_k$, where each of these is in $\Bbb R^3$, and $M$ should be a $3 \times 3$ matrix.
If you put the $v$-vectors into a $ 3 \times k$ matrix (each vector becomes one column of the matrix $V$), and do the same thing for the $u$-vectors, making a matrix $U$, then you're hoping that $$ M V = U $$ where $M$ is $3 \times 3$, and $V$ and $U$ are each $3 \times k$. For $k > 3$, this problem is generally not solvable. But if you look at $$ R = MV - U $$ for a particular $M$, that's a matrix that shows the "errors" -- how far $Mv_i$ is from $u_i$. And if the sum of the squares of the entries of $R$ is small...then you've done well.
Now that is a calculus problem: over all $3 \times 3 $ matrices $M$, consider the function $$ f(M) = \sum_{i,j}(MV - U)^2_{i,j} $$ and minimize it. You take derivatives, set them to zero, and (after a good deal of algebra) you arrive at a formula for $M$.
Let me lead you there in a slightly different way. Let's suppose that there is such a matrix $M$ for your $U$ and $V$ data. Start from $$ MV = U $$ and multiply both sides by $V^t$ to get $$ M (VV^t) = (U V^t) $$ The matrices in parentheses are now $3 \times 3$, and unless your set of $v$-vectors is particularly bad (e.g., they all lie in one plane!), the matrix $V V^t$ is invertible. So you find that $$ M = (UV^t)(V V^t)^{-1} $$ Now that's the answer for $M$ in the case that $MV = U$ does have a solution. The surprising thing (which you get by doing the calculus I mentioned) is that even if $MV = U$ doesn't have an exact solution, the matrix $$ M^{*} = (UV^t)(V V^t)^{-1} $$ is the one for which $f(M)$ is minimized, i.e., for which the transformed $v$-vectors are as close as possible (in the aggregate) to the corresponding $u$-vectors.
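Here is a quick numpy check of the closed form (synthetic data of my own making; `np.linalg.lstsq` solves the same least-squares problem and is used only for comparison):

```python
import numpy as np

rng = np.random.default_rng(0)
V = rng.standard_normal((3, 10))        # k = 10 input vectors as columns
M_true = rng.standard_normal((3, 3))
U = M_true @ V + 0.01 * rng.standard_normal((3, 10))  # noisy targets

# The formula from the answer: M* = (U V^t)(V V^t)^{-1}
M_star = (U @ V.T) @ np.linalg.inv(V @ V.T)

# Cross-check: lstsq minimizes ||V^t X - U^t||_F, so X^t is the same M*.
M_lstsq = np.linalg.lstsq(V.T, U.T, rcond=None)[0].T
print(np.allclose(M_star, M_lstsq))  # True
```

In practice one would prefer `lstsq` (or `np.linalg.solve` on the normal equations) over forming an explicit inverse, but the closed form above is exactly what the calculus yields.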