[Math] How to find an all-in-one 2D to 3D Transformation Matrix for perspective projection, rotation, and translation

3dtransformation

I have read "Finding a 3D transformation matrix based on the 2D coordinates", but I think my situation is different because I think I need a 4×3 matrix, not a 3×3 matrix. I'm not sure, but this might be because I have rotation and translation in addition to the perspective transformation.

Here is the setup:

Suppose you have several 2D points in an image:
(x1,y1)
(x2,y2)
(x3,y3)
(x4,y4)

Suppose you also have several corresponding 3D points on an arbitrary plane:
(X1,Y1,Z1)
(X2,Y2,Z2)
(X3,Y3,Z3)
(X4,Y4,Z4)

To transform from 2D to 3D using homogeneous coordinates, we can use

(X,Y,Z,W) = M*(x,y,1). Here M must be a 4×3 matrix.

So a point in 2D homogeneous coordinates gets transformed into a point in 3D homogeneous coordinates.

Then I can divide (X,Y,Z,W) by W to get (X/W,Y/W,Z/W,1), a form from which I can read off the true X,Y,Z values in "regular" coordinates.

Now, here is the problem: I don't know W for any of my (X,Y,Z) points. (If I did know each point's W, I think there is a standard linear-algebra way of finding M.)

So to find M, I multiply things out like the following:

X = M11*x + M12*y + M13*1

Y = M21*x + M22*y + M23*1

Z = M31*x + M32*y + M33*1

W = M41*x + M42*y + M43*1

But these X,Y,Z,W are the homogeneous coordinates, so to get the "real" X,Y,Z coordinates I divide by W:

X = (M11*x + M12*y + M13*1) / (M41*x + M42*y + M43*1)

Y = (M21*x + M22*y + M23*1) / (M41*x + M42*y + M43*1)

Z = (M31*x + M32*y + M33*1) / (M41*x + M42*y + M43*1)

Also, I can get rid of one parameter by multiplying the numerator and denominator of each equation by 1/M43 (assuming M43 is not zero) and renaming the resulting ratios of parameters. I'm left with:

X = (a1*x + a2*y + a3*1) / (a10*x + a11*y + 1)

Y = (a4*x + a5*y + a6*1) / (a10*x + a11*y + 1)

Z = (a7*x + a8*y + a9*1) / (a10*x + a11*y + 1)

Finally, I plug all the (X,Y,Z) and (x,y) values that I have into multiple instances of these equations and rearrange everything algebraically to get a classic linear system A*a = b, where a is the vector of unknowns (a1 … a11).

Once I have a1 through a11, I could go back and work out what the original components of M were. Either way I can now project points from 2D to 3D using perspective transformation even if there is rotation or translation.
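The rearrangement described above can be sketched in NumPy. This is only an illustration under stated assumptions: the function names `fit_plane_map` and `apply_plane_map` are made up here, the parameters `a_true` and the sample points are synthetic rather than data from the question, and the system is solved in the least-squares sense. Each equation such as X = (a1*x + a2*y + a3) / (a10*x + a11*y + 1) is multiplied by its denominator, which gives the linear row a1*x + a2*y + a3 - X*x*a10 - X*y*a11 = X.

```python
import numpy as np

def fit_plane_map(xy, XYZ):
    """Least-squares estimate of a1..a11 from n >= 4 point pairs."""
    xy = np.asarray(xy, dtype=float)
    XYZ = np.asarray(XYZ, dtype=float)
    n = len(xy)
    A = np.zeros((3 * n, 11))
    b = np.zeros(3 * n)
    for i in range(n):
        x, y = xy[i]
        for k in range(3):                      # k = 0, 1, 2 -> X, Y, Z equation
            row = 3 * i + k
            A[row, 3 * k:3 * k + 3] = (x, y, 1.0)              # numerator terms
            A[row, 9:11] = (-XYZ[i, k] * x, -XYZ[i, k] * y)    # a10, a11 terms
            b[row] = XYZ[i, k]
    a, *_ = np.linalg.lstsq(A, b, rcond=None)
    return a

def apply_plane_map(a, xy):
    """Map 2D points to 3D using a1..a11 (M43 is fixed to 1)."""
    xy = np.asarray(xy, dtype=float)
    M = np.vstack([np.reshape(a[:9], (3, 3)), [a[9], a[10], 1.0]])  # 4x3 matrix
    h = np.column_stack([xy, np.ones(len(xy))]) @ M.T               # (X, Y, Z, W)
    return h[:, :3] / h[:, 3:4]                                     # divide by W

# Synthetic check with made-up parameters (hypothetical values).
a_true = np.array([1.0, 0.2, 3.0, -0.3, 1.5, 1.0, 0.4, 0.1, 5.0, 0.05, -0.02])
xy = np.array([[0, 0], [1, 0], [0, 1], [1, 1], [2, 3], [-1, 2.5]], dtype=float)
XYZ = apply_plane_map(a_true, xy)
a_est = fit_plane_map(xy, XYZ)
print(np.allclose(a_est, a_true))
```

Each point pair contributes three rows, so four pairs already give 12 equations for the 11 unknowns. Using more pairs and least squares should make the estimate less sensitive to measurement noise.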

My question is whether this is the best way to find this kind of general 2D-to-3D perspective transformation.

Best Answer

It looks like you are trying to solve for a map from 2D points to 3D points, so I'm a bit confused: a projection transformation would map the 3D points to the 2D points (and the inverse is, of course, impossible, since each point on the projection plane could lie anywhere on a ray from the camera through the plane).

Next, notice there is no difference between first transforming an object in 3D, and then projecting through a fixed camera, versus leaving the object in place and projecting through a camera of unknown position and orientation. Here I'll take the former approach.

We have some points in 3D and apply an affine transformation to them, then project through a camera at the origin looking down the $z$ axis, with the projection plane passing through $z=1$. This makes the projection matrix $P: (x,y,z) \to (u,v,w)$ easy: it is just the identity.

Before we project, we apply some affine transformation $Mq + t$ to the 3D points $q$. Notice that we do not constrain $M$ to only rotate and scale here: to do so we would need to add additional (nonlinear) constraints on the coefficients of $M$. The short of it is that you will need to supply more than the theoretical minimum of four corresponding points to determine the map (and you will get shear if your corresponding points did not come from a bona fide Euclidean motion + projection).

So now the total map can be written as

$$\left[\begin{array}{c}u\\v\\w\end{array}\right] = \left[\begin{array}{cccc}m_{11} & m_{12} & m_{13} & t_x\\m_{21} & m_{22} & m_{23} & t_y\\m_{31} & m_{32} & m_{33} & t_z\end{array}\right]\left[\begin{array}{c}x\\y\\z\\1\end{array}\right].$$

Since $(u,v,w) \sim (u/w,v/w,1)$, this map is scale-invariant, so we might as well set $m_{33} = 1$. We can also write it in block form (which will prove useful) as

$$\left[\begin{array}{c}u\\v\\w\end{array}\right] = \left[\begin{array}{c}N_{uv}\\N_w\end{array}\right]\left[\begin{array}{c}x\\y\\z\\1\end{array}\right].$$

Like you say, we only know $u/w$ and $v/w$ for the corresponding points, not $u,v,w$. Well,

$$\left[\begin{array}{c}u/w\\v/w\end{array}\right] = \frac{N_{uv}\left[\begin{array}{c}x\\y\\z\\1\end{array}\right]}{N_w \left[\begin{array}{c}x\\y\\z\\1\end{array}\right]},$$

where the denominator is a scalar, or

$$\left(N_w \left[\begin{array}{c}x\\y\\z\\1\end{array}\right]\right)\left[\begin{array}{c}u/w\\v/w\end{array}\right] = N_{uv}\left[\begin{array}{c}x\\y\\z\\1\end{array}\right]$$

which is a system of two linear equations in 11 unknowns for each corresponding point. Plugging in $5\frac{1}{2}$ corresponding points (in practice, six or more, solved in the least-squares sense) will let you solve for $N$.
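Here is a minimal NumPy sketch of that linear solve, assuming $m_{33} = 1$ as above. The names `fit_N` and `project`, the matrix `N_true`, and the random test points are all made up for illustration. The sketch also assumes the 3D points are not all coplanar, since otherwise the linear system is rank-deficient. With the unknowns ordered as $(m_{11}, m_{12}, m_{13}, t_x, m_{21}, m_{22}, m_{23}, t_y, m_{31}, m_{32}, t_z)$, the $m_{33} z$ term moves to the right-hand side.

```python
import numpy as np

def fit_N(q, uv):
    """Estimate the 3x4 matrix N (with m33 = 1) from n >= 6 pairs (q_i, uv_i)."""
    q = np.asarray(q, dtype=float)
    uv = np.asarray(uv, dtype=float)
    n = len(q)
    A = np.zeros((2 * n, 11))
    b = np.zeros(2 * n)
    for i in range(n):
        x, y, z = q[i]
        for k in range(2):                    # k = 0 -> u/w row, k = 1 -> v/w row
            obs = uv[i, k]
            row = 2 * i + k
            A[row, 4 * k:4 * k + 4] = (x, y, z, 1.0)       # row k of N_uv
            A[row, 8:11] = (-obs * x, -obs * y, -obs)      # m31, m32, t_z
            b[row] = obs * z                               # term from m33 = 1
    p, *_ = np.linalg.lstsq(A, b, rcond=None)
    return np.array([p[0:4], p[4:8], [p[8], p[9], 1.0, p[10]]])

def project(N, q):
    """Apply N to 3D points and divide by w."""
    qh = np.column_stack([q, np.ones(len(q))])
    uvw = qh @ N.T
    return uvw[:, :2] / uvw[:, 2:3]

# Synthetic check: a made-up N and eight random, non-coplanar points.
N_true = np.array([[0.9, -0.1, 0.2, 0.3],
                   [0.1, 1.1, -0.2, -0.4],
                   [0.05, -0.03, 1.0, 5.0]])
rng = np.random.default_rng(0)
q = rng.uniform(-1.0, 1.0, size=(8, 3))
uv = project(N_true, q)
N_est = fit_N(q, uv)
print(np.allclose(N_est, N_true))
```

As the answer notes, nothing here forces the left 3×3 block of the result to be a rotation, so noisy correspondences can produce shear.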

Related Solutions

[Math] Finding the transformation matrix to transform one triangle into another

A simple way to do this would be to canonicalize both triangles and then concatenate the transform that canonicalizes the "origins" with the inverse of the one that canonicalizes the "destinations".

For instance, a) apply a translation to move the point in the first pair to the origin; b) apply a rotation about an axis through the origin to move the point in the second pair onto the $z$-axis; c) apply a rotation about the $z$-axis to move the point in the third pair into the $y$-$z$-plane with non-negative $y$ coordinate. That moves all congruent triangles with identically numbered points into the same position.
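The canonicalization steps a) to c) can be sketched as follows, assuming the two triangles are congruent and non-degenerate. The helper names `canonical_frame` and `triangle_to_triangle` and the test data are invented for this illustration. Rather than composing two explicit rotations, the sketch builds the combined rotation directly from an orthonormal frame attached to the triangle, which puts the points in the same canonical position.

```python
import numpy as np

def canonical_frame(tri):
    """Return (R, p1) so that R @ (p - p1) puts the triangle in canonical position."""
    p1, p2, p3 = np.asarray(tri, dtype=float)
    ez = (p2 - p1) / np.linalg.norm(p2 - p1)     # second point lands on the z-axis
    w = (p3 - p1) - np.dot(p3 - p1, ez) * ez     # part of p3 - p1 orthogonal to ez
    ey = w / np.linalg.norm(w)                   # third point lands in the y-z plane, y >= 0
    ex = np.cross(ey, ez)                        # completes a right-handed frame
    return np.vstack([ex, ey, ez]), p1

def triangle_to_triangle(src, dst):
    """Rigid motion p -> R @ p + t taking triangle src onto triangle dst."""
    Rs, ps = canonical_frame(src)
    Rd, pd = canonical_frame(dst)
    R = Rd.T @ Rs                                # canonicalize src, then undo dst's canonicalization
    return R, pd - R @ ps

# Made-up test: rotate and translate a triangle, then recover the motion.
angle = 0.7
R_true = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                   [np.sin(angle),  np.cos(angle), 0.0],
                   [0.0, 0.0, 1.0]])
t_true = np.array([1.0, -2.0, 3.0])
src = np.array([[0.0, 0.0, 0.0], [2.0, 0.0, 0.0], [0.5, 1.5, 0.3]])
dst = src @ R_true.T + t_true
R, t = triangle_to_triangle(src, dst)
print(np.allclose(dst, src @ R.T + t))
```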

[Math] How to calculate the rotation matrix between 2 3D triangles

In principle you need two rotations and a matrix multiplication. Call the recentered version of Q S and the recentered version of P T.
Then find the rotation s which rotates S to a "triangular" form of its coordinates, so that the coordinates of the first point become $[x_{s,1},0,0]$ and those of the second become $[x_{s,2},y_{s,2},0]$. This can be done with your ansatz of using unknowns if you assume three rotations (one rotation for each pair of axes); it gives a 3x3 matrix whose third column is zero and which thus describes a triangle in a plane.

Then do the same with another rotation t which rotates T in the same manner.
Then $T = S \cdot s \cdot t^{-1} = S \cdot r$ where r is the matrix for the complete rotation.

A bit of pseudocode (code in my proprietary MatMate-tool):

  S = Q - Meanzl(Q)        // do the translation to the origin
  T = P - Meanzl(P)      

    // get the rotation-matrices which rotate some matrix to triangular shape 
  ts = gettrans(S,"tri") 
  tt = gettrans(T,"tri") 

    // make one rotation-matrix. the quote-symbol means transposition
  tr = ts * tt'  

    // difference should be zero      
  CHK = T - S*tr

Example:
We assume S and T are centered. Let
$ \small \text{ S =} \begin{bmatrix} 14.469944&22.964690&-7.581631\\ -15.275348&5.923432&23.720255\\ 0.805404&-28.888122&-16.138624 \end{bmatrix} $
and
$ \small \text{ T =} \begin{bmatrix} 22.808501&2.515200&16.361035\\ 8.393637&-5.071089&-27.109127\\ -31.202138&2.555889&10.748092 \end{bmatrix} $
Now you can solve for a rotation in the y/z-plane which makes the entry $S_{1,3}=0$. The rotation parameters are some cos/sin values. Apply this and you get
$\small \text{ S}^{(1)} = \begin{bmatrix} 14.469944&24.183840&0.000000\\ -15.275348&-1.811476&24.381471\\ 0.805404&-22.372364&-24.381471 \end{bmatrix}$
Now you can solve for a rotation in the x/y-plane which makes the entry $S_{1,2}=0$. The rotation parameters are some other cos/sin values. Apply this and you get
$\small \text{S}^{(2)} = \begin{bmatrix} 28.182218&0.000000&0.000000\\ -9.397482&12.178056&24.381471\\ -18.784736&-12.178056&-24.381471 \end{bmatrix}$
After that, a third set of cos/sin rotation parameters makes S triangular. It then looks like this:
$\small \text{ S}^{(3)} = \begin{bmatrix} 28.182218&0.000000&0.000000\\ -9.397482&27.253645&0.000000\\ -18.784736&-27.253645&0.000000 \end{bmatrix}$

Because the center of your original triangle was moved to the origin, the last column (the z-coordinates) is zero and not needed, since 3 points can always be placed in a plane.
From the three rotations with their cos/sin values you can construct a 3x3 rotation matrix, say s.

The same can be done using the matrix T leading to a rotation-matrix t. If S and T describe in fact the same triangle, only rotated, the results are equal: $T^{(3)}=S^{(3)}$. Then you can use the nice fact, that the inverse of a rotation-matrix is just its transpose, such that with
$\small \text{ tr =} \begin{bmatrix} 0.205215&0.860645&0.466022\\ 0.946329&-0.295966&0.129867\\ 0.249696&0.414359&-0.875190 \end{bmatrix}$ we get $ T = S \cdot tr $
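For readers without MatMate, the same idea can be sketched in NumPy. This is an assumption-laden stand-in, not the original tool: the helper `triangularizing_rotation` is made up here and uses a QR decomposition of the transposed matrix instead of three explicit plane rotations. It is applied to the S and T from the example above.

```python
import numpy as np

def triangularizing_rotation(A):
    """Rotation s such that A @ s is lower triangular (first two diagonal entries >= 0)."""
    Q, R = np.linalg.qr(A.T)                  # A.T = Q R  =>  A @ Q = R.T
    d = np.where(np.diag(R) < 0, -1.0, 1.0)
    d[2] = 1.0                                # third column is fixed by the determinant below
    Q = Q * d                                 # flip columns so the diagonal is non-negative
    if np.linalg.det(Q) < 0:                  # force a proper rotation (det = +1)
        Q[:, 2] *= -1.0
    return Q

S = np.array([[14.469944, 22.964690, -7.581631],
              [-15.275348, 5.923432, 23.720255],
              [0.805404, -28.888122, -16.138624]])
T = np.array([[22.808501, 2.515200, 16.361035],
              [8.393637, -5.071089, -27.109127],
              [-31.202138, 2.555889, 10.748092]])

s = triangularizing_rotation(S)
t = triangularizing_rotation(T)
tr = s @ t.T                                  # inverse of a rotation is its transpose

print(np.round(S @ s, 6))                     # triangular form of S
print(np.round(tr, 6))                        # combined rotation
print(np.abs(T - S @ tr).max())               # residual of T = S * tr
```

The sign normalization matters: without it the triangular forms of S and T could differ by column signs, and `s @ t.T` would then no longer map S onto T.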

In principle this can also be solved using the concept of pseudoinverses: we demand
$ tr = S^{-1} \cdot T $. But because S has reduced rank, computing the inverse would mean dividing by zero. Using the SVD (see Wikipedia), the pseudoinverse can be computed by inverting the diagonal matrix of singular values, with the zeros simply left as zero. This should lead to a matrix that satisfies the same equation $ T = S \cdot tr $.
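A quick NumPy check of the pseudoinverse route, using the S and T from the example. The cutoff `rcond=1e-9` is an assumption chosen here so that the numerically tiny third singular value of S is treated as zero. One caveat worth verifying: because S has rank 2, the product `pinv(S) @ T` also has rank 2, so it reproduces T from S but is not itself a full 3x3 rotation matrix. It should agree with the rotation only on the plane of the triangle.

```python
import numpy as np

S = np.array([[14.469944, 22.964690, -7.581631],
              [-15.275348, 5.923432, 23.720255],
              [0.805404, -28.888122, -16.138624]])
T = np.array([[22.808501, 2.515200, 16.361035],
              [8.393637, -5.071089, -27.109127],
              [-31.202138, 2.555889, 10.748092]])

# SVD-based pseudoinverse; singular values below rcond * largest are left as zero.
X = np.linalg.pinv(S, rcond=1e-9) @ T

print(np.abs(S @ X - T).max())               # residual of T = S * X
print(np.linalg.matrix_rank(X, tol=1e-6))    # numerical rank of X
```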
