Looking for a simple explanation of Singular Value Decomposition in practice

3d · rigid-transformation · rotations · svd

tl;dr: I'm trying to find the best rotation between two 3d point clouds, and all the answers say "use SVD", but I don't have the math background. Hopefully, once I get the concept, I can use existing libraries?

long version: With my coding background (not so strong on the math side), I've gotten as far as computing lots of small vectors representing the offsets between pairs of points (nearestNeighbor) across two 3d point clouds. My 'answer' is the 3d rotation of the second cloud that minimizes those vector lengths.

But being upfront: I don't know much about matrix algebra, almost nothing about eigenvalues, and I'm hoping I can get this working without poring over the undergrad textbook. I'd love to have a basic understanding of what it's doing, just enough to duct-tape together libraries.

My old plan won't work: According to Averages of Rotations and Orientations in 3-space, I can't just compute a few thousand rotations between my vector pairs and average them all up.

@Tpofofn pointed me at Wahba's problem (thank you!), which sounds like exactly what I need. Its solution uses SVD, for which nice libraries are available.

However, I don't know:

  1. How to translate the math into code — or if I even need to do so!
  2. Inputs: How to take 10k vectors and prepare a matrix to put into SVD.
  3. Outputs: Once I get it to run, how to use the output of the SVD.

Best Answer

First, you must understand basic linear algebra and matrix-vector multiplication. For example, $$\left [ \begin{matrix} m_{11} & m_{12} & m_{13} \\ m_{21} & m_{22} & m_{23} \\ m_{31} & m_{32} & m_{33} \end{matrix} \right ] \left [ \begin{matrix} x_1 \\ x_2 \\ x_3 \end{matrix} \right ] = \left [ \begin{matrix} y_1 \\ y_2 \\ y_3 \end{matrix} \right ]$$ is the same as the system of linear equations $$\left\lbrace ~ \begin{aligned} m_{11} x_1 + m_{12} x_2 + m_{13} x_3 &= y_1 \\ m_{21} x_1 + m_{22} x_2 + m_{23} x_3 &= y_2 \\ m_{31} x_1 + m_{32} x_2 + m_{33} x_3 &= y_3 \\ \end{aligned} \right .$$ except that "Linear Algebra" not only makes it easier to write, but also provides a lot of tools for efficiently manipulating such systems and for solving them. (Among other things, of course.)
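If it helps to see that in code, here is a minimal NumPy sketch (the numbers are made up purely for illustration):

```python
# Matrix-vector multiplication: one line of NumPy evaluates all three equations at once.
import numpy as np

M = np.array([[2.0, 0.0, 1.0],
              [0.0, 3.0, 0.0],
              [1.0, 0.0, 2.0]])   # the m_ij entries
x = np.array([1.0, 2.0, 3.0])     # x_1, x_2, x_3

y = M @ x      # same as evaluating the three linear equations above
print(y)       # [5. 6. 7.]
```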

In general, when we have some matrix $\mathbf{M}$, it has some eigenvectors $\vec{v}_k$ and corresponding eigenvalues $\lambda_k$, such that $$\mathbf{M} \vec{v}_k = \lambda_k \vec{v}_k$$ In other words, when multiplied by the matrix, eigenvectors are those vectors that get only scaled by the corresponding eigenvalue without any change in "direction".

So, if you happened to have a pure rotation/reflection matrix $\mathbf{R}$, such that $$\vec{p}_i = \mathbf{R} \vec{q}_i$$ the axis of rotation of $\mathbf{R}$ is the eigenvector corresponding to the eigenvalue closest to $1$. (The other eigenvalues are often complex; your library may or may not provide those too.)
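In a library such as NumPy this is a single call; a sketch, using a made-up rotation about the $z$ axis as the test case:

```python
# Find the rotation axis of a known rotation matrix R: take the eigenvector
# whose eigenvalue is (numerically) closest to 1.
import numpy as np

theta = np.deg2rad(30.0)
R = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0,            0.0,           1.0]])   # rotation about z

eigenvalues, eigenvectors = np.linalg.eig(R)   # complex in general
k = np.argmin(np.abs(eigenvalues - 1.0))       # index of eigenvalue closest to 1
axis = np.real(eigenvectors[:, k])             # rotation axis, up to sign
print(axis)                                    # approximately [0. 0. 1.]
```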


For a symmetric (more generally, normal) matrix, the singular values are the absolute values of the eigenvalues, $\lvert\lambda_k\rvert$; in general, they are the square roots of the eigenvalues of $\mathbf{M}^T \mathbf{M}$, and are always real and non-negative.

Singular value decomposition "decomposes" a matrix $\mathbf{M}$ with all real components into three parts: two orthogonal matrices $\mathbf{U}$ and $\mathbf{V}^T$ (where ${}^T$ denotes transpose, i.e. replacing columns with rows and vice versa, or flipping around the descending diagonal), and a diagonal matrix $\mathbf{\Sigma}$, $$\mathbf{M} = \mathbf{U} \mathbf{\Sigma} \mathbf{V}^{T}$$ (Complex matrices have an SVD too, with conjugate transposes in place of plain transposes, but the real case is all we need here.) This does not immediately look useful, but it turns out that the three parts have very useful properties.
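In code the decomposition is a single call; a sketch with NumPy (note that `np.linalg.svd` returns the diagonal of $\mathbf{\Sigma}$ as a plain 1-D array):

```python
# Singular value decomposition of an arbitrary real matrix, with a sanity check
# that U Sigma V^T reproduces the original matrix.
import numpy as np

M = np.random.rand(3, 3)
U, s, Vt = np.linalg.svd(M)      # s is the diagonal of Sigma, Vt is V^T
Sigma = np.diag(s)

print(np.allclose(M, U @ Sigma @ Vt))   # True, up to round-off
```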

Of particular use is the pseudoinverse: $$\mathbf{M}^{+} = \mathbf{V} \mathbf{\Sigma}^{+} \mathbf{U}^T$$ where $\mathbf{\Sigma}^{+}$ is $\mathbf{\Sigma}$ with all nonzero entries replaced by their reciprocals, i.e. $$\Sigma_{k k}^{+} = \begin{cases} \frac{1}{\Sigma_{k k}}, & \Sigma_{k k} \ne 0 \\ 0, & \Sigma_{k k} = 0 \\ \end{cases}$$

The pseudoinverse is extremely useful, because if you know some matrix $\mathbf{M}$ and vector $\vec{y}$, and want to find vector $\vec{x}$, such that $$\mathbf{M} \vec{x} = \vec{y}$$ and you have the pseudoinverse $\mathbf{M}^{+}$, then $$\vec{x} = \mathbf{M}^{+} \vec{y}$$ (This is the exact solution when one exists, and the least-squares solution otherwise.)

In other words, when the elements of $\mathbf{M}$ are real numbers, you can solve the $\mathbf{M} \vec{x} = \vec{y}$ problem for $\vec{x}$ by first doing singular value decomposition of matrix $\mathbf{M}$, $$\mathbf{M} = \mathbf{U} \mathbf{\Sigma} \mathbf{V}^T$$ so that the solution is $$\vec{x} = \mathbf{M}^{+} \vec{y} = \mathbf{V} \mathbf{\Sigma}^{+} \mathbf{U}^T \vec{y}$$ where $\mathbf{\Sigma}^{+}$ is calculated as previously mentioned.
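A sketch of that recipe in NumPy, checked against NumPy's own least-squares solver; the tolerance used to decide what counts as a "zero" singular value is my assumption (in practice it just needs to be tiny compared to the largest singular value):

```python
# Solve M x = y via the SVD-based pseudoinverse.
import numpy as np

M = np.random.rand(3, 3)
y = np.random.rand(3)

U, s, Vt = np.linalg.svd(M)

# Sigma+ : reciprocals of the nonzero singular values, zeros elsewhere
tol = 1e-12 * s.max()
s_plus = np.zeros_like(s)
s_plus[s > tol] = 1.0 / s[s > tol]

x = Vt.T @ np.diag(s_plus) @ U.T @ y            # x = V Sigma+ U^T y
x_ref, *_ = np.linalg.lstsq(M, y, rcond=None)   # reference solution
print(np.allclose(x, x_ref))                    # True
```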


As a practical problem, there are some key points that should affect one's approach:

  • Are the point sets ordered or not? That is, does point $\vec{p}_i$ correspond to $\vec{q}_i$, or to some $\vec{q}_k$, with the mapping between $i$ and $k$ indexes unknown? If the points are extracted from e.g. images, the method of extraction will dictate whether the index is the same in both original and rotated point clouds.

  • Is there translation in addition to rotation? This complicates the picture somewhat, adding three new variables (in addition to the three/four describing the pure rotation matrix) to be solved for (see the centering sketch after this list).

  • Is there additional per-point movement? That is, if the point cloud is not rigid, but may deform or change between the two states, the problem becomes much harder to solve. Iterative methods are recommended.

  • The most generic form of the point cloud problem, $$\vec{a}_i + \vec{\epsilon}_i = \vec{t} + \mathbf{R} \vec{b}_i$$ where $\vec{b}_i$ and $\vec{a}_i$ are the two known locations of point $i$, $\vec{t}$ is a translation, and $\vec{\epsilon}_i$ is some point-specific error or movement between the two orientations, is hard. There are solutions (in e.g. molecular dynamics simulations, see "rotation detection" and "rotation elimination"), but they are approximate and benefit from iterative refinement. That is, do not try to get a perfect result from the get-go, but rather ensure each step refines the result to be slightly better.
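Regarding the translation point above, a hedged sketch of the usual trick (assuming the clouds are stored as $N \times 3$ arrays and differ only by a rigid motion plus noise): subtract each cloud's centroid, solve for the rotation on the centred points, then recover the translation afterwards. The array and function names are mine, not from any library.

```python
# Centering removes the translation, so the rotation can be solved on its own.
# a_points and b_points are assumed to be N x 3 arrays of corresponding points.
import numpy as np

def centre(points):
    """Return the centroid and a centred copy of an N x 3 point array."""
    centroid = points.mean(axis=0)
    return centroid, points - centroid

# a_centroid, a_centred = centre(a_points)
# b_centroid, b_centred = centre(b_points)
# ... solve Wahba's problem on a_centred / b_centred to get R (see below) ...
# translation = a_centroid - R @ b_centroid   # then a_i ~ R b_i + translation
```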


If we consider only pure rotation, i.e. $$\vec{a}_i = \mathbf{R} \vec{b}_i$$ where $\vec{b}_i$ is the position before rotation, and $\vec{a}_i$ position after rotation, of point $i$, then we can look at Wahba's problem, minimizing $J(\mathbf{R})$, $$J(\mathbf{R}) = \frac{1}{2} \sum_{i=1}^{N} w_i \left\lVert \vec{a}_i - \mathbf{R} \vec{b}_i \right\rVert^2$$ except that $w_i = 2 ~ \forall ~ i$.

In other words, we're trying to find the solution $\mathbf{R}$ where the sum of squared errors in distances to $\vec{a}_i$ after rotation, is minimized. Or, writing the problem as $$\vec{a}_i + \vec{\epsilon}_i = \mathbf{R} \vec{b}_i, \quad i = 1 .. N$$ we are minimizing $\sum \lVert\vec{\epsilon}_i\rVert^2$. In a perfect world, we'd minimize it to zero.
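As code, the quantity being minimized is short to write down (a sketch; `a_points` and `b_points` are assumed to be $N \times 3$ arrays of corresponding points, with all weights equal):

```python
# The Wahba/least-squares cost: sum of squared residuals after applying a
# candidate rotation R to the "before" points.
import numpy as np

def wahba_cost(R, a_points, b_points):
    """Sum_i || a_i - R b_i ||^2 for N x 3 arrays a_points and b_points."""
    residuals = a_points - b_points @ R.T   # row i is a_i - R b_i
    return float(np.sum(residuals ** 2))
```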

The Wikipedia page explains that the solution is to first construct the matrix $\mathbf{B}$, $$\mathbf{B} = \sum_{i=1}^{N} \vec{a}_i \vec{b}_i^T$$ where $\vec{b}_i^T$ is the position of point $i$ before rotation written as a row vector and $\vec{a}_i$ as a column vector, so each term of the sum is an outer product, a $3 \times 3$ matrix. In other words, $$\mathbf{B} = \sum_{i=1}^{N} \left [ \begin{matrix} X_i x_i & X_i y_i & X_i z_i \\ Y_i x_i & Y_i y_i & Y_i z_i \\ Z_i x_i & Z_i y_i & Z_i z_i \\ \end{matrix} \right ], \quad \vec{a}_i = \left [ \begin{matrix} X_i \\ Y_i \\ Z_i \end{matrix} \right ], \quad \vec{b}_i = \left [ \begin{matrix} x_i \\ y_i \\ z_i \end{matrix} \right ]$$ Then, obtain the singular value decomposition of $\mathbf{B}$: $$\mathbf{B} = \mathbf{U} \mathbf{\Sigma} \mathbf{V}^T$$ Calculate the product of the determinants of $\mathbf{U}$ and $\mathbf{V}$, and form a new $3 \times 3$ matrix $\mathbf{T}$: $$\mathbf{T} = \left [ \begin{matrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & \det(\mathbf{U})\det(\mathbf{V}) \end{matrix} \right ]$$ Then the sought-after rotation matrix $\mathbf{R}$ is $$\mathbf{R} = \mathbf{U} \mathbf{T} \mathbf{V}^T$$
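Putting the whole recipe into NumPy, under the assumption that your points are stored as $N \times 3$ arrays with row $i$ of one array corresponding to row $i$ of the other (the function name here is mine, not from any library):

```python
# Wahba's problem via SVD: best rotation R with a_i ~ R b_i.
import numpy as np

def best_rotation(a_points, b_points):
    """Return the 3x3 rotation matrix R minimizing sum_i ||a_i - R b_i||^2."""
    # B = sum_i a_i b_i^T, built from outer products (a 3x3 matrix)
    B = a_points.T @ b_points

    # Singular value decomposition of B
    U, s, Vt = np.linalg.svd(B)

    # det(V^T) = det(V); the extra diagonal matrix guards against getting a
    # reflection instead of a proper rotation
    d = np.linalg.det(U) * np.linalg.det(Vt)
    T = np.diag([1.0, 1.0, d])

    return U @ T @ Vt

# Quick self-check with a known rotation (30 degrees about the z axis):
theta = np.deg2rad(30.0)
R_true = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                   [np.sin(theta),  np.cos(theta), 0.0],
                   [0.0,            0.0,           1.0]])
b = np.random.rand(1000, 3)    # "before" cloud
a = b @ R_true.T               # "after" cloud, a_i = R_true b_i
print(np.allclose(best_rotation(a, b), R_true))   # True
```

This maps onto your three numbered questions: the library does the math (question 1), the $N \times 3$ arrays plus the outer-product sum $\mathbf{B} = \sum \vec{a}_i \vec{b}_i^T$ are how your 10k vectors become a matrix to feed into SVD (question 2), and the outputs $\mathbf{U}$, $\mathbf{\Sigma}$, $\mathbf{V}^T$ are recombined into $\mathbf{R}$ as shown (question 3).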