[Math] Why is the SVD not unique while the Moore-Penrose pseudoinverse is unique?

linear algebra, matrices, matrix decomposition, pseudoinverse, svd

I am confused about the uniqueness of the Moore-Penrose inverse when it is constructed from an SVD.
For any matrix $A$, if $X$ satisfies $$AXA=A, \quad XAX=X, \quad (AX)^\mathrm{T}=AX, \quad (XA)^\mathrm{T}=XA,$$ then $X$ is called a Moore-Penrose inverse of $A$.

If $A$ has the SVD (singular value decomposition) $$A=P\left[\begin{matrix}\Lambda_r&0\\0&0\end{matrix}\right]Q^\mathrm{T},$$

then it is easy to prove that $$A^+ = Q\left[\begin{matrix}\Lambda_r^{-1}&0\\0&0\end{matrix}\right]P^\mathrm{T}$$ is a Moore-Penrose inverse.
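
As a quick numerical sanity check (not a proof), here is a minimal sketch that builds a rank-2 matrix $A$ from a hand-picked SVD and tests the four conditions for $A^+$. The rotation helpers `rot_z` and `rot_x`, the angles, and the singular values $3$ and $2$ are illustrative assumptions, not part of the statement above.

```python
import math

def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def transpose(X):
    return [list(col) for col in zip(*X)]

def close(X, Y, tol=1e-12):
    """Entrywise comparison up to a small tolerance."""
    return all(abs(x - y) <= tol for rx, ry in zip(X, Y) for x, y in zip(rx, ry))

def rot_z(t):
    c, s = math.cos(t), math.sin(t)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def rot_x(t):
    c, s = math.cos(t), math.sin(t)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

# Orthogonal factors P, Q and the diagonal middle factor (rank r = 2).
P = rot_z(0.7)
Q = rot_x(1.1)
Sigma = [[3.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 0.0]]
Sigma_plus = [[1.0 / 3.0, 0.0, 0.0], [0.0, 0.5, 0.0], [0.0, 0.0, 0.0]]

A = matmul(matmul(P, Sigma), transpose(Q))       # A  = P Sigma Q^T
X = matmul(matmul(Q, Sigma_plus), transpose(P))  # A+ = Q Sigma+ P^T

AX = matmul(A, X)
XA = matmul(X, A)
penrose = {
    "AXA = A": close(matmul(AX, A), A),
    "XAX = X": close(matmul(XA, X), X),
    "(AX)^T = AX": close(transpose(AX), AX),
    "(XA)^T = XA": close(transpose(XA), XA),
}
print(penrose)
```

All four entries should come out true up to floating-point tolerance, which matches the claim that $A^+$ built this way is a Moore-Penrose inverse.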

If $X$ and $Y$ are both Moore-Penrose inverses of $A$, then from the chain of equalities $$\begin{aligned} X &= XAX = X(AX)^\mathrm{T} = XX^\mathrm{T}A^\mathrm{T} = XX^\mathrm{T}(AYA)^\mathrm{T} = X(AX)^\mathrm{T}(AY)^\mathrm{T} = (XAX)AY = XAY \\ &= (XA)^\mathrm{T}YAY = A^\mathrm{T}X^\mathrm{T}A^\mathrm{T}Y^\mathrm{T}Y = A^\mathrm{T}Y^\mathrm{T}Y = (YA)^\mathrm{T}Y = YAY = Y \end{aligned}$$
we can see that the Moore-Penrose inverse is unique.

However, the construction above builds the Moore-Penrose inverse from an SVD, and the SVD is not unique. How can these two facts be reconciled?

Best Answer

The non-uniqueness of the SVD can be characterized as follows. Suppose that $A$ is $n \times n$ with rank $r$, and that $A = P_0 \Sigma Q_0^T$ is one SVD of $A$. Moreover, suppose that the distinct singular values of $A$ are $s_1$ with multiplicity $k_1$, $s_2$ with multiplicity $k_2$, and so forth, with $s_m = 0$ having multiplicity $k_m = n - r$. That is, we have $$ \Lambda_r = \pmatrix{s_1 I_{k_1} \\ & \ddots \\ && s_{m-1} I_{k_{m-1}}}, \quad \Sigma = \pmatrix{\Lambda_r \\ & 0_{k_m}} $$ Then $A = P\Sigma Q^T$ will be a singular value decomposition of $A$ if and only if there exists an orthogonal matrix $U$ such that $P = P_0 U$, $Q = Q_0U$, and $U$ is a block-diagonal orthogonal matrix of the form $$ U = \pmatrix{U^{(1)}\\ & \ddots \\ && U^{(m)}} $$ where $U^{(j)}$ is orthogonal and of size $k_j \times k_j$. There is one caveat: the last block $U^{(m)}$, which belongs to the singular value $0$, may be chosen independently for $P$ and for $Q$. This extra freedom is harmless for the pseudoinverse, because that block only ever multiplies the zero block.


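With that in mind: if you'd like to prove that the pseudoinverse as constructed from the SVD is well-defined (that is, uniquely defined regardless of one's choice of SVD), then it suffices to show that for any choice of $U$ of the form prescribed above, we have $$ [Q_0U] \pmatrix{\Lambda_r^{-1} \\ & 0} [P_0U]^T = Q_0 \pmatrix{\Lambda_r^{-1} \\ & 0} P_0^T $$ It is straightforward (but in my opinion tedious) to show that this holds if we use the block structure of $\Lambda_r^{-1}$ and block-matrix multiplication. The key observation is that the left-hand side equals $Q_0 \left[ U \pmatrix{\Lambda_r^{-1} \\ & 0} U^T \right] P_0^T$, and each diagonal block $s_j^{-1} I_{k_j}$ is a scalar multiple of the identity, so $U^{(j)} \left(s_j^{-1} I_{k_j}\right) {U^{(j)}}^T = s_j^{-1} I_{k_j}$, while the zero block stays zero.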
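
To make this concrete, here is a small numerical sketch (an illustration under assumed values, not a proof). It takes $\Sigma = \operatorname{diag}(2, 2, 0)$, so the repeated singular value $2$ leaves real freedom in the factors, and picks a block-diagonal $U$ made of a $2 \times 2$ rotation and a $1 \times 1$ block $-1$. The helper functions, the angles, and the particular $P_0$, $Q_0$ are all hypothetical choices made for the example.

```python
import math

def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def transpose(X):
    return [list(col) for col in zip(*X)]

def close(X, Y, tol=1e-12):
    """Entrywise comparison up to a small tolerance."""
    return all(abs(x - y) <= tol for rx, ry in zip(X, Y) for x, y in zip(rx, ry))

def rot_z(t):
    c, s = math.cos(t), math.sin(t)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def rot_x(t):
    c, s = math.cos(t), math.sin(t)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

# One SVD of A: A = P0 Sigma Q0^T, with the singular value 2 repeated twice.
P0 = matmul(rot_z(0.7), rot_x(0.4))
Q0 = matmul(rot_x(1.1), rot_z(0.2))
Sigma = [[2.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 0.0]]
Sigma_plus = [[0.5, 0.0, 0.0], [0.0, 0.5, 0.0], [0.0, 0.0, 0.0]]

# Block-diagonal orthogonal U: a 2x2 rotation for s = 2, and -1 for s = 0.
c, s = math.cos(0.9), math.sin(0.9)
U = [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, -1.0]]

# A second, genuinely different pair of factors.
P = matmul(P0, U)
Q = matmul(Q0, U)

A0 = matmul(matmul(P0, Sigma), transpose(Q0))
A1 = matmul(matmul(P, Sigma), transpose(Q))
pinv0 = matmul(matmul(Q0, Sigma_plus), transpose(P0))
pinv1 = matmul(matmul(Q, Sigma_plus), transpose(P))

same_matrix = close(A0, A1)        # both factorizations give the same A
same_pinv = close(pinv0, pinv1)    # and the same pseudoinverse
different_factors = not close(P, P0)
print(same_matrix, same_pinv, different_factors)
```

The factors $P$ and $P_0$ differ, yet both factorizations reproduce the same $A$ and the same $A^+$: the rotation inside the repeated-singular-value block cancels against its own transpose.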
