Linear Algebra – Minimum Least Squares Solution Using Pseudo Inverse

least squares, linear algebra, matrix-calculus, optimization, svd

Given $A \in \mathbb{R}^{m \times n}$, $B \in \mathbb{R}^{k \times \ell}$, and $C\in \mathbb{R}^{m \times \ell}$, show that for $X \in \mathbb{R}^{n \times k}$

$$ {A}^{\dagger} C {B}^{\dagger} = \arg \min_{X} {\left\| C - A X B \right\|}_{F} $$

is the unique solution.

Note: This is an extension of the minimum $2$-norm least squares problem.

Hint: Use the SVDs of $A$ and $B$.

This is a homework problem, but I got stuck. After I take the SVD of both $A$ and $B$, I discard the zero terms to get a thin SVD. Then I can factor out some things, but I still don't see an explicit expression taking shape. Could somebody guide me in the right direction?

Edit:

1."F norm": Frobenius norm

2. $A^+$ or $A^\dagger$: conjugate transpose, Moore–Penrose pseudoinverse
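As a numerical sanity check (a rough sketch only; numpy's `pinv` and `lstsq` are the only routines used, and the sizes and seed are my own arbitrary choices), the claimed minimizer can be compared against a direct least-squares solve of the vectorized problem, using the identity $\operatorname{vec}(AXB) = (B^T \otimes A)\operatorname{vec}(X)$ for column-major vec:

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, k, l = 7, 4, 5, 6
A = rng.standard_normal((m, n))
B = rng.standard_normal((k, l))
C = rng.standard_normal((m, l))

# Candidate solution from the claim in the problem statement.
X = np.linalg.pinv(A) @ C @ np.linalg.pinv(B)

# With column-major vec, vec(A X B) = (B^T kron A) vec(X), so this is an
# ordinary least-squares problem in vec(X).
M = np.kron(B.T, A)
x_ls, *_ = np.linalg.lstsq(M, C.reshape(-1, order="F"), rcond=None)

# The residuals agree; the solutions themselves also coincide here because
# these random A and B happen to have full rank.
print(np.linalg.norm(C - A @ X @ B, "fro"))
print(np.linalg.norm(C.reshape(-1, order="F") - M @ x_ls))
print(np.allclose(X.reshape(-1, order="F"), x_ls))
```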

Best Answer

First, I am not exactly sure what you are asking. I am assuming that you know that the minimal norm solution to the least squares problem $\min_x {1 \over 2} \|Ax-y\|^2$ is given by $x=A^{\dagger} y$, and that the issue is to show that the solution to the problem $\min_X {1 \over 2} \| C-AXB\|_F^2$ is given by $A^{\dagger} C B^{\dagger}$.

Second, note that $A^{\dagger} C B^{\dagger}$ is not necessarily the unique solution; however, it is always a solution, and among all solutions it is the one of minimal norm. For example, if $A=0$ then every $X$ is a solution, so the solution is clearly not unique.
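One concrete way to see the minimal-norm property (a sketch; the rank-one $A$ below is my own example, not from the argument): perturb the candidate solution along the null space of $A$. The residual is unchanged, but the Frobenius norm of the perturbed solution grows.

```python
import numpy as np

rng = np.random.default_rng(1)
u = rng.standard_normal((4, 1))
v = rng.standard_normal((3, 1))
A = u @ v.T                      # rank-1, so the minimizer is not unique
B = rng.standard_normal((2, 5))  # full row rank
C = rng.standard_normal((4, 5))

X0 = np.linalg.pinv(A) @ C @ np.linalg.pinv(B)

# Perturb X0 along the null space of A: the residual is unchanged,
# but the Frobenius norm of the solution is strictly larger.
null_dir = np.linalg.svd(A)[2][-1]           # right singular vector with sigma = 0
X1 = X0 + np.outer(null_dir, np.ones(2))
print(np.isclose(np.linalg.norm(C - A @ X0 @ B, "fro"),
                 np.linalg.norm(C - A @ X1 @ B, "fro")))
print(np.linalg.norm(X0, "fro") < np.linalg.norm(X1, "fro"))
```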

One approach is to determine the SVD of the operator $X \mapsto AXB$ with respect to the basis $E_{ij} = e_i e_j^T$ and translate this into the form above.

Note that if we can write $A = \sum_k \sigma_k u_k v_k^T$ (or equivalently, $Ax = \sum_k \sigma_k u_k \langle v_k, x \rangle$ for all $x$), where $\sigma_k \ge 0$ and the $u_k$ and the $v_k$ are each orthonormal, then we can extract the SVD from this formulation in the usual way. Then $A^{\dagger} = \sum_{\sigma_k \neq 0} {1 \over \sigma_k } v_k u_k^T$ (or equivalently, $A^{\dagger} x = \sum_{\sigma_k \neq 0} {1 \over \sigma_k } v_k \langle u_k, x \rangle$ for all $x$).
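As a quick illustration of this rank-one expansion (a sketch assuming numpy; the rank-deficient $A$ and the `1e-12` threshold are arbitrary choices of mine), one can rebuild $A^{\dagger}$ term by term from the SVD and compare it with `np.linalg.pinv`:

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((5, 3)) @ np.diag([1.0, 1.0, 0.0])  # force a zero singular value

U, s, Vt = np.linalg.svd(A, full_matrices=False)
# Sum 1/sigma_k * v_k u_k^T over the nonzero singular values only.
A_pinv = sum((1.0 / sk) * np.outer(Vt[k], U[:, k])
             for k, sk in enumerate(s) if sk > 1e-12)
print(np.allclose(A_pinv, np.linalg.pinv(A)))
```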

So, we look for a similar expansion for the operator $L(X) = AXB$. \begin{eqnarray} AXB &=& (\sum_k \sigma_k(A) u_k(A)v_k(A)^T ) X (\sum_k \sigma_k(B) u_k(B)v_k(B)^T ) \\ &=& \sum_{k,l} \sigma_k(A) \sigma_l(B) u_k(A)v_k(A)^T X u_l(B)v_l(B)^T \\ &=& \sum_{k,l} \sigma_k(A) \sigma_l(B) u_k(A) v_l(B)^T v_k(A)^T X u_l(B) \\ &=& \sum_{k,l} \sigma_k(A) \sigma_l(B) u_k(A) v_l(B)^T \operatorname{tr} (v_k(A)^T X u_l(B)) \\ &=& \sum_{k,l} \sigma_k(A) \sigma_l(B) u_k(A) v_l(B)^T \operatorname{tr} ( u_l(B) v_k(A)^T X ) \\ &=& \sum_{k,l} \sigma_k(A) \sigma_l(B) u_k(A) v_l(B)^T \langle v_k(A) u_l(B)^T , X \rangle \\ \end{eqnarray} Hence $L$ has singular values $\sigma_k(A) \sigma_l(B) $ with left singular vectors $u_k(A) v_l(B)^T$ and right singular vectors $v_k(A) u_l(B)^T$. A quick check shows that the singular vectors are orthonormal.
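This claim can be checked numerically: in a column-major vec basis the matrix of $L$ is $B^T \otimes A$, and its singular values should be exactly the products $\sigma_k(A)\,\sigma_l(B)$. The sketch below assumes only numpy; the sizes and seed are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((4, 3))
B = rng.standard_normal((2, 5))

# Singular values of the operator L(X) = A X B, represented as B^T kron A.
sv_L = np.linalg.svd(np.kron(B.T, A), compute_uv=False)

# All pairwise products sigma_k(A) * sigma_l(B), sorted in descending order.
products = np.sort(np.outer(np.linalg.svd(A, compute_uv=False),
                            np.linalg.svd(B, compute_uv=False)).ravel())[::-1]
print(np.allclose(sv_L, products))
```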

Hence $L^{\dagger}$ has the representation \begin{eqnarray} L^{\dagger}(X) &=& \sum_{\sigma_k(A) \sigma_l(B) \neq 0} {1 \over \sigma_k(A) \sigma_l(B)} u_k(A) v_l(B)^T \langle v_k(A) u_l(B)^T , X \rangle \\ &=& \sum_{\sigma_k(A) \sigma_l(B) \neq 0} {1 \over \sigma_k(A) \sigma_l(B)} u_k(A) v_l(B)^T \operatorname{tr} ( u_l(B) v_k(A)^T X ) \\ &=& \sum_{\sigma_k(A) \sigma_l(B) \neq 0} {1 \over \sigma_k(A) \sigma_l(B)} u_k(A) v_l(B)^T v_k(A)^T X u_l(B) \\ &=& \sum_{\sigma_k(A) \sigma_l(B) \neq 0} {1 \over \sigma_k(A) \sigma_l(B)} u_k(A) v_k(A)^T X u_l(B) v_l(B)^T \\ &=& ( \sum_{\sigma_k(A) \neq 0} {1 \over \sigma_k(A)} u_k(A) v_k(A)^T ) X ( \sum_{\sigma_l(B) \neq 0} {1 \over \sigma_l(B)} u_l(B) v_l(B)^T ) \\ &=& A^{\dagger} X B^{\dagger} \end{eqnarray}

Hence the minimal norm solution to the least squares problem is given by $L^{\dagger}(C) = A^{\dagger} C B^{\dagger}$.
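Finally, a hedged end-to-end check of the conclusion (numpy only; $A$ is made rank-deficient on purpose so the pseudoinverses are not ordinary inverses): applying the pseudoinverse of the matrix of $L$ to $\operatorname{vec}(C)$ should reproduce $A^{\dagger} C B^{\dagger}$.

```python
import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((5, 2)) @ rng.standard_normal((2, 3))  # rank <= 2
B = rng.standard_normal((4, 6))
C = rng.standard_normal((5, 6))

L = np.kron(B.T, A)                        # matrix of X -> A X B (column-major vec)
lhs = (np.linalg.pinv(L) @ C.reshape(-1, order="F")).reshape(3, 4, order="F")
rhs = np.linalg.pinv(A) @ C @ np.linalg.pinv(B)
print(np.allclose(lhs, rhs))
```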
