Linear Least Squares Problem of a Specific Matrix Form

Tags: computational mathematics, least squares, linear algebra, matrix decomposition, matrix equations

Let $A_i\in\mathbb{R}^{k\times N}$, $i=0,1,\dots,M$, with $k>N$, and $b_i\in\mathbb{R}^k$, $i=1,2,\dots,M$, be given. Consider the following least-squares problem
$$
\begin{bmatrix}
A_0 & 0 & \cdots & 0 & A_1\\
0 & A_0 & \cdots & 0 & A_2\\
\vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & \cdots & A_0 & A_M\\
\end{bmatrix}
\begin{bmatrix}
x_1 \\ x_2 \\ \vdots \\ x_M \\ x
\end{bmatrix}
=\begin{bmatrix}
b_1 \\ b_2 \\ \vdots \\ b_M
\end{bmatrix}
$$

The goal is to find $x$ only; the other unknowns $x_i$ are not of interest. Assume $k$ is very large (say, $>10000$), while $N$ and $M$ are of moderate size (say, $\sim 100$). How can this problem be solved efficiently?
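To make the setup concrete, here is a small NumPy sketch that assembles the block matrix and solves the full problem directly. The sizes ($k=200$, $N=5$, $M=4$) and the random data are illustrative assumptions, far smaller than the regime in the question; the point is only the block structure and the fact that we extract just the trailing $x$ block.

```python
import numpy as np

rng = np.random.default_rng(0)
k, N, M = 200, 5, 4  # small illustrative sizes, not the large-k regime

A0 = rng.standard_normal((k, N))
A = [rng.standard_normal((k, N)) for _ in range(M)]   # A_1, ..., A_M
b = [rng.standard_normal(k) for _ in range(M)]        # b_1, ..., b_M

# Assemble the (M*k) x ((M+1)*N) block matrix:
# A0 on the block diagonal, A_i in the last block column.
big = np.zeros((M * k, (M + 1) * N))
for i in range(M):
    big[i * k:(i + 1) * k, i * N:(i + 1) * N] = A0
    big[i * k:(i + 1) * k, M * N:] = A[i]
rhs = np.concatenate(b)

sol, *_ = np.linalg.lstsq(big, rhs, rcond=None)
x_direct = sol[M * N:]   # only the trailing x block is of interest
```

For large $k$ this direct dense solve is exactly what one wants to avoid, which motivates the reduction below.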

My question concerns the validity of the following solution procedure.
First we compute the reduced (thin) QR decomposition of each smaller problem
$$ \begin{bmatrix}A_0 & A_i\end{bmatrix}
\begin{bmatrix}x_i\\ x\end{bmatrix}
=Q_iR_i\begin{bmatrix}x_i\\ x\end{bmatrix}
=\begin{bmatrix}\tilde{Q} & \tilde{Q}_i\end{bmatrix}
\begin{bmatrix} R^{11} & R^{12}_i \\ 0 & R^{22}_i \end{bmatrix}
\begin{bmatrix}x_i\\ x\end{bmatrix}
=\begin{bmatrix}b_i\end{bmatrix}
$$

where $\tilde{Q}$ and $\tilde{Q}_i \in \mathbb{R}^{k\times N}$, and the $R$ blocks are $N\times N$. Note that $\tilde{Q}$ and $R^{11}$ arise from the QR decomposition of $A_0$ and are therefore the same for every $i$. Then, since $Q_i$ has orthonormal columns, multiplying both sides by $Q^T_i$ gives
$$
\begin{bmatrix} R^{11} & R^{12}_i \\ 0 & R^{22}_i \end{bmatrix}
\begin{bmatrix}x_i\\ x\end{bmatrix}
=\begin{bmatrix}\tilde{Q}^T b_i\\ \tilde{Q}_i^T b_i \end{bmatrix}
$$

Since the $x_i$ are irrelevant, we keep only the lower part of the above equation for each $i$. This yields the least-squares problem
$$
\begin{bmatrix} R^{22}_1 \\ R^{22}_2 \\ \vdots \\ R^{22}_M \end{bmatrix}x
=\begin{bmatrix} \tilde{Q}_1^T b_1 \\ \tilde{Q}_2^T b_2 \\ \vdots \\ \tilde{Q}_M^T b_M \end{bmatrix}
$$

which is considerably smaller than the original problem; in particular, its size is independent of $k$.

I have tested this method in MATLAB and found that the resulting least-squares solution $x$ is identical to that of the original problem. However, I cannot prove theoretically that this method yields the correct least-squares solution for $x$. Can anyone help prove this?

Best Answer

The solutions are identical. I'm assuming $A_0^T A_0$ is invertible. We can solve the least-squares problem directly to check the equivalence of the two methods. Multiplying the equation by $A^T$ from the left yields
$$\begin{bmatrix}A_0^TA_0 & & & A_0^TA_1\\ &\ddots&&\vdots \\ &&A_0^TA_0&A_0^TA_M \\A_1^TA_0&\dots &A_M^TA_0&\sum A_i^TA_i\end{bmatrix}\begin{bmatrix}x_1 \\ \vdots \\ x_M \\ x\end{bmatrix}=\begin{bmatrix}A_0^Tb_1 \\ \vdots \\ A_0^Tb_M \\ \sum A_i^Tb_i\end{bmatrix}$$
The first $M$ block rows give $x_i=(A_0^TA_0)^{-1}A_0^T(b_i-A_ix)$; substituting this into the last block row (equivalently, using block matrix inversion and keeping only the last row block) yields the solution
$$x=\left(\sum A_i^TP_0A_i\right)^{-1}\sum A_i^TP_0b_i$$
where $P_0=I-A_0(A_0^TA_0)^{-1}A_0^T$ is the orthogonal projector onto the complement of the range of $A_0$. Since $A_0=\tilde{Q}R^{11}$,
$$P_0=I-\tilde{Q}R^{11}((R^{11})^TR^{11})^{-1}(R^{11})^T\tilde{Q}^T=I-\tilde{Q}\tilde{Q}^T$$
where the last equality uses the fact that $R^{11}$ is square and invertible. Also,
$$P_0A_i=(I-\tilde{Q}\tilde{Q}^T)(\tilde{Q}R^{12}_i +\tilde{Q}_i R^{22}_i)=\tilde{Q}_i R^{22}_i.$$
Since $P_0$ is symmetric and idempotent and $\tilde{Q}_i$ has orthonormal columns, this gives $A_i^TP_0A_i=(P_0A_i)^T(P_0A_i)=(R^{22}_i)^TR^{22}_i$ and $A_i^TP_0b_i=(P_0A_i)^Tb_i=(R^{22}_i)^T\tilde{Q}_i^Tb_i$, which are exactly the normal equations of the reduced problem. Hence the result follows.
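The closed-form expression above can be checked numerically. The sketch below (assuming small random sizes, with NumPy) forms $P_0$, evaluates $x=\left(\sum A_i^TP_0A_i\right)^{-1}\sum A_i^TP_0b_i$, and compares it against the direct least-squares solve of the full block system. Forming $P_0$ as a dense $k\times k$ matrix is for verification only and would be avoided at large $k$.

```python
import numpy as np

rng = np.random.default_rng(2)
k, N, M = 150, 4, 3  # illustrative sizes
A0 = rng.standard_normal((k, N))
A = [rng.standard_normal((k, N)) for _ in range(M)]
b = [rng.standard_normal(k) for _ in range(M)]

# Orthogonal projector onto the complement of range(A0).
P0 = np.eye(k) - A0 @ np.linalg.solve(A0.T @ A0, A0.T)

# Closed-form solution from the answer.
lhs = sum(Ai.T @ P0 @ Ai for Ai in A)
rhs = sum(Ai.T @ P0 @ bi for Ai, bi in zip(A, b))
x_formula = np.linalg.solve(lhs, rhs)

# Direct least-squares solve of the full block system for comparison.
big = np.zeros((M * k, (M + 1) * N))
for i in range(M):
    big[i * k:(i + 1) * k, i * N:(i + 1) * N] = A0
    big[i * k:(i + 1) * k, M * N:] = A[i]
x_full = np.linalg.lstsq(big, np.concatenate(b), rcond=None)[0][M * N:]
```

The two results agree to rounding error, as the derivation predicts.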
