Connect the normal equation to the Gram-Schmidt method of finding an orthogonal vector

gram-schmidt, linear-algebra, orthogonality, projection

From this lecture, given a matrix $Q$ whose columns are orthogonal to one another, and a vector $b$ that we want to project onto the column space of $Q$, the projection matrix is $QQ^T$ (from the normal equations for $Qx = b$: the projection is $Q\hat{x} = Q(Q^TQ)^{-1}Q^Tb = QQ^Tb$), so the vector orthogonal to the column space of $Q$ is $b - QQ^Tb$.
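To make this concrete, here is a quick numerical check in numpy (a made-up example, with orthonormal columns in $Q$):

```python
import numpy as np

# Hypothetical example: Q with orthonormal columns, b an arbitrary vector.
Q = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [0.0, 0.0]])
b = np.array([1.0, 2.0, 3.0])

p = Q @ Q.T @ b   # projection of b onto the column space of Q
r = b - p         # residual, which should be orthogonal to col(Q)

print(Q.T @ r)    # ~ [0, 0]: r is orthogonal to every column of Q
```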

On page 3 of these notes, given some $v$ and an orthogonal basis $v_1, v_2, \ldots, v_n$ for some vector space $W$, the author uses Gram-Schmidt to find that $v - \sum_{j=1}^n \frac{\langle v, v_j\rangle}{\|v_j\|^2}\,v_j$ is orthogonal to $W$.
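Here is a small numpy sketch of that formula (made-up orthogonal, non-unit basis vectors for a plane $W$ in $\mathbb{R}^3$):

```python
import numpy as np

# Hypothetical orthogonal (not unit-length) basis for a plane W in R^3.
v1 = np.array([1.0, 1.0, 0.0])
v2 = np.array([1.0, -1.0, 0.0])
v = np.array([2.0, 0.0, 5.0])

# Gram-Schmidt residual: v minus its projection onto each basis vector.
r = v - sum((v @ vj) / (vj @ vj) * vj for vj in (v1, v2))

print(r @ v1, r @ v2)   # both ~ 0: r is orthogonal to W
```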

If I define the columns of $Q$ to be the orthogonal basis, $Q = \begin{bmatrix}v_1 & v_2 & \cdots & v_n\end{bmatrix}$, do the two methods give the same result? If so, is there a way to use the first method to arrive at the formula from the second?

Best Answer

You might be having trouble showing the equivalence because you need additional assumptions for the projection matrix to be $QQ^T$ in the first place: it’s actually $Q(Q^TQ)^{-1}Q^T$. With the additional assumption that the columns of $Q$ are unit vectors, this reduces to $QQ^T$. If we remove this restriction, however, we’re no longer guaranteed that $Q^TQ$ is invertible: we need to add the condition that the columns of $Q$ are linearly independent. (Pairwise orthogonality of the columns isn’t enough since the zero vector is orthogonal to everything.) Equivalently, $Q$ must have full column rank.

Setting $Q=\begin{bmatrix}v_1&v_2&\cdots&v_n\end{bmatrix}$, we have by orthogonality of the columns $$Q^TQ = \begin{bmatrix}\langle v_1,v_1\rangle&0&\cdots&0\\0&\langle v_2,v_2\rangle&\cdots&0\\\vdots&\vdots&\ddots&\vdots\\0&0&\cdots&\langle v_n,v_n\rangle\end{bmatrix}$$ with all of the diagonal entries nonzero. We can then expand directly: $$Q(Q^TQ)^{-1}Q^Tb = Q(Q^TQ)^{-1}\begin{bmatrix}\langle v_1,b\rangle \\ \langle v_2,b\rangle \\ \vdots \\ \langle v_n,b\rangle\end{bmatrix} = Q\begin{bmatrix}{\langle v_1,b\rangle \over \langle v_1,v_1\rangle} \\ {\langle v_2,b\rangle \over \langle v_2,v_2\rangle} \\ \vdots \\ {\langle v_n,b\rangle \over \langle v_n,v_n\rangle} \end{bmatrix} = \sum_{j=1}^n {\langle v_j,b\rangle \over \langle v_j,v_j\rangle} v_j.$$
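If it helps, here is a quick numerical confirmation of this identity in numpy (a random example with orthogonal but non-unit columns):

```python
import numpy as np

rng = np.random.default_rng(0)

# Build Q with orthogonal, non-unit columns: take an orthonormal basis
# via QR, then rescale the columns so Q^T Q is diagonal but not identity.
A = rng.standard_normal((5, 3))
U, _ = np.linalg.qr(A)
Q = U * np.array([2.0, 0.5, 3.0])
b = rng.standard_normal(5)

# Left-hand side: projection via the normal equations.
lhs = Q @ np.linalg.solve(Q.T @ Q, Q.T @ b)

# Right-hand side: the Gram-Schmidt-style sum over the columns of Q.
rhs = sum((vj @ b) / (vj @ vj) * vj for vj in Q.T)

print(np.allclose(lhs, rhs))  # True
```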
