The column space of $A$ is $\operatorname{span}\left(\begin{pmatrix} 1 \\ -1 \\ 1 \end{pmatrix}, \begin{pmatrix} 2 \\ 4 \\ 2 \end{pmatrix}\right)$.
Those two vectors are a basis for $\operatorname{col}(A)$, but they are not normalized.
NOTE: In this case the columns of $A$ are already orthogonal, so the Gram-Schmidt process isn't strictly needed; since in general they won't be, I'll explain it anyway.
To make them orthogonal, we use the Gram-Schmidt process:
$w_1 = \begin{pmatrix} 1 \\ -1 \\ 1 \end{pmatrix}$ and $w_2 = \begin{pmatrix} 2 \\ 4 \\ 2 \end{pmatrix} - \operatorname{proj}_{w_1} \begin{pmatrix} 2 \\ 4 \\ 2 \end{pmatrix}$, where $\operatorname{proj}_{w_1} \begin{pmatrix} 2 \\ 4 \\ 2 \end{pmatrix}$ is the orthogonal projection of $\begin{pmatrix} 2 \\ 4 \\ 2 \end{pmatrix}$ onto the subspace $\operatorname{span}(w_1)$.
In general, $\operatorname{proj}_vu = \dfrac {u \cdot v}{v\cdot v}v$.
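With the columns above, this projection is zero because they are already orthogonal:
$$\operatorname{proj}_{w_1} \begin{pmatrix} 2 \\ 4 \\ 2 \end{pmatrix} = \frac{(2)(1)+(4)(-1)+(2)(1)}{1+1+1}\,w_1 = \frac{0}{3}\,w_1 = \mathbf 0, \qquad\text{so}\qquad w_2 = \begin{pmatrix} 2 \\ 4 \\ 2 \end{pmatrix}.$$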
Then to normalize a vector, you divide it by its norm:
$u_1 = \dfrac {w_1}{\|w_1\|}$ and $u_2 = \dfrac{w_2}{\|w_2\|}$.
The norm of a vector $v$, denoted $\|v\|$, is given by $\|v\|=\sqrt{v\cdot v}$.
This is how $u_1$ and $u_2$ were obtained from the columns of $A$.
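Carrying this out for the columns above (up to an overall sign choice): $\|w_1\| = \sqrt{1+1+1} = \sqrt3$ and $\|w_2\| = \sqrt{4+16+4} = 2\sqrt6$, so
$$u_1 = \frac{1}{\sqrt3}\begin{pmatrix} 1 \\ -1 \\ 1 \end{pmatrix} \qquad\text{and}\qquad u_2 = \frac{1}{2\sqrt6}\begin{pmatrix} 2 \\ 4 \\ 2 \end{pmatrix} = \frac{1}{\sqrt6}\begin{pmatrix} 1 \\ 2 \\ 1 \end{pmatrix}.$$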
Then the orthogonal projection of $b$ onto the subspace $\operatorname{col}(A)$ is given by $\operatorname{proj}_{\operatorname{col}(A)}b = \operatorname{proj}_{u_1}b + \operatorname{proj}_{u_2}b$.
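Since $u_1$ and $u_2$ are unit vectors, each projection reduces to a dot product, so
$$\operatorname{proj}_{\operatorname{col}(A)}b = (b\cdot u_1)\,u_1 + (b\cdot u_2)\,u_2 = \left(u_1u_1^T + u_2u_2^T\right)b,$$
which is the matrix $UU^T$ (with $U = [\,u_1\ u_2\,]$) appearing in the next answer.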
The theorem you have quoted is true but only tells part of the story. An improved version is as follows.
Let $U$ be a real $m\times n$ matrix with orthonormal columns, that is, its columns form an orthonormal basis of some subspace $W$ of ${\Bbb R}^m$. Then $UU^T$ is the matrix of the projection of ${\Bbb R}^m$ onto $W$.
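One way to see this: write $U = [\,u_1\ \cdots\ u_n\,]$. For any ${\bf x}\in{\Bbb R}^m$, the vector $U^T{\bf x}$ has entries $u_i\cdot{\bf x}$, so
$$UU^T{\bf x} = \sum_{i=1}^n (u_i\cdot{\bf x})\,u_i,$$
which is precisely the sum of the projections of ${\bf x}$ onto the orthonormal basis vectors of $W$.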
Comments
- The restriction to real matrices is not actually necessary: the same result holds in any inner product space, as long as you know what "orthonormal" means there. Over the complex numbers, for example, $U^T$ should be replaced by the conjugate transpose, so the projection matrix becomes $UU^*$.
- A matrix with orthonormal columns is an orthogonal matrix if it is square. I think this is the situation you are envisaging in your question. But in that case the result is trivial: $W$ is all of ${\Bbb R}^m$, $UU^T=I$, and the projection is simply $P({\bf x})={\bf x}$.
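If a concrete check is helpful, here is a small NumPy sketch (purely illustrative) verifying the non-square case with the $u_1,u_2$ computed in the first part: $P=UU^T$ is symmetric, idempotent, fixes vectors in $W$, and annihilates vectors orthogonal to $W$.

```python
import numpy as np

# Orthonormal basis of col(A) from the first part.
u1 = np.array([1.0, -1.0, 1.0]) / np.sqrt(3.0)
u2 = np.array([1.0, 2.0, 1.0]) / np.sqrt(6.0)
U = np.column_stack([u1, u2])   # 3x2, orthonormal columns

P = U @ U.T                     # claimed projection matrix onto W = col(A)

assert np.allclose(P, P.T)      # symmetric
assert np.allclose(P @ P, P)    # idempotent
assert np.allclose(P @ u1, u1)  # fixes W ...
assert np.allclose(P @ u2, u2)
n = np.cross(u1, u2)            # a vector orthogonal to W
assert np.allclose(P @ n, 0.0)  # ... and kills the orthogonal complement
```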
Best Answer
1. Not using the Gram-Schmidt process:
Let $g(x)$ be the orthogonal projection of $f(x)=x+1$ onto $W$. Since $g \in W$, we have $g(x) = c_1x+c_2e^x$ for some $c_1,c_2 \in\mathbb{R}$. Then $f(x)-g(x)$ is orthogonal to $W$, so in particular $f(x)-g(x)$ is orthogonal to $x$ and to $e^x$. In other words, $$ \left\{ \begin{array}{l} \int\limits_{-2}^2(x+1 - c_1x - c_2e^x)x\,{\text d}x = 0,\\ \int\limits_{-2}^2(x+1 - c_1x - c_2e^x)e^x\,{\text d}x = 0. \end{array} \right. $$
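Expanding these and using $\int_{-2}^2 x\,{\text d}x = 0$, $\int_{-2}^2 x^2\,{\text d}x = \tfrac{16}{3}$, $\int_{-2}^2 xe^x\,{\text d}x = e^2+3e^{-2}$, $\int_{-2}^2 e^x\,{\text d}x = e^2-e^{-2}$ and $\int_{-2}^2 e^{2x}\,{\text d}x = \tfrac{e^4-e^{-4}}{2}$ turns this into a $2\times2$ linear system: $$ \left\{ \begin{array}{l} \tfrac{16}{3}\,c_1 + \left(e^2+3e^{-2}\right)c_2 = \tfrac{16}{3},\\ \left(e^2+3e^{-2}\right)c_1 + \tfrac{e^4-e^{-4}}{2}\,c_2 = 2e^2+2e^{-2}, \end{array} \right. $$ which can be solved for $c_1$ and $c_2$ directly.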
2. Using the Gram-Schmidt process:
1) Using the Gram-Schmidt process, find an orthonormal basis of $\text{span}\{x, e^x\}$.
2) Suppose $g_1(x),g_2(x)$ are the new basis vectors (orthonormal, i.e. orthogonal and of norm $1$). Then the projection is $g(x)=c_1g_1(x)+c_2g_2(x)$, where
$$c_1 = \int\limits_{-2}^2 f(x)g_1(x){\text d}x \qquad\text{and}\qquad c_2 = \int\limits_{-2}^2 f(x)g_2(x){\text d}x$$
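These formulas come from the same orthogonality condition as in part 1: requiring $\langle f-g, g_i\rangle = 0$ and using $\langle g_i, g_j\rangle = \delta_{ij}$ leaves exactly $c_i = \langle f, g_i\rangle$.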
Gram-Schmidt process:
a) $g_1(x) = x$.
b) $g_2(x) = k g_1(x) + e^x$. Now, since $\langle g_1, g_2 \rangle$ should be $0$, we can find $$k=-\frac{\langle e^x, g_1 \rangle}{\langle g_1, g_1 \rangle}=-\frac{\int_{-2}^2 xe^x \text{d}x}{\int_{-2}^2 x^2 \text{d}x}.$$
And don't forget to normalize $g_1$ and $g_2$.
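With the integrals above, $k = -\dfrac{e^2+3e^{-2}}{16/3} = -\dfrac{3\left(e^2+3e^{-2}\right)}{16}$ and $\|g_1\| = \sqrt{\int_{-2}^2 x^2\,{\text d}x} = \sqrt{16/3}$; the norm of $g_2$ is computed the same way. As a numerical sanity check (the helper `ip` and the lambdas below are just illustrative names), here is a short Python/SciPy sketch that runs both approaches and confirms they produce the same projection:

```python
import numpy as np
from scipy.integrate import quad

# Inner product on [-2, 2]:  <p, q> = integral of p(x) q(x) dx
def ip(p, q):
    return quad(lambda x: p(x) * q(x), -2, 2)[0]

f  = lambda x: x + 1
b1 = lambda x: x                 # first basis function of W
b2 = lambda x: np.exp(x)         # second basis function of W

# Approach 1: normal equations  <f - g, b_i> = 0  with  g = c1*b1 + c2*b2
G   = np.array([[ip(b1, b1), ip(b2, b1)],
                [ip(b1, b2), ip(b2, b2)]])
rhs = np.array([ip(f, b1), ip(f, b2)])
c1, c2 = np.linalg.solve(G, rhs)

# Approach 2: Gram-Schmidt as above, then coefficients <f, g_i>
k   = -ip(b2, b1) / ip(b1, b1)
g2  = lambda x: k * b1(x) + b2(x)
g1n = lambda x: b1(x) / np.sqrt(ip(b1, b1))
g2n = lambda x: g2(x) / np.sqrt(ip(g2, g2))
d1, d2 = ip(f, g1n), ip(f, g2n)

# Both should give the same function (up to quadrature error)
xs = np.linspace(-2, 2, 7)
assert np.allclose(c1 * b1(xs) + c2 * b2(xs), d1 * g1n(xs) + d2 * g2n(xs))
```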