The image of $A$ is $U$, so any vector $y\in U$ may be written as $y=Ax$ with $x\in {\Bbb R}^k$. The orthogonal projection of $c$ onto $U$ is the vector $y=Ax\in U$ that minimizes the distance to $c$, or equivalently the one for which $c-y$ is orthogonal to the image of $A$. This last condition may be written as:
$$ A^T (c - y) = A^T(c - Ax)= 0 \ \ \Leftrightarrow \ \ x=(A^T A)^{-1} A^T c$$ from which you deduce
$$ y= A x=A(A^T A)^{-1} A^T c$$
It remains to check that $A^TA$ is invertible. Suppose that there is $x\in {\Bbb R}^k$ such that $A^TAx=0$. Then $|Ax|^2 = x^T A^T A x=0$, so we must have $Ax=0$; but the columns of $A$ are linearly independent, so $x=0$. The kernel of $A^TA$ is thus trivial and the (square) matrix is invertible.
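To make the formula concrete, here is a minimal sketch that solves the normal equations $A^TAx = A^Tc$ and checks that the residual $c-y$ is orthogonal to the columns of $A$. The matrix `A`, the vector `c`, and the helper names `transpose`, `matmul`, `solve`, and `project` are all assumptions chosen for illustration; exact fractions are used only to avoid rounding noise.

```python
from fractions import Fraction

def transpose(M):
    return [list(col) for col in zip(*M)]

def matmul(X, Y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

def solve(M, b):
    """Solve M x = b by Gauss-Jordan elimination with exact fractions.

    Assumes M is square and invertible (true for A^T A when the columns
    of A are linearly independent); a singular M raises StopIteration.
    """
    n = len(M)
    aug = [[Fraction(v) for v in row] + [Fraction(bi)] for row, bi in zip(M, b)]
    for col in range(n):
        # Find a row with a nonzero entry in this column and swap it up.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        # Scale the pivot row so the pivot entry becomes 1.
        aug[col] = [v / aug[col][col] for v in aug[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n] for row in aug]

def project(A, c):
    """Return (x, y) with x = (A^T A)^{-1} A^T c and y = A x."""
    At = transpose(A)
    AtA = matmul(At, A)
    Atc = [sum(a * ci for a, ci in zip(row, c)) for row in At]
    x = solve(AtA, Atc)
    y = [sum(a * xi for a, xi in zip(row, x)) for row in A]
    return x, y

# Hypothetical example: a 3x2 matrix with independent columns.
A = [[1, 0], [1, 1], [1, 2]]
c = [6, 0, 0]

x, y = project(A, c)
residual = [ci - yi for ci, yi in zip(c, y)]
# A^T (c - y) should be the zero vector.
orthogonality = [sum(a * r for a, r in zip(col, residual)) for col in transpose(A)]
print(x, y, orthogonality)
```

Solving the linear system rather than forming $(A^TA)^{-1}$ explicitly is a design choice of this sketch; both give the same $x$ when the columns of $A$ are independent.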
Yes, that is true in general.
First, note that by definition the left nullspace of $A$ is the orthogonal complement of its column space (which, by the way, is unique, and so we say "the column space of $A$" rather than "a column space"), because $A^T x = 0$ if and only if $x$ is orthogonal to every column of $A$.
Therefore, if $P$ is an orthogonal projector onto the column space of $A$, then $I - P$ is a projector onto its orthogonal complement, i.e., the nullspace of $A^T$. To see this, first note that, by definition, $Px = x$ for all $x$ in the column space of $A$. Thus,
$(I - P)x = x - P x = x - x = 0$.
On the other hand, if $y$ is in the left nullspace of $A$, then $P y = 0$, and so
$(I - P)y = y - Py = y - 0 = y$.
Edit: also, if $P$ is an orthogonal projector, it is self-adjoint, and so is $I-P$, because the sum or difference of two self-adjoint linear operators is also self-adjoint. Hence, in that case, $I-P$ is also an orthogonal projector.
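As a concrete check, here is a small worked example. The matrix $A$ below is a hypothetical choice for illustration, not one taken from the question. For
$$A = \begin{pmatrix} 1 & 0 \\ 1 & 1 \\ 1 & 2 \end{pmatrix}$$
the orthogonal projector onto the column space is
$$P = A(A^TA)^{-1}A^T = \frac{1}{6}\begin{pmatrix} 5 & 2 & -1 \\ 2 & 2 & 2 \\ -1 & 2 & 5 \end{pmatrix},$$
and therefore
$$I - P = \frac{1}{6}\begin{pmatrix} 1 & -2 & 1 \\ -2 & 4 & -2 \\ 1 & -2 & 1 \end{pmatrix} = \frac{v v^T}{v^T v}, \qquad v = \begin{pmatrix} 1 \\ -2 \\ 1 \end{pmatrix}.$$
Since $A^T v = 0$ and the left nullspace of this $A$ is one-dimensional, $v$ spans it, and $I-P$ is visibly the (symmetric) orthogonal projector onto that line.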
Best Answer
First note that $P = A(A^TA)^{-1}A^T$ maps the column space $R(A)$ identically to itself. Indeed, every vector in $R(A)$ has the form $Ax$ for some $x$ in the domain of $A$, and we have $\require{extpfeil}\Newextarrow{\xmapsto}{5,5}{0x27FC}$
$$Ax \,\xmapsto{A^T} A^TAx \,\xmapsto{(A^TA)^{-1}} x \,\xmapsto{A} Ax.$$
On the other hand, for a vector $y \in R(A)^\perp$ recall that $R(A)^\perp = N(A^T)$ so $$y\,\xmapsto{A^T} 0 \,\xmapsto{A(A^TA)^{-1}} 0.$$ Finally, the domain can be decomposed as $R(A) \oplus R(A)^\perp$ so with respect to this decomposition we have $P = I \oplus 0$, which means precisely that $P$ is an orthogonal projection onto $R(A)$.
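The decomposition $P = I \oplus 0$ can also be checked numerically. The following is only a sketch under assumptions: the matrix `A`, the test vectors, and the helpers `matmul`, `transpose`, `inverse_2x2`, and `apply` are made up for illustration, and the explicit $2\times 2$ inverse works only because this `A` has two columns.

```python
from fractions import Fraction

# Hypothetical example: a 3x2 matrix with independent columns.
A = [[1, 0], [1, 1], [1, 2]]

def transpose(M):
    return [list(col) for col in zip(*M)]

def matmul(X, Y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

def inverse_2x2(M):
    """Exact inverse of an invertible 2x2 matrix via the adjugate formula."""
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]

def apply(M, v):
    """Matrix-vector product M v."""
    return [sum(m * vi for m, vi in zip(row, v)) for row in M]

At = transpose(A)
# P = A (A^T A)^{-1} A^T
P = matmul(matmul(A, inverse_2x2(matmul(At, A))), At)

in_range = apply(A, [2, -1])  # a vector of the form A x, so it lies in R(A)
in_perp = [1, -2, 1]          # satisfies A^T y = 0, so it lies in R(A)^perp

print(apply(P, in_range))  # P acts as the identity on R(A)
print(apply(P, in_perp))   # P acts as zero on R(A)^perp
```

The same `P` is also idempotent and symmetric, which is the matrix form of being an orthogonal projection.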