Trying to understand projection onto a plane, subspace of R^3, of a point b when there is no solution for Ax = b

linear algebralinear-transformationsprojection

I was following along some online classes for linear algebra and was doing fine until it came to projections onto subspaces and I really really could not understand the proof and logic behind the given formula to complete the projection.

So I am aware that if my problem was practical and I just wanted to solve the projection I could use the formula which is easy to implement and compute, but the problem is I want to understand it and I can't. So my approach is I decided I would attempt it from scratch and forget about the formula so that I could try to come up with a solution for doing the projection myself. I limited myself to R^3 projecting onto a plane subspace because that is most easy for me to picture in terms of geometry and I suspect generalizing to higher dimensions would be much much harder. So here is what I did:

We start with any two independent vectors spanning a plane in R^3. We turn them into unit vectors (normalize).

$$ V_1 = \frac{V_1}{|V_1|}; V_2 = \frac{V_2}{|V_2|} \in \Bbb R^3$$

And the problem is we have a vector b which is outside of the plane spanned by V1 and V2. We want to find the best projection of b onto the plane spanned by V1 and V2, which we call p, such that the difference |b-p| is minimized.

We take the cross product to find a third vector going "upwards" (from the plane)

$$ V_3 = \frac{V_1 \times V_2}{|V_1 \times V_2|}$$

Then we take the cross product between V1 and V3 to find a "better V2" to complete an orthonormal basis in R^3

$$V_2^* = \frac{V_1 \times V_3}{|V_1 \times V_3|}$$

We dump the vectors into a transformation matrix A

$$ A = \begin{bmatrix} V_1 & V_2^* & V_3 \end{bmatrix}$$

Now I can transform the vector b by the inverse of A such that it is "tilted" into a distorted space where I can "pretend" that V1 and V2 span a plane perfectly aligned with the standard vector axis (1, 0, 0) and (0, 1, 0), and the up vector is actually up (0, 0, 1).

$$ b^* = A^{-1} \cdot b$$

And then, by ignoring the "extra" component of b* I can transform the vector "back" to the original coordinates to find the projection. But the problem is I don't know how to generate this "half-assed identity matrix" such that it discards exactly the same component I made up with the cross product pointing away from the plane. What I mean is this:

$$\begin{bmatrix}1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix} \cdot b^*$$

Because by ignoring the made-up component I am basically projecting the point onto the plane as if the plane was perfectly aligned with the other two axes and there was no "distortion" on the space, so that the projection of b onto the plane is simply found by ignoring the extra component which brings it outside of the plane. (I truly don't know if this makes sense I am just starting with linear algebra, please let me know if I am talking non-sense). And then you transform the projection "back to the original coordinate system" by multiplying it with A (since A is the matrix which brings the standard axis vectors (1, 0, 0) and (0, 1, 0) back to the basis spanning $c_1*V_1 + c_2*V_2^*$).

The complete solution I came up with to workout the projection p $ \in \Bbb R^2 $ of b $ \in \Bbb R^3 $ is:

$$ p = A \cdot \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix} \cdot (A^{-1} \cdot b)$$

I have no idea how to manipulate these variables in a way which I could generate that "almost identity" matrix on the fly without needing to hardcode it in. And I also have no idea if any of what I said above makes sense or if it is all idiotic, I just know I plugged this into a python code and compared it with the results from WolframAlpha (there is the command "project vector b onto {v1, v2}") and the results always match 100%.

Edit: I just found some cases when my results don't match. They match most of the time I haven't understood why some don't. But I just noticed that if b is a vector (0, 0, anything) it doesn't work.

Best Answer

Your approach is a very nice start. I would summarize the steps you are taking as follows:

  1. Write everything in a basis where your plane $p$ is the x-y plane in $\mathbb{R}^3$
  2. Use a projection map to map $b$ to the x-y plane.
  3. Do the inverse change of basis to get back to the original basis.

This method works, but as you noted it requires you to 'hard code' the projection map to the x-y plane. So how could we refine your technique to get a co-ordinate independent formula? Well suppose we are working in the co-ordinates where $p$ is the x-y plane. Then the vertical component of $b$ is given by the dot product with $(0,0,1)$: $(b \cdot e_3)e_3$ (where $e_3 = (0,0,1)$) right? So if we want to get rid of the vertical component we can define $b' = b - (b\cdot e_3)e_3.$ This formula gives you the result of the "half-assed identity", which could more properly be called the "projection to the x-y plane".

Now in the general case, instead of transforming to new co-ordinates and taking the dot product with $e_3$, we can simply take the dot product with our normal vector (the one going 'upwards' from the plane) which you called $V_3$. So the projection formula would be $b' = b - (b\cdot V_3) V_3$.