You have the right idea, but you started to go down a more difficult path than necessary. You are correct in interpreting the hyperplane equation as the set of all points (well, position vectors of those points) that have the same projection onto $\hat w$. So, start with this projection, as you did: $$\operatorname{Proj}_{\hat w}{\hat x}={\hat w\cdot\hat x\over\|\hat w\|^2}\hat w$$ and rearrange this slightly into $$\left({\hat w\cdot\hat x\over\|\hat w\|}\right){\hat w\over\|\hat w\|}.$$ The right-hand factor is simply a unit vector in the direction of $\hat w$ that I’ll call $\hat u$. Since $\|\hat w\|$ is a scalar, we can absorb it into the dot product in the numerator and write this expression more simply as $$(\hat u\cdot\hat x)\,\hat u.$$ The parenthesized quantity is a scalar whose absolute value is the length of the projection. Its sign tells you whether the projection points in the same direction as $\hat u$, and hence $\hat w$ (positive), or in the opposite direction (negative), so you could think of it as a signed length.
Observe that this parenthesized factor is exactly the first term in your hyperplane equation, so what that equation says is that for every point on the hyperplane, the signed length of the projection of its position vector onto $\hat w$ is equal to the fixed value $d$, i.e., its projection onto $\hat w$ is $d\hat u$. You can also see this by multiplying both sides of the equation by $\hat u$ and then rearranging and simplifying a bit: $$\begin{align}\left({\hat w\cdot\hat x\over\|\hat w\|}\right)\hat u-d\hat u&=0 \\ (\hat u\cdot\hat x)\,\hat u-d\hat u&=0 \\ (\hat u\cdot\hat x)\,\hat u&=d\hat u.\end{align}$$ As for the geometric interpretation of the vector $\hat w$, observe first that $d\hat u$ itself satisfies the equation: $$(\hat u\cdot d\hat u)=d(\hat u\cdot\hat u)=d.$$ So, if $\hat x$ is any vector that satisfies the hyperplane equation, then the displacement vector from $d\hat u$ to $\hat x$ lies within the hyperplane, which means that $\hat x-d\hat u$ is parallel to the hyperplane. But $\hat x-d\hat u=\hat x-\operatorname{Proj}_{\hat w}{\hat x}$ is the orthogonal rejection of $\hat x$ from $\hat w$, which is perpendicular to $\hat w$, so $\hat w$ is itself perpendicular to the hyperplane (is normal to it). Finally, since $d\hat u$ is on the hyperplane and is perpendicular to it, we see that in your equation $d$ represents the (signed) distance of the hyperplane from the origin.
As a final observation, let’s move $d$ to the right-hand side and multiply through by $\|\hat w\|$: $$\hat w\cdot\hat x=d\,\|\hat w\|=\text{const}.$$ With this in hand, you can instantly write down an equation for the hyperplane if you have a normal vector $\hat w$ and any known point ${\hat x}_0$ on the hyperplane. It’s simply $$\hat w\cdot\hat x=\hat w\cdot{\hat x}_0$$ or equivalently, $$\hat w\cdot(\hat x-{\hat x}_0)=0.$$ With the equation in this form, it should be obvious that $\hat w$ is perpendicular to the hyperplane. Incidentally, this gives you another path to deriving the equation. If you know that $\hat w$ is normal to the hyperplane, you could start from this last equation, which just states this normality condition, and work “backwards” by normalizing $\hat w$ and rearranging things.
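For a concrete illustration (the numbers here are my own, not from your question): take the normal vector $\hat w=(1,2,2)$ and the known point ${\hat x}_0=(1,1,1)$ in $\mathbb R^3$. Then $\hat w\cdot{\hat x}_0=1+2+2=5$, so the hyperplane is $$\hat w\cdot\hat x=5,\qquad\text{i.e.}\qquad x_1+2x_2+2x_3=5.$$ Since $\|\hat w\|=\sqrt{1^2+2^2+2^2}=3$, dividing through by $\|\hat w\|$ recovers the normalized form with $d=5/3$, which is indeed the distance of this plane from the origin.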
You know what you want to prove: that for all $x = (x_1,\ldots,x_m)\in \mathbb C^m$, we have:
$$
\max_{1 \leq i \leq m} |x_i| \leq \left(\sum_{i=1}^m |x_i|^2\right)^{\frac 12} \leq \sqrt m \max_{1 \leq i \leq m} |x_i|
$$
You have to prove this statement from statements that you already know are true, together with some definitions.
Let us try the first inequality.
For the first one, note that $|x_i| \geq 0$ for all $i$, by definition of the absolute value. Now fix $1 \leq I \leq m$; then $|x_I|^2 \leq \sum_{i=1}^m |x_i|^2$, since the right-hand side is $|x_I|^2$ plus something non-negative. By taking square roots (and noting that the square root preserves inequalities between non-negative numbers) we get $|x_I| \leq \left(\sum_{i=1}^m |x_i|^2\right)^{\frac 12}$.
Now, the point is that we can choose any $I$ above that we want, since the choice of $I$ did not affect the calculation. Take $I$ such that $|x_I| = \max_{1 \leq i \leq m} |x_i|$, and the first inequality follows.
For the second one, note that for each $I$ we have $|x_I| \leq \max_{1 \leq i \leq m} |x_i|$. Squaring this for each $I$ and summing over $I$ gives $\sum_{i=1}^m |x_i|^2 \leq m \,\bigl(\max_{1 \leq i \leq m} |x_i|\bigr)^2$. Now look above at the second inequality and see how similar it is to the statement I have just written. Can you work out the second inequality from here?
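Once you have worked it out, here is a quick numerical sanity check of both inequalities (a spot check with made-up values, not a proof; the dimension $m$ and the random seed are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)

# A random complex vector x in C^m (illustrative values only).
m = 7
x = rng.standard_normal(m) + 1j * rng.standard_normal(m)

max_abs = np.max(np.abs(x))                 # max_i |x_i|
two_norm = np.sqrt(np.sum(np.abs(x) ** 2))  # (sum_i |x_i|^2)^(1/2)

# The chain of inequalities proved above.
assert max_abs <= two_norm <= np.sqrt(m) * max_abs
print(max_abs, two_norm, np.sqrt(m) * max_abs)
```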
In general, for two subspaces $U,W$ we have $\dim(U+W) = \dim(U) + \dim(W) - \dim(U \cap W)$. $N(B)$ and $R(V_{k+1})$ are subspaces of $\mathbb{R}^n$ whose dimensions sum to $n+1$, so $$n = \dim(\mathbb{R}^n) \ge \dim(N(B) + R(V_{k+1})) = \dim(N(B)) + \dim(R(V_{k+1})) - \dim(N(B) \cap R(V_{k+1})) = n + 1 - \dim(N(B) \cap R(V_{k+1})),$$ which implies $\dim(N(B) \cap R(V_{k+1})) \ge 1$.
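Here is a small NumPy sketch of this dimension count (the matrices and sizes are made up for illustration: $B$ stands in for a rank-$k$ competitor, $V_{k+1}$ collects the first $k+1$ right singular vectors of some matrix $A$, $N(\cdot)$ is the null space, and $R(\cdot)$ the column space):

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 6, 3

# A (hypothetical) rank-k matrix B, so dim N(B) = n - k.
B = rng.standard_normal((n, k)) @ rng.standard_normal((k, n))

# V_{k+1}: first k+1 right singular vectors of a made-up A, so dim R(V_{k+1}) = k+1.
A = rng.standard_normal((n, n))
V = np.linalg.svd(A)[2].T           # columns of V are v_1, ..., v_n
Vk1 = V[:, : k + 1]

# dim N(B) + dim R(V_{k+1}) = (n - k) + (k + 1) = n + 1 > n, so the intersection
# is nontrivial: B @ Vk1 has rank <= k < k + 1, hence there is a nonzero c with
# B @ Vk1 @ c = 0, and x = Vk1 @ c lies in both subspaces.
_, s, Wt = np.linalg.svd(B @ Vk1)
c = Wt[-1]                          # right singular vector for the ~0 singular value
x = Vk1 @ c
print(np.linalg.norm(B @ x))        # ~ 0: x is in N(B) and, by construction, in R(V_{k+1})
```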
The answerer is using the asterisk $*$ to denote conjugate transpose. If you are working only with real numbers, you can just think of it as the transpose $\top$.
It may be more helpful to just write out the SVD of $A$.
$$\|Ax\|_2^2 = x^\top A^\top A x = x^\top V \Sigma^\top U^\top U \Sigma V^\top x = x^\top V\Sigma^\top \Sigma V^\top x,$$ where the last equality uses $U^\top U = I$.
$V^\top x$ is a vector whose entries are $v_i^\top x$ for $i=1,\ldots,n$. Note further that because $x \in R(V_{k+1})$, it must be orthogonal to $v_i$ when $i > k+1$, due to orthogonality of $v_1,\ldots, v_n$. So the entries of $V^\top x$ are $v_1^\top x, v_2^\top x, \ldots, v_{k+1}^\top x, 0, \ldots, 0$.
$\Sigma^\top \Sigma$ is an $n \times n$ diagonal matrix with diagonal entries $\sigma_1^2, \sigma_2^2, \ldots$. Thinking about the matrix multiplication $(V^\top x)^\top (\Sigma^\top \Sigma) (V^\top x)$, you will see that this quantity can be written as $\sum_{i=1}^n \sigma_i^2 (v_i^\top x)^2$. Since the summand is zero for $i > k+1$, this equals $\sum_{i=1}^{k+1} \sigma_i^2 (v_i^\top x)^2$.
$\sum_{i=1}^{k+1} (v_i^\top x)^2$ equals $\sum_{i=1}^n (v_i^\top x)^2$ (the extra terms vanish), which can be written as $x^\top V V^\top x = x^\top x = \|x\|_2^2=1$, using $V V^\top = I$ since $V$ is orthogonal.
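A numerical spot check of this computation, with a made-up matrix $A$ and made-up sizes $n$, $k$ (assuming, as above, that $x$ is a unit vector in $R(V_{k+1})$):

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 6, 2

A = rng.standard_normal((n, n))
U, s, Vt = np.linalg.svd(A)        # A = U @ diag(s) @ Vt, s sorted descending
V = Vt.T

# A unit vector x in R(V_{k+1}): random coefficients, then normalize.
c = rng.standard_normal(k + 1)
x = V[:, : k + 1] @ (c / np.linalg.norm(c))

lhs = np.linalg.norm(A @ x) ** 2
rhs = np.sum(s[: k + 1] ** 2 * (V[:, : k + 1].T @ x) ** 2)
print(lhs, rhs)                    # agree: ||Ax||^2 = sum_{i<=k+1} sigma_i^2 (v_i^T x)^2
assert lhs >= s[k] ** 2 - 1e-9     # ||Ax||^2 >= sigma_{k+1}^2  (s[k] is sigma_{k+1})
```

The assertion at the end is the bound this computation is driving at: since $\sigma_i \ge \sigma_{k+1}$ for $i \le k+1$ and the coefficients $(v_i^\top x)^2$ sum to $1$, we get $\|Ax\|_2^2 \ge \sigma_{k+1}^2$.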
By definition, $A-A_k = \sum_{i=k+1}^n \sigma_i u_iv_i^\top$, i.e. $A-A_k = U\tilde{\Sigma} V^\top$ where $\tilde{\Sigma}$ is obtained by changing the first $k$ singular values in $\Sigma$ to zero. Recalling that the operator norm $\|M\|_2$ is the largest singular value of $M$, we can simply note that the largest singular value of $A-A_k$ is $\sigma_{k+1}$ to conclude $\|A-A_k\|_2 = \sigma_{k+1}$.
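This, too, is easy to verify numerically (sizes are arbitrary; `np.linalg.norm(M, 2)` computes the operator 2-norm of a matrix, i.e. its largest singular value):

```python
import numpy as np

rng = np.random.default_rng(2)
m, n, k = 8, 6, 2

A = rng.standard_normal((m, n))
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Truncated SVD: keep only the k largest singular values.
A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k]

# The operator 2-norm of A - A_k is its largest singular value, sigma_{k+1}.
print(np.linalg.norm(A - A_k, 2), s[k])   # both print sigma_{k+1}
```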