[Math] Hyperplane definition

convex-analysis, geometry

I am reading the Convex Optimization book by Stephen Boyd and I went back to hyperplanes. I don't understand the definition:

A hyperplane is a set of the form $\{x \mid a^Tx=b \}$, where $a\in \mathbb{R}^n$, $x \in \mathbb{R}^n$, $b \in \mathbb{R}$.

OK, why not. But how do you get a plane from this?

If I refer to another definition of a hyperplane:

Let $a_1,\dots,a_n$ be scalars not all equal to 0. Then the set $S$ consisting of all vectors $X = \begin{bmatrix} x_{1} \\ x_{2} \\ \vdots \\ x_{n} \end{bmatrix}$ in $\mathbb{R}^n$ such that $a_1x_1+\dots+a_nx_n = c$, for $c$ a constant, is a hyperplane.
(I suppose that this is in fact a scalar product, or I don't get how you can have a constant).

Comparing both definitions, I suppose that $a^Tx = b$ in Boyd is equivalent to $a_1x_1+\dots+a_nx_n = c$ in the other definition. But Boyd also says that $a$ is the normal vector of the hyperplane. How can $a$ be a normal vector if $a \in \mathbb{R}^n$? For example, if $a \in \mathbb{R}^2$, you will have $a = (x, y)$, where $x, y \in \mathbb{R}$, not $\mathbb{R}^2$. So how can you say that $a$ is a normal vector to the hyperplane if $a$ is only one point of $\mathbb{R}^n$?

I am pretty sure I have misunderstood something, so if someone could explain it to me with a simple numerical example, that would be great. I should be able to answer my own questions after that. I have not been able to find anything clear online about hyperplanes.

Thanks a lot.

Best Answer

Suppose $a \neq 0$. Let $f(x) = a^Tx = \sum_k a_k x_k$. Note that $f$ is a linear functional on $\mathbb{R}^n$. Since $a \neq 0$, its range is all of $\mathbb{R}$, that is, ${\cal R} f = \mathbb{R}$, and so by the rank-nullity theorem $\ker f$ is a linear subspace of dimension $n-1$.

In particular, $\ker f$ is a plane in $\mathbb{R}^n$ (a subspace of dimension $n-1$) that passes through the origin.

Hence for any $x_0$ the set $\{ x \mid f(x) = f(x_0) \}$ is a plane parallel to $\ker f$ that passes through the point $x_0$. Since $f$ is surjective, it follows that $H=\{ x \mid f(x) = b \}$ is also a plane parallel to $\ker f$, and it passes through every point $x_b$ such that $f(x_b) = b$.
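Since the question asks for a simple numerical example, here is a small sketch in $\mathbb{R}^2$. The values $a = (1, 2)$ and $b = 4$ are chosen purely for illustration, as are the names `x_b` and `d`. The sketch picks one point $x_b$ with $f(x_b) = b$, takes a direction $d$ that spans $\ker f$, and checks that every point $x_b + t\,d$ satisfies $a^Tx = b$. Exact fractions are used so that the equalities hold without rounding.

```python
from fractions import Fraction


def dot(u, v):
    """Scalar product a^T x of two vectors given as tuples."""
    return sum(ui * vi for ui, vi in zip(u, v))


# Illustrative values only (not from the book): a = (1, 2), b = 4.
a = (Fraction(1), Fraction(2))
b = Fraction(4)

# One particular point on H: scale a so that a^T x_b = b.
scale = b / dot(a, a)
x_b = tuple(scale * ai for ai in a)  # (4/5, 8/5)

# A direction spanning ker f in R^2: a^T d = 1*(-2) + 2*1 = 0.
d = (Fraction(-2), Fraction(1))

# Points of the form x_b + t*d, for a few integer values of t.
points = [tuple(xi + t * di for xi, di in zip(x_b, d)) for t in range(-2, 3)]

# f evaluated at each of those points; every value should equal b.
values = [dot(a, p) for p in points]
print(values)
```

Here $H$ is the line $x_1 + 2x_2 = 4$, which is the line $\ker f = \{x \mid x_1 + 2x_2 = 0\}$ shifted so that it passes through $x_b$.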

Regarding normals, suppose $x_0$ lies on the hyperplane $H$, that is, $f(x_0) = b$. Now pick any other point $x_1 \in H$. Then $f(x_1) = b$, so by linearity $f(x_1-x_0) = a^T(x_1-x_0) = 0$, that is, $(x_1-x_0) \bot a$. In other words, for any two points $x_0,x_1 \in H$, the direction $x_1-x_0$ is perpendicular to $a$. This is what we mean when we say $a$ is a normal to $H$.
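The normal property can also be checked numerically. The following is a minimal sketch in $\mathbb{R}^3$, with $a = (1, 2, 3)$, $b = 6$ and the four sample points all chosen by hand for illustration. It verifies that each point lies on $H$ and that the difference between any of them and $x_0$ has zero scalar product with $a$.

```python
def dot(u, v):
    """Scalar product a^T x of two vectors given as tuples."""
    return sum(ui * vi for ui, vi in zip(u, v))


# Illustrative values only: the hyperplane x1 + 2*x2 + 3*x3 = 6 in R^3.
a = (1, 2, 3)
b = 6

# Hand-picked points that satisfy a^T x = b.
points_on_H = [(6, 0, 0), (0, 3, 0), (0, 0, 2), (1, 1, 1)]
on_H = [dot(a, p) == b for p in points_on_H]

# Directions x1 - x0 lying inside H, taken relative to the first point x0.
x0 = points_on_H[0]
directions = [tuple(pi - qi for pi, qi in zip(p, x0)) for p in points_on_H[1:]]

# a^T (x1 - x0) for each direction; each value should be zero.
normal_checks = [dot(a, d) for d in directions]
print(on_H, normal_checks)
```

Note that $a$ is used here as a direction (an arrow from the origin to the point $a$), and it is this direction that is perpendicular to every direction lying inside $H$. The point $a$ itself need not lie on $H$: in this example $a^Ta = 14 \neq 6$.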