[Math] Why is the definition of inner product the way it is

definition, inner-products

I have recently started looking at inner products. However, I'm still struggling to intuitively understand the definition.

I am aware that in most cases we generalize concepts because they are useful for doing certain things. For example, once we realize that properties like convergence and continuity depend on the notion of distance, it seems useful to generalize the concept. Similarly, norms generalize the idea of magnitude.

In order to generalize something we need to figure out the defining properties of the concept we want to generalize. For example, for the distance between two points:

(i) It is non-negative and only $0$ if the points are the same.

(ii) It is symmetric.

(iii) The shortest path between two points is a line between them.

This is basically the definition of a metric in words rather than formulas, with (iii) being the triangle inequality.

However, I can't seem to make sense of the definition of inner products. I know that the algebraic definition of the $n$-dimensional scalar product

$$ a \cdot b = \sum_{i=1}^n a_i b_i = a_1 b_1+\cdots+a_n b_n$$

originally comes from quaternion multiplication and that there is also a geometric definition using the law of cosines, i.e.

$$ a \cdot b = \|a\|\|b\| \cos(\theta)$$ where $0^\circ \le \theta \le 180^\circ$ is the angle between the vectors. This means that it can be used to determine whether two vectors are orthogonal, which seems to be the reason one would want to generalize this concept in the first place (see for example here or here).
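As a sanity check that the two definitions above agree, here is a minimal Python sketch. The helper names `dot`, `length` and `angle_between` are my own, and the angle is obtained from the law of cosines applied to the triangle with sides $a$, $b$ and $a-b$, assuming both vectors are nonzero.

```python
import math

def dot(a, b):
    """Algebraic definition: sum of coordinate-wise products."""
    return sum(ai * bi for ai, bi in zip(a, b))

def length(a):
    """Euclidean length, computed directly from the coordinates."""
    return math.sqrt(sum(ai * ai for ai in a))

def angle_between(a, b):
    """Angle between a and b via the law of cosines on the triangle
    with sides a, b and a - b. Assumes a and b are both nonzero."""
    c = [ai - bi for ai, bi in zip(a, b)]
    cos_theta = (length(a) ** 2 + length(b) ** 2 - length(c) ** 2) / (
        2 * length(a) * length(b)
    )
    # Clamp against floating-point drift before calling acos.
    return math.acos(max(-1.0, min(1.0, cos_theta)))

a = (3.0, 4.0, 0.0)
b = (4.0, 3.0, 12.0)

algebraic = dot(a, b)
geometric = length(a) * length(b) * math.cos(angle_between(a, b))
print(algebraic, geometric)  # the two values should match up to rounding
```

The point of computing the angle from side lengths only is that `geometric` never touches the coordinate formula, so the agreement is not circular.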

A (real) inner product is defined as a map $\langle\cdot,\cdot\rangle$ satisfying

(1) $\langle x+y,z\rangle=\langle x,z\rangle+\langle y,z \rangle$

(2) $\langle \alpha x,y\rangle=\alpha\langle x,y\rangle$

(3) $\langle x,y\rangle=\langle y,x\rangle$

(4) $\langle x,x\rangle\geq 0$, $\langle x,x\rangle=0 \iff x=0$

The properties (2) and (3) seem intuitive to me. (3) only says that the relation is symmetric, i.e. if $x$ is orthogonal to $y$, then $y$ is also orthogonal to $x$. (2) says that if two vectors are orthogonal to each other, then scaling one of them does not change this fact. However, I can't see the intuition for (1) and (4).

Can someone please shed some light on this?

Edit: By looking at all the answers it became clear that the scalar product should be seen as a scaled version of the scalar projection of $a$ onto $b$.
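To make the projection picture concrete, here is a small sketch in $\Bbb R^2$ (my own illustration, not taken from any of the answers). The hypothetical helper `scalar_projection` computes the signed length of the shadow of $x$ on the line through $b$ purely from lengths and angles. Property (1) then reads "the shadow of a sum is the sum of the shadows", and multiplying the shadow by $\|b\|$ recovers the coordinate formula.

```python
import math

def scalar_projection(x, b):
    """Signed length of the shadow of x on the line through b,
    computed as |x| * cos(angle between x and b). Assumes b is nonzero."""
    angle = math.atan2(x[1], x[0]) - math.atan2(b[1], b[0])
    return math.hypot(x[0], x[1]) * math.cos(angle)

def dot(a, b):
    """Coordinate formula in the plane."""
    return a[0] * b[0] + a[1] * b[1]

x, y, b = (1.0, 2.0), (3.0, -1.0), (2.0, 1.0)
x_plus_y = (x[0] + y[0], x[1] + y[1])

# Property (1): projecting the sum equals summing the projections.
proj_sum = scalar_projection(x, b) + scalar_projection(y, b)
proj_of_sum = scalar_projection(x_plus_y, b)

# Scaling the projection by |b| should give the dot product with b.
scaled = math.hypot(b[0], b[1]) * proj_of_sum

print(proj_sum, proj_of_sum)
print(scaled, dot(x_plus_y, b))
```

In this reading, property (4) says that a vector's shadow on itself, scaled by its own length, is its squared length, which is positive unless the vector is zero.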

Best Answer

Here's how you can motivate the coordinate definition of the dot product in $\Bbb R^2$ (for starters) by wanting $x\cdot y = 0$ when $x$ and $y$ are orthogonal. Recall from basic high school geometry that orthogonal lines have slopes that are negative reciprocals of one another (you can get this from basics of similar triangles). So, when $x_1$ and $y_2$ are nonzero (i.e., when the lines are neither horizontal nor vertical), orthogonality is equivalent to $$\frac{x_2}{x_1} = -\frac{y_1}{y_2},$$ which in turn yields $x_1y_1+x_2y_2 = 0$. (And, of course, this formula works fine in the horizontal/vertical case.) Now you get bilinearity and all the rest of the properties, and this shows that $\sum x_iy_i$ is an interesting quantity to study ... :)
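If you want to see the slope argument in action, here is a minimal sketch with exact rational arithmetic. The function names `slope_condition` and `dot2` are made up for this illustration, and the sample pairs are arbitrary integer vectors with $x_1 \neq 0$ and $y_2 \neq 0$, as the argument requires.

```python
from fractions import Fraction

def slope_condition(x, y):
    """True when the slope of x equals the negative reciprocal of the
    slope of y, i.e. x2/x1 == -y1/y2. Requires x[0] != 0 and y[1] != 0."""
    return Fraction(x[1], x[0]) == -Fraction(y[0], y[1])

def dot2(x, y):
    """Coordinate dot product in the plane."""
    return x[0] * y[0] + x[1] * y[1]

pairs = [
    ((1, 2), (-2, 1)),   # y is x rotated by 90 degrees
    ((3, 1), (1, -3)),   # y is x rotated by -90 degrees
    ((1, 2), (2, 1)),    # not orthogonal
    ((2, 5), (5, 2)),    # not orthogonal
]

# The slope condition should hold exactly when x1*y1 + x2*y2 == 0.
results = [(slope_condition(x, y), dot2(x, y) == 0) for x, y in pairs]
for (x, y), (by_slope, by_dot) in zip(pairs, results):
    print(x, y, by_slope, by_dot)
```

Cross-multiplying $x_2/x_1 = -y_1/y_2$ gives $x_2 y_2 = -x_1 y_1$, which is why the two columns agree on every pair where both slopes are defined.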