Statistics – Orthogonality of Random Vectors and the Inner Product

Tags: complex numbers, statistics, vectors

An acclaimed answer to the question 'What does orthogonal mean in statistics?' is beautifully stripped down to

$$\mathbb E[\mathbf {XY^*}]=0$$

I am not familiar with the operations involved, not just because of the inclusion of complex numbers but, more importantly, because I don't see what role the inner product plays in this context. In this post:

The vector space $\mathscr L_2$ of real-valued random variables on $(\Omega,\mathscr F,\mathbb P)$ (modulo equivalence of course) with finite
second moment is special, because it's the only one in which the norm
corresponds to an inner product. If $X$ and $Y$ are random variables
in $\mathscr L_2$, we define the inner product of $X$ and $Y$ by

$$\langle X,Y\rangle=\mathbb E[XY]$$


In relation to the comments below, I found this quote on Wikipedia:

For real random variables $X$ and $Y$, the expected value of their
product $\displaystyle \langle X,Y\rangle :=\operatorname {E} (XY)$ is
an inner product. This definition of expectation as inner product can
be extended to random vectors as well.


The actual hurdle:

Now, this inner product is not the dot product of two vectors, is it? If the idea were to multiply the elements of two random vectors $X$ and $Y$ and then take the inner product of that product vector with itself, we'd be getting $[XY]^2$, which looks more like a norm… I think there is more to the inner product, contained in the quotes I pasted above and requiring some knowledge of abstract algebra. This probably includes the concept of the vector space $\mathscr L_2$.


What I got so far:

Since,

\begin{align}
\operatorname{Cov}[X,Y]&=E[(X-E[X])\cdot(Y-E[Y])]\\
&=E[X\cdot Y]-E[X\cdot E[Y]]-E[E[X]\cdot Y]+E[E[X]\cdot E[Y]]\\
&=E[X\cdot Y]-E[X]\cdot E[Y]-E[X]\cdot E[Y]+E[X]\cdot E[Y]\\
&=E[X\cdot Y]-E[X]\cdot E[Y]
\end{align}

(where the third line uses the fact that $E[X]$ and $E[Y]$ are constants and can be pulled out of expectations)

and consequently,

$$E[XY]=\operatorname{Cov}[X,Y]+E[X]\cdot E[Y],$$

real-valued random variables $X$ and $Y$ are uncorrelated (which is weaker than independent) if and only if the centered variables $X-E[X]$ and $Y-E[Y]$ are orthogonal: $E[(X-E[X])(Y-E[Y])]=0.$
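As a sanity check on this identity, here is a minimal Monte Carlo sketch in Python; the distributions, sample size, and names are arbitrary choices of mine, with expectations approximated by sample means:

```python
# Approximate E[XY] and Cov[X,Y] + E[X]E[Y] by sample means and compare.
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
x = rng.normal(loc=2.0, scale=1.0, size=n)
y = 0.5 * x + rng.normal(loc=-1.0, scale=2.0, size=n)  # correlated with x

lhs = np.mean(x * y)                            # E[XY]
cov = np.mean((x - x.mean()) * (y - y.mean()))  # Cov[X, Y]
rhs = cov + x.mean() * y.mean()                 # Cov[X, Y] + E[X]E[Y]

print(lhs, rhs)  # the identity holds exactly for sample means too,
                 # so these agree up to floating-point rounding
```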

Best Answer

An inner product is a structure in a vector space that introduces 'geometry', specifically the notion of angles, to the space. We have the following 'chain' of structures.

First, a vector space is a set in which we're allowed to add two of its elements (and the result is still in the set!). Addition has all the usual nice properties, like commutativity and a neutral element, the zero vector. A vector space also lets us multiply its elements by a number (usually a real or complex number), and this multiplication interacts with addition in the usual way.

Then, we can introduce a norm on our space. This is a way of measuring vectors, of assigning them a length if you will. The usual norm on $\mathbb{R}^n$ is the Euclidean norm $\sqrt{\sum_i {x_i}^2}\,$, but there are plenty of other ways to measure length. For instance, sometimes it's better to use the taxicab (or Manhattan) norm $\sum_i |x_i|$.
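For instance, here is a small sketch (the vector is an arbitrary example) showing that the two norms assign different lengths to the same vector:

```python
# Two different 'lengths' for the same vector in R^3.
import numpy as np

x = np.array([3.0, -4.0, 12.0])
euclidean = np.sqrt(np.sum(x**2))  # sqrt(9 + 16 + 144) = 13.0
taxicab = np.sum(np.abs(x))        # 3 + 4 + 12 = 19.0
print(euclidean, taxicab)          # same as np.linalg.norm(x, 2) and np.linalg.norm(x, 1)
```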

Finally, we get into inner products. This is a way of introducing geometry to our space. An inner product assigns to each pair $u,v$ of vectors a number (real or complex) $\langle u,v\rangle$, and it has some fundamental properties:

\begin{align}
&\langle u,v\rangle=\overline{\langle v,u\rangle} &&\text{(where $\overline{\,\cdot\,}$ denotes complex conjugation)}\\
&\langle u,u\rangle \geq 0\\
&\langle u,u\rangle = 0 \Leftrightarrow u=0\\
&\langle t+u,v\rangle=\langle t,v\rangle+\langle u,v\rangle\\
&\langle \alpha \cdot u,v\rangle=\alpha \cdot \langle u,v\rangle &&\text{(where $\alpha$ is a scalar)}
\end{align}

You are free to check that the usual inner product on $\mathbb{R}^n$, the dot product $\langle u,v\rangle=\sum_i u_i v_i$, satisfies these properties (a numerical spot-check is sketched below). That said, as with the norm, there are many other ways to define an inner product, and any definition satisfying these properties is a valid one.
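```python
# Spot-check of the inner product axioms for the dot product on R^5,
# using a few random vectors; an illustration, of course, not a proof.
import numpy as np

rng = np.random.default_rng(1)
t, u, v = rng.normal(size=(3, 5))  # three random vectors in R^5
alpha = 2.5

assert np.isclose(u @ v, v @ u)                      # symmetry (real case)
assert u @ u >= 0                                    # non-negativity
assert np.zeros(5) @ np.zeros(5) == 0                # <0, 0> = 0
assert np.isclose((t + u) @ v, t @ v + u @ v)        # additivity
assert np.isclose((alpha * u) @ v, alpha * (u @ v))  # homogeneity
print("all axioms hold on this sample")
```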

Moreover, an inner product $\langle \cdot,\cdot\rangle$ on a vector space induces a norm $\lVert \cdot \rVert$ via

$$\lVert u \rVert = \sqrt{\langle u,u\rangle}$$

Notice that the usual norm in $\mathbb{R}^n$ is the one obtained from its usual inner product!
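A two-line check of this (NumPy's `linalg.norm` computes the Euclidean norm by default; the vector is an arbitrary example):

```python
import numpy as np

u = np.array([1.0, 2.0, 2.0])
print(np.sqrt(u @ u), np.linalg.norm(u))  # both 3.0: sqrt(<u,u>) is the Euclidean norm
```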

Now, you may look at all this and ask, 'But what does this have to do with geometry or angles?' The inner product is used to define the angle $\theta$ between two vectors $u$ and $v$, generalizing the familiar formula in $\mathbb{R}^n$:

$$\cos\theta=\frac{\langle u,v\rangle}{\lVert u \rVert \cdot \lVert v \rVert}$$
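For example, a quick computation in the plane (the vectors are an arbitrary choice):

```python
# Recovering the angle between two vectors from the inner product.
import numpy as np

u = np.array([1.0, 0.0])
v = np.array([1.0, 1.0])
cos_theta = (u @ v) / (np.linalg.norm(u) * np.linalg.norm(v))
print(np.degrees(np.arccos(cos_theta)))  # 45.0, as expected geometrically
```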

As I said before, an inner product introduces geometry into a space, even if that space may not look 'naturally' geometric to our own Euclidean eyes (or minds; many of these spaces are abstract). In fact, these structures are abstractions and generalizations of what we first investigated in $\mathbb{R}^n$.

With this in mind, the pieces of text in your question refer to the definition of an inner product on the vector space $\mathscr L_2$, whose elements (the 'vectors') are real-valued random variables. In particular, these vectors are not 'lists' of numbers; each value such a random variable takes is a single real number.

It might be a good, if somewhat easy, exercise to check that $\langle X,Y\rangle=\mathbb E[XY]$ satisfies the inner product properties listed above. (For the property $\langle X,X\rangle=0\Leftrightarrow X=0$, remember that $\mathscr L_2$ is taken modulo equivalence: random variables that are equal almost surely are identified.) Hope this has helped!
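As a closing illustration, here is a rough Monte Carlo sketch of this inner product at work (the setup is an arbitrary toy example of mine). Note that for centered variables the 'cosine of the angle' is exactly the correlation coefficient, so independent centered variables come out (approximately) orthogonal:

```python
# Random variables as vectors, with <X, Y> = E[XY] estimated by sample means.
import numpy as np

rng = np.random.default_rng(2)
n = 200_000
x = rng.normal(size=n)            # E[X] = 0
y = rng.normal(size=n)            # independent of x, E[Y] = 0

inner = np.mean(x * y)            # <X, Y> = E[XY], approx. 0
norm_x = np.sqrt(np.mean(x * x))  # ||X|| = sqrt(E[X^2]), approx. 1
norm_y = np.sqrt(np.mean(y * y))
cos_xy = inner / (norm_x * norm_y)  # the 'cosine' between X and Y

print(inner, cos_xy)  # both close to 0: X and Y are (nearly) orthogonal
```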
