Two vectors $v,w \in \mathbb{R}^n$ are orthogonal iff $v^T w = 0$, where $T$ denotes the transpose. Really, we're using the dot product given by $\langle v, w \rangle = v^T w$.
There is a different notion of orthogonality for matrices. Here's one definition of an orthogonal matrix: $O \in \mathrm{M}_n(\mathbb{R})$ is orthogonal if $O^T O = I$. Equivalently, this means that the columns of $O$ are orthonormal, i.e., that they are orthogonal and have length $1$.
However, note that only one matrix appears in this second definition. We just say that $A$ is orthogonal or not, not that $A$ is orthogonal to $B$. So this is not the definition used by the author of your book.
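If it helps to see this definition in action, here is a small numerical sketch (NumPy is used purely for illustration; it isn't part of the original question) checking that a rotation matrix satisfies $O^T O = I$ and has orthonormal columns.

```python
import numpy as np

theta = 0.7  # an arbitrary angle
O = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])  # a rotation matrix

print(np.allclose(O.T @ O, np.eye(2)))             # True: O^T O = I
print(np.allclose(np.linalg.norm(O, axis=0), 1.0))  # True: each column has length 1
print(np.isclose(O[:, 0] @ O[:, 1], 0.0))           # True: the columns are orthogonal
```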
The concept that connects the two notions of orthogonality is an inner product. I'll explain what an inner product is, what it means for orthogonality, and how this more-abstract version of orthogonality relates to the one you're familiar with.
To give a technically correct definition of an inner product I would need to define a vector space, but for this problem that might be overkill. Roughly speaking, a vector space is some collection whose elements can be added, subtracted, and multiplied by real numbers. The set of real-valued functions on $\mathbb R$, for instance, is a vector space, since we know how to add, subtract, and scale real-valued functions.
Note: from now on, when I say vector I'll mean an element of a vector space. So a function is a vector in the vector space of real-valued functions.
An inner product takes in two vectors and returns a scalar; think of it as a way to multiply two vectors to get a number. In addition, it must satisfy the following axioms.
- $\langle ax_1+bx_2,y\rangle = a\langle x_1,y\rangle + b\langle x_2,y\rangle$ (linear in the first variable; by the next axiom, it is also linear in the second)
- $\langle x,y\rangle=\langle y,x\rangle$ (symmetric)
- $\langle x,x\rangle \ge 0$, with $\langle x,x\rangle = 0$ if and only if $x=0$ (positive-definite)
You are already familiar with one type of inner product: the dot product, which is an inner product on the vector space of row vectors. We say that two row vectors $\vec a$ and $\vec b$ are orthogonal if $\vec a\cdot\vec b=0$. This suggests a way to make sense of orthogonality in general vector spaces:
Two vectors $v$ and $w$ are orthogonal if $\langle v,w\rangle=0$.
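As a quick sanity check (again just an illustrative NumPy sketch, with vectors made up for the example), the dot product on $\mathbb{R}^3$ satisfies the axioms above, and two vectors are orthogonal exactly when their dot product is zero.

```python
import numpy as np

rng = np.random.default_rng(0)
x1, x2, y = rng.standard_normal((3, 3))  # three random vectors in R^3
a, b = 2.0, -1.5

# Linearity in the first variable
print(np.isclose((a * x1 + b * x2) @ y, a * (x1 @ y) + b * (x2 @ y)))
# Symmetry
print(np.isclose(x1 @ y, y @ x1))
# Positive-definiteness (a nonzero vector has positive <x, x>)
print(x1 @ x1 > 0)

# Orthogonality: <v, w> = 0
v = np.array([1.0, 2.0, 0.0])
w = np.array([-2.0, 1.0, 5.0])
print(np.isclose(v @ w, 0.0))  # True: v and w are orthogonal
```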
Fourier series start to enter the picture when we look at the vector space of continuous, real-valued functions defined on $[-\pi,\pi]$, say. We can make an inner product on this space by defining
$$
\langle f,g\rangle = \int_{-\pi}^{\pi}f(x)g(x)\,dx.
$$
(Do you believe that this is an inner product?) Under this inner product, $\sin(x)$ and $\cos(x)$ are orthogonal functions. All I mean is that
$$
\int_{-\pi}^\pi \cos x\sin x\,dx = 0,
$$
which is true because $\cos x\sin x = \frac12 \sin 2x$, an odd function whose integral over the symmetric interval $[-\pi,\pi]$ vanishes. It is possible to show more generally that
$$
\int_{-\pi}^\pi \cos(nx)\cos(mx)\,dx = \int_{-\pi}^\pi \sin(nx)\sin(mx)\,dx = 0
$$
if $n\neq m$, and
$$
\int_{-\pi}^\pi \sin(nx)\cos(mx)\,dx = 0
$$
always. (Here $m$ and $n$ are nonnegative integers.)
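If you want to see these relations numerically rather than prove them, here is an illustrative sketch (it assumes SciPy's `quad` for the integrals; the helper `inner` is just a name made up for this example):

```python
import numpy as np
from scipy.integrate import quad

def inner(f, g):
    """<f, g> = integral of f(x) * g(x) over [-pi, pi]."""
    value, _ = quad(lambda x: f(x) * g(x), -np.pi, np.pi)
    return value

print(np.isclose(inner(np.sin, np.cos), 0.0))                                # sin(x) and cos(x)
print(np.isclose(inner(lambda x: np.cos(2 * x), lambda x: np.cos(3 * x)), 0.0))  # cos(2x) and cos(3x)
print(np.isclose(inner(lambda x: np.sin(4 * x), lambda x: np.sin(x)), 0.0))      # sin(4x) and sin(x)
print(np.isclose(inner(lambda x: np.sin(2 * x), lambda x: np.cos(2 * x)), 0.0))  # sin(2x) and cos(2x)
```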
Here's another example: the functions $1$ and $x$ are orthogonal. So are the functions $x^n$ and $x^m$ whenever $n+m$ is odd (and $n$ and $m$ are nonnegative). Why?
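Again, a numerical check is easy if you don't want to spoil the "why" yet (same illustrative SciPy setup as above):

```python
import numpy as np
from scipy.integrate import quad

def inner(f, g):
    value, _ = quad(lambda x: f(x) * g(x), -np.pi, np.pi)
    return value

print(np.isclose(inner(lambda x: 1.0, lambda x: x), 0.0))      # 1 and x are orthogonal
print(np.isclose(inner(lambda x: x**2, lambda x: x**3), 0.0))  # x^2 and x^3 (n + m = 5 is odd)
```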
Best Answer
There are two possibilities here:
There's the concept of an orthogonal matrix. Note that this is about a single matrix, not about two matrices. An orthogonal matrix is a real matrix that describes a transformation which leaves scalar products of vectors unchanged. The term "orthogonal matrix" probably comes from the fact that such a transformation preserves orthogonality of vectors (though this property alone does not completely characterize orthogonal transformations; you additionally need that lengths are preserved, so that an orthonormal basis is mapped to another orthonormal basis). Another reason for the name might be that the columns of an orthogonal matrix form an orthonormal basis of the vector space, and so do the rows; this fact is encoded in the defining relation $A^TA = AA^T = I$, where $A^T$ is the transpose of the matrix (exchange of rows and columns) and $I$ is the identity matrix.
Usually if one speaks about orthogonal matrices, this is what is meant.
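Here is a small illustrative sketch (NumPy, not part of the original answer) of the "preserves scalar products" statement, using a rotation as the orthogonal matrix and random test vectors:

```python
import numpy as np

theta = 1.1
A = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])  # a rotation, hence orthogonal

rng = np.random.default_rng(1)
v, w = rng.standard_normal((2, 2))  # two random vectors in R^2

print(np.allclose(A.T @ A, np.eye(2)))                        # A^T A = I
print(np.isclose((A @ v) @ (A @ w), v @ w))                   # scalar product unchanged
print(np.isclose(np.linalg.norm(A @ v), np.linalg.norm(v)))   # length unchanged
```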
One can indeed consider matrices as vectors; an $n\times n$ matrix is then just a vector in an $n^2$-dimensional vector space. In such a vector space, one can then define a scalar product just as in any other vector space. It turns out that for real matrices, the standard scalar product can be expressed in the simple form $$\langle A,B\rangle = \operatorname{tr}(AB^T)$$ and thus you can also define two matrices as orthogonal to each other when $\langle A,B\rangle = 0$, just as with any other vector space.
To imagine this, you simply forget that the matrices are matrices, and just consider all matrix entries as components of a vector. The two vectors then are orthogonal in the usual sense.
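Concretely (an illustrative NumPy sketch; the matrices are made up for the example), $\operatorname{tr}(AB^T)$ agrees with the ordinary dot product of the flattened matrices, and a pair of matrices with $\operatorname{tr}(XY^T)=0$ is orthogonal in this sense:

```python
import numpy as np

rng = np.random.default_rng(2)
A, B = rng.standard_normal((2, 3, 3))  # two random 3x3 matrices

# tr(A B^T) equals the dot product of the matrices viewed as 9-dimensional vectors.
print(np.isclose(np.trace(A @ B.T), A.ravel() @ B.ravel()))

# Two matrices that are orthogonal to each other in this sense:
X = np.array([[1.0, 0.0],
              [0.0, 0.0]])
Y = np.array([[0.0, 1.0],
              [0.0, 0.0]])
print(np.isclose(np.trace(X @ Y.T), 0.0))  # <X, Y> = 0
```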