[Math] Why do non-orthogonal basis functions encode ‘redundant’ information in transforms

fourier-transform, orthogonality, wavelets

I'm currently learning about wavelets and keep running into the idea of redundancy in the continuous wavelet transform. What I've gathered so far is that there is some 'commonality' between the wavelets used in the continuous wavelet transform, so performing the full transform collects redundant information.

However, I'm not sure I fully understand this. Why does this issue of redundancy never come up with the Fourier transform? My guess is that it's because sinusoids of different frequencies are orthogonal to one another, so the Fourier transform doesn't encode redundant information.

So I've concluded that with orthogonal basis functions there is no redundancy in how information about the signal is encoded. However, I'm looking for an 'Explain Like I'm Five' type explanation of why orthogonal basis functions remove redundancy.

For some reason I have a feeling that the entire concept could be explained in 2D space using projections, but I'll wait and see.

Best Answer

Listen to your gut.

Let’s look at a pair of linearly independent unit vectors $\mathbf u$ and $\mathbf v$ in $\mathbb R^2$. (They don’t really have to be unit vectors, but omitting all of the normalization factors that would otherwise be necessary reduces clutter.)

[Diagram: two non-orthogonal unit vectors $\mathbf u$ and $\mathbf v$ in the plane]

If $\mathbf v$ is not orthogonal to $\mathbf u$, then they overlap: there’s a component of $\mathbf v$ that’s parallel to $\mathbf u$, i.e., $\mathbf v$ contains a redundant non-zero scalar multiple of $\mathbf u$. Similarly, $\mathbf u$ has a redundant $\mathbf v$-component.
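As a quick numerical sketch of that overlap (the particular vectors here are an arbitrary choice for illustration):

```python
import numpy as np

# Two unit vectors that are NOT orthogonal: v leans 60 degrees away from u.
u = np.array([1.0, 0.0])
v = np.array([np.cos(np.pi / 3), np.sin(np.pi / 3)])

# The component of v parallel to u is (u . v) u, which is non-zero here:
# v "contains" half a copy of u.
overlap = (u @ v) * u
print(overlap)  # [0.5, 0.0]
```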

If we have an orthonormal basis $(\mathbf u,\mathbf v)$ of $\mathbb R^2$, we can express a vector $\mathbf w$ as a linear combination of the basis vectors via orthogonal projection: $$\mathbf w=\pi_{\mathbf u}\mathbf w+\pi_{\mathbf v}\mathbf w=(\mathbf u\cdot\mathbf w)\mathbf u+(\mathbf v\cdot\mathbf w)\mathbf v.$$ If we try to do this with non-orthogonal basis vectors, however, it doesn’t work.
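The orthonormal case is easy to verify numerically. Here is a minimal NumPy sketch (the rotation angle and the vector $\mathbf w$ are arbitrary choices):

```python
import numpy as np

# An orthonormal basis of R^2: the standard basis rotated by 30 degrees.
theta = np.pi / 6
u = np.array([np.cos(theta), np.sin(theta)])
v = np.array([-np.sin(theta), np.cos(theta)])

w = np.array([3.0, -2.0])

# Summing the orthogonal projections onto u and v recovers w exactly.
reconstruction = (u @ w) * u + (v @ w) * v
print(np.allclose(reconstruction, w))  # True
```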

[Diagram: the sum of the orthogonal projections of a vector $\mathbf w$ onto $\mathbf u$ and $\mathbf v$, with the redundant contributions shown in red and blue]

The problem is that those overlaps between $\mathbf u$ and $\mathbf v$ are overcounted when we add up the individual projections. The red vector in the above diagram is the redundant contribution of the orthogonal projection onto $\mathbf v$, and the blue vector is the redundant contribution of the projection onto $\mathbf u$.
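The overcounting can be checked numerically as well. In this sketch (again with arbitrarily chosen vectors), naively summing the two orthogonal projections fails to reconstruct $\mathbf w$:

```python
import numpy as np

u = np.array([1.0, 0.0])
# A unit vector that is NOT orthogonal to u: it overlaps u by cos(60°) = 0.5.
v = np.array([np.cos(np.pi / 3), np.sin(np.pi / 3)])

w = np.array([2.0, 1.0])

# Summing the individual orthogonal projections overcounts the shared component,
# so the result is not w.
naive = (u @ w) * u + (v @ w) * v
print(np.allclose(naive, w))  # False
```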

The same thing occurs when the vectors are functions instead of elements of $\mathbb R^2$. If the basis vectors aren’t orthogonal, then they overlap to some degree and individual orthogonal projections onto them contribute redundant elements to the sum.
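For functions, the relevant inner product is an integral, and the same overlap shows up. A sketch approximating $\langle f,g\rangle=\int_0^{2\pi} f(x)g(x)\,dx$ on a grid (the shift of $0.5$ is an arbitrary choice):

```python
import numpy as np

# Approximate the L^2 inner product <f, g> = ∫ f(x) g(x) dx on [0, 2π]
# with a Riemann sum on a fine grid.
x = np.linspace(0.0, 2.0 * np.pi, 100001)
dx = x[1] - x[0]

def inner(f, g):
    return np.sum(f(x) * g(x)) * dx

# sin and cos are orthogonal on [0, 2π]: no overlap, no redundancy.
print(abs(inner(np.sin, np.cos)) < 1e-6)  # True

# A sine and a shifted sine are not orthogonal: each one
# "contains" a piece of the other, just like u and v above.
shifted = lambda t: np.sin(t + 0.5)
print(abs(inner(np.sin, shifted)) > 1.0)  # True (the exact value is π·cos(0.5))
```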

The Gram-Schmidt process finds and eliminates such redundancies among a set of vectors. Another way to eliminate them is to change the direction of projection so that it is parallel to the other basis vector. In higher-dimensional spaces, that can be generalized to projecting the given vector onto the span of the basis vectors already chosen and subtracting that projection from the original vector, but that's pretty much what you do when applying the Gram-Schmidt process.
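A minimal Gram-Schmidt sketch, applied to the non-orthogonal pair from the earlier examples, makes the "subtract the overlap" step concrete:

```python
import numpy as np

def gram_schmidt(vectors):
    """Orthonormalize a list of linearly independent vectors.

    Each step subtracts the redundant components along the basis
    vectors built so far, then normalizes what remains.
    """
    basis = []
    for v in vectors:
        for q in basis:
            v = v - (q @ v) * q   # remove the overlap with q
        basis.append(v / np.linalg.norm(v))
    return basis

u = np.array([1.0, 0.0])
v = np.array([np.cos(np.pi / 3), np.sin(np.pi / 3)])  # overlaps u

q1, q2 = gram_schmidt([u, v])
print(np.isclose(q1 @ q2, 0.0))  # True: the redundancy is gone
```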
