[Math] Visualizing Orthogonal Polynomials

orthogonal-polynomials

Recently I was introduced to the concept of orthogonal polynomials through the poly() function in the R programming language. They came up in the context of polynomial transformations for linear regression. Bear in mind that I'm an economist and, as should be obvious, am not all that smart (choice of profession has an odd signaling characteristic). I'm really trying to wrap my head around what orthogonal polynomials are and how, if possible, to visualize them. Is there any way to visualize orthogonal polynomials vs. simple polynomials?

Best Answer

Helge presented the continuous case in his answer; for the purposes of data fitting in statistics, one usually deals with discrete orthogonal polynomials. Associated with a set of abscissas $x_i$, $i=1,\dots,n$, is the discrete inner product

$$\langle f,g\rangle=\sum_{i=1}^n w(x_i)f(x_i)g(x_i)$$

where $w(x)$ is a weight function, a function that assigns a "weight" or "importance" to each abscissa. A frequently occurring case is one where the $x_i$ are equispaced, $x_{i+1}-x_i=h$ where $h$ is a constant, and the weight function is $w(x)=1$; for this special case, special polynomials called Gram polynomials are used as the basis set for polynomial fitting. (I won't be dealing with the nonequispaced case in the rest of this answer, but I'll add a few words on it if asked.)
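To make the inner product concrete, here is a minimal Python sketch that evaluates it on a small equispaced grid with $w(x)=1$. The grid (five points on $[0,1]$), the helper name `inner`, and the sampled functions are my own illustrative choices, not anything specific to R's poly().

```python
def inner(f_vals, g_vals, weights):
    """Discrete inner product <f, g> = sum_i w(x_i) * f(x_i) * g(x_i)."""
    return sum(w * f * g for w, f, g in zip(weights, f_vals, g_vals))

# Assumed illustrative grid: n = 5 equispaced abscissas on [0, 1], spacing h = 0.25
n = 5
h = 0.25
xs = [i * h for i in range(n)]
ws = [1.0] * n                      # weight function w(x) = 1

ones = [1.0] * n                    # the constant polynomial 1 sampled on the grid
centered = [x - 0.5 for x in xs]    # x minus the mean of the abscissas

# The monomials 1 and x are NOT orthogonal under this inner product...
ip_one_x = inner(ones, xs, ws)              # sum of the abscissas, 2.5
# ...but 1 and the centered line x - 0.5 are.
ip_one_centered = inner(ones, centered, ws) # 0.0

print(ip_one_x, ip_one_centered)
```

Subtracting the mean of the abscissas is the first step of the orthogonalization that turns the monomials into an orthogonal family on the grid.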

Let's compare a plot of the regular monomials $x^k$ to a plot of the Gram polynomials:

[Figure: monomials (left) versus Gram polynomials (right)]

On the left, you have the regular monomials. The "bad" thing about using them in data fitting is that for large enough $k$, $x^k$ and $x^{k+1}$ are nigh-indistinguishable, and this spells trouble for data-fitting methods, since the matrix associated with the linear system describing the fit is dangerously close to singular.

On the right, you have the Gram polynomials. Each member of the family does not resemble its predecessor or successor, and thus the underlying matrix used for fitting is a lot less likely to be close to singularity.
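The contrast can also be checked numerically. The sketch below is an illustration under my own assumptions (21 equispaced points on $[0,1]$, $w(x)=1$, degrees up to 6): it samples the monomials on the grid, orthogonalizes them with Gram-Schmidt, and measures the cosine of the angle between consecutive basis vectors. A cosine near 1 means two columns of the fitting matrix are almost parallel; a cosine of 0 means they are orthogonal. The Gram-Schmidt result matches the Gram polynomials only up to scaling.

```python
import math

def inner(u, v):
    """Discrete inner product with weight w(x) = 1."""
    return sum(a * b for a, b in zip(u, v))

def cosine(u, v):
    """Cosine of the angle between two sampled functions."""
    return inner(u, v) / math.sqrt(inner(u, u) * inner(v, v))

# Assumed illustrative setup: 21 equispaced abscissas on [0, 1]
n = 21
xs = [i / (n - 1) for i in range(n)]
max_deg = 6

# Monomials x^k sampled on the grid, k = 0..max_deg
monomials = [[x**k for x in xs] for k in range(max_deg + 1)]

# Gram-Schmidt: strip from each monomial its components along the
# previously built orthogonal vectors
ortho = []
for m in monomials:
    v = list(m)
    for q in ortho:
        c = inner(v, q) / inner(q, q)
        v = [vi - c * qi for vi, qi in zip(v, q)]
    ortho.append(v)

# Cosines between consecutive members of each family
mono_cos = [cosine(monomials[k], monomials[k + 1]) for k in range(1, max_deg)]
ortho_cos = [cosine(ortho[k], ortho[k + 1]) for k in range(max_deg)]

print("monomials: ", [round(c, 4) for c in mono_cos])
print("orthogonal:", [round(abs(c), 4) for c in ortho_cos])
```

On this grid the monomial cosines all come out above 0.95 and creep toward 1 as the degree grows, while the orthogonalized family gives zeros up to rounding error.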

This is the reason why discrete orthogonal polynomials are of interest in data fitting.
