Why does interpolating a function on an interval using orthogonal polynomials give the best possible approximation in that interval

approximationinterpolationorthogonal-polynomials

In the context of numerical integration, it is often said that quadrature methods which interpolate a function using orthogonal polynomials give the best possible approximation.

I can understand why orthogonal functions are so useful in Fourier analysis, because they let you to take an infinite series and cancel out all the elements except for one, making it possible to calculate the coefficient relative to that term.

I even understand why orthogonality between vectors is useful in least squares approximations, because we know that the shortest path between a given point and a surface is a straight line perpendicular to the surface, i.e., its direction is given by the vector perpendicular to the surface.

Lastly, I have an intuition of what orthogonality between functions means and I know how to perform all the calculations, but what I can't grasp is how I can put all these pieces together and understand why orthogonal polynomials defined in a given interval allow for the best interpolation in that interval

Best Answer

The definition of orthogonal polynomials is based on an underlying inner product in the function space. Considering your setting, this function space consists of function which image corresponds to an interval. You also need to keep in mind that 'best possible approximation' has to be understood w.r.t. to a given norm on an underlying function space. The norm in which you get a best approximation property is the one which is induced by the underlying inner product.

Think about the following setting, assume you have an inner product $(\cdot,\cdot)_\circ$ on a given function space. E.g., the $L^2(a,b)$ inner product $$ (p,q)_2 = \int_a^b \overline{p}(x) q(x) d(x), $$ on the function space of square integrable functions on an interval $(a,b)\subset\mathbb{R}$. Let $\|\cdot\|_\circ$ denote the norm which is induced by the $\circ$-inner product, $$ \|p\|_\circ = (p,p)_\circ^{1/2} $$ Furthermore, let $p_1,p_2,\ldots$ denote a sequence of orthonormal polynomials w.r.t. to the $\circ$-inner product, thus, these poylnomials are orthogonal and normalized, $$ (p_j,p_k)_\circ = \delta_{jk},~~~~j,k\geq 1. $$

Let $f$ be a given function which you want to approximate. Assume $f$ is bounded in the $\circ$-norm, i.e., $\|f\|_{\circ}<\infty$, then we have the representation $$ f(x)=\sum_{j=1}^{\infty} a_j p_j(x), $$ where $a_j$ are coefficients which satisfy $a_j=(p_j,f)_{\circ}\in\mathbb{C}$. The condition that $f$ is bounded in the $\circ$-norm corresponds to the coefficients being bounded as a series, $$ \|f\|_{\circ} = (\sum_{j=1}^{\infty} |a_j| )^{1/2}<\infty. $$ This results from the polynomials $p_j$ being orthonormal.

Now, let us define the truncated series $f_n$ and the remainder $r_n$ as $$ f_n(x)=\sum_{j=1}^{n} a_j p_j(x),~~\text{and}~~ r_n(x)=\sum_{j=n+1}^{\infty} a_j p_j(x). $$ We have $f - f_n = r_n$, and thus, the truncated series approximations $f$ with an error $$ \| f - f_n\|_{\circ} = \|r_n\|_{\circ} = (\sum_{j=n+1}^{\infty} |a_j| )^{1/2}. $$

To show that $f_n$ is the best approximation to $f$ in the $\circ$-norm, we take an arbitrary polynomial $q$ of degree $\leq n$, which satisfies the representation $$ q(x)=\sum_{j=1}^{n} b_j p_j(x), $$ for given coefficients $b_1,\ldots,b_n\in\mathbb{C}$. Due to the orthogonal property of $p_1,p_2,\ldots$, we have $$ (r_n,q)_{\circ}=0, $$ and thus, $$ \| f - (f_n+q)\|_{\circ}^2 =\| r_n - q\|_{\circ}^2 = \| r_n\|_{\circ}^2 + \| q\|_{\circ}^2 > \| r_n\|_{\circ}^2 $$ Which shows that $f_n$ is the best approximation to $f$ in the $\circ$-norm.

You also mention quadrature methods in the first sentence of your question. But something seems to be a little bid mixed up there. You have orthogonal polynomials are often defined based on an integral-based inner product, a best approximation of a function on an interval w.r.t. a given norm, and quadrature methods which approximate the integral itself. Considering quadrature methods, orthogonal polynomials provide quadrature nodes and weights of so called Gaussian quadrature methods. The Gaussian quadrature with $m$ many nodes exactly computes the integral of polynomials up to degree $2m-1$. For any other functions, the Gaussian quadrature does have an error and I am not sure in what sense these quadrature methods are 'best'.

Related Question