The generalization you're looking for is called functional analysis. Just as you might suspect, vectors turn into functions, and matrices turn into linear operators that act on functions. Basis expansions turn into Fourier-type series. Eigenvalues and eigenvectors generalize to an area called spectral theory. Dot products generalize to inner products...and so on. The best introductory book I know of is "Introductory Functional Analysis with Applications" by Kreyszig. It's perfect for someone who has taken linear algebra and perhaps a first course in analysis (at the level of "Baby Rudin").
Here are a few more details:
Basis Expansions
- In linear algebra, you think about expanding a vector $x$ into a basis. What does this mean? Well, if $x\in V$ and $V$ is a finite dimensional vector space (say $\text{dim}(V) = n$) we can choose a special set of vectors $e_1,\ldots,e_n$ such that for any $x\in V$, there are coefficients $\alpha_1,\ldots,\alpha_n$ such that
$$
x = \sum_{j=1}^n\alpha_j e_j
$$ In functional analysis, you can usually do the same thing. If $X$ is now a vector space of functions, one can often find a special infinite sequence $e_1(x),e_2(x),\ldots$ such that for any $f\in X$, there are coefficients $\alpha_1,\alpha_2,\ldots$ such that
$$
f(x) \stackrel{X}{=} \sum_{j=1}^\infty\alpha_j e_j(x)
$$ Since this is an infinite sum, you have to be much more careful about what you mean by $=$, hence putting an $X$ on top to indicate that this is "equality in the sense of the vector space $X$". This becomes more precise when you talk about norms, inner products, and metrics. An example would be the so-called Hilbert space $L^2([0,1])$, where equality means
$$
\lim_{n\rightarrow\infty} \int_0^1\left\vert f(x) - \sum_{j=1}^n \alpha_j e_j(x)\right\vert ^2 dx = 0
$$ Other examples obviously exist as well, but I won't go into them. Since it sounds like you're interested in Fourier series, you'll be happy to know that this is exactly what I'm talking about: if you take
$$
e_k(x) = \exp(2\pi i kx)
, then any $f\in L^2([0,1])$ will have coefficients $\alpha_0,\alpha_{\pm 1},\alpha_{\pm 2},\ldots$ such that $ f= \sum_{k=-\infty}^\infty \alpha_k\exp(2\pi ikx)$, where again $=$ means "in the $L^2$ sense".
Just like in finite dimensions, a special role is played by orthonormal bases in infinite dimensions. In finite dimensions, if you have a basis $e_1,\ldots,e_n$ that is orthonormal, then you can always write down the expansion coefficients as dot products:
$$
x = \sum_{j=1}^n (x\cdot e_j) e_j
$$ In some function spaces, you have a generalization of the dot product called an inner product, usually written as $\langle f,g\rangle$. While there are many inner products, the most popular ones tend to be the $L^2$ inner products:
$$
\langle f,g\rangle_{L^2([a,b])} = \int_a^b f(x)\overline{g(x)} dx
$$ Then, if $e_1,\ldots$ is an orthonormal basis for $L^2([a,b])$, you can write
$$
f(x) \stackrel{L^2}{=} \sum_{j=1}^\infty \langle f,e_j\rangle e_j(x)
$$
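If you want to see this in action numerically, here is a minimal sketch (Python with NumPy; the triangle test function and the crude Riemann-sum quadrature are just illustrative choices, not part of the theory): it computes $\langle f,e_k\rangle$ for $e_k(x)=\exp(2\pi i kx)$ on $[0,1]$ and checks that the $L^2$ error of the truncated expansion shrinks as more terms are kept.

```python
import numpy as np

# Grid on [0, 1) for a simple Riemann-sum quadrature (illustrative only).
x = np.linspace(0.0, 1.0, 2000, endpoint=False)
dx = x[1] - x[0]

f = np.minimum(x, 1 - x)          # a test function in L^2([0,1]): the triangle/hat function

def e(k, x):
    """Orthonormal basis element e_k(x) = exp(2*pi*i*k*x) of L^2([0,1])."""
    return np.exp(2j * np.pi * k * x)

def partial_sum(f, x, K):
    """Truncated expansion sum_{|k| <= K} <f, e_k> e_k(x)."""
    s = np.zeros_like(x, dtype=complex)
    for k in range(-K, K + 1):
        alpha_k = np.sum(f * np.conj(e(k, x))) * dx   # <f, e_k> by quadrature
        s += alpha_k * e(k, x)
    return s

for K in (1, 4, 16, 64):
    err = np.sqrt(np.sum(np.abs(f - partial_sum(f, x, K)) ** 2) * dx)
    print(f"K = {K:3d},  L^2 error of the partial sum = {err:.2e}")
```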
Sometimes you don't have an inner product, so "orthonormal basis" doesn't make sense (you have no way to measure angles). But you can still do basis expansions, it just might be harder to compute the expansion coefficients. There are whole books on this subject - see e.g. A Basis Theory Primer by Heil or Frames and Bases: An Introductory Course by Christensen.
Linear Operators
- In linear algebra, you work a lot with matrices, which are representations of linear operators $L:V\rightarrow W$ where $V$ and $W$ are two vector spaces of dimensions $n$ and $m$, respectively. The matrix is a representation because we must choose bases for $V$ and $W$ in order to define it! Assuming you've selected bases, you can then write down a matrix $A$ such that
$$
(A\alpha)_i = \sum_{j=1}^n A_{ij}\alpha_j,\quad 1\leq i\leq m
$$ where $\alpha$ is the vector of coefficients representing $x\in V$. This generalizes to functions in the same way, except you will need an infinite matrix:
$$
(A\alpha)_i = \sum_{j=1}^\infty A_{ij} \alpha_j,\quad 1\leq i\leq m
$$ Here $m$ can either be finite or infinite, depending on the dimension of $W$.
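To make the "infinite matrix" picture concrete, here is a hedged Python sketch (the basis, truncation order, and test function are all illustrative choices): truncate the orthonormal basis $e_k(x)=\exp(2\pi ikx)$ of $L^2([0,1])$ to $|k|\le K$; in this basis the derivative operator is represented by the diagonal (infinite) matrix with entries $2\pi i k$, and applying the truncated matrix to the coefficients of a smooth periodic function recovers its derivative.

```python
import numpy as np

K = 32                                        # truncation: keep the modes |k| <= K
ks = np.arange(-K, K + 1)

x = np.linspace(0.0, 1.0, 1000, endpoint=False)
dx = x[1] - x[0]
f = np.exp(np.cos(2 * np.pi * x))             # a smooth 1-periodic test function
f_prime = -2 * np.pi * np.sin(2 * np.pi * x) * f   # its exact derivative

# Coefficients alpha_k = <f, e_k> in the basis e_k(x) = exp(2*pi*i*k*x).
E = np.exp(2j * np.pi * np.outer(ks, x))      # row k holds e_k sampled on the grid
alpha = E.conj() @ f * dx                     # quadrature for the inner products

# The derivative operator as a (truncated) infinite matrix in this basis: diagonal.
A = np.diag(2j * np.pi * ks)

beta = A @ alpha                              # coefficients of f' in the same basis
f_prime_from_matrix = (E.T @ beta).real       # synthesize f' back on the grid

print("max error in f':", np.max(np.abs(f_prime_from_matrix - f_prime)))
```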
Another perspective on linear operators comes from thinking about generalizing matrix-vector products from sums to integrals. Loosely speaking, you can imagine that as $n\rightarrow \infty$, you might have
$$
\sum_{j=1}^n A_{ij}\alpha_j \longrightarrow \int_{0}^1 A(x,y) f(y) dy
$$ in the appropriate sense. Here $A(x,y)$ is now a function (a "kernel"), so, just as you suspect, matrices turn into functions. This perspective is extremely useful, as it comes up in the theory of Green's functions, finite elements, and integral equations.
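Here is a small numerical illustration of that limit (Python; the kernel $A(x,y)=e^{-|x-y|}$ and the input $f(y)=\sin(\pi y)$ are arbitrary choices made just for the sketch): on an $n$-point grid, the Riemann sum of the integral is literally a matrix-vector product, and its value stabilizes as $n$ grows.

```python
import numpy as np

def apply_operator(n):
    """Riemann-sum discretization of (Lf)(x) = int_0^1 A(x, y) f(y) dy on an n-point grid."""
    y = (np.arange(n) + 0.5) / n                     # midpoint grid on [0, 1]
    A = np.exp(-np.abs(np.subtract.outer(y, y)))     # example kernel A(x, y) = exp(-|x - y|)
    f = np.sin(np.pi * y)                            # example input function
    return y, (A @ f) / n                            # matrix-vector product times grid spacing

for n in (10, 100, 1000):
    y, Lf = apply_operator(n)
    i = np.argmin(np.abs(y - 0.5))                   # look at the value nearest x = 0.5
    print(f"n = {n:5d},  (Lf)(0.5) ≈ {Lf[i]:.6f}")
```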
One thing I will mention is that issues of domain and range become much more subtle. Whereas in finite dimensions it is fairly simple to talk about the domain and range of a linear operator, infinite-dimensional operators can have issues with boundedness. For example, the derivative operator $L = \frac{d}{dx}$ is a very important linear operator to study. However, it is "unbounded" on most "standard" spaces of functions. This is essentially because we can have "small" functions that become extremely "big" after differentiation - take, for instance, $f(x) = \epsilon \sin(x/\epsilon^2)$, where $\epsilon$ is a very small number. $f(x)$ is very "small" because it has a very small amplitude, but $f^\prime(x)=\frac{1}{\epsilon}\cos(x/\epsilon^2)$ is very "large".
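A quick numerical sketch of that blow-up (Python; the interval $[0,2\pi]$ and the sup-norm are just convenient choices for the illustration):

```python
import numpy as np

x = np.linspace(0.0, 2 * np.pi, 200001)

for eps in (0.1, 0.05, 0.01):
    f = eps * np.sin(x / eps**2)                  # a "small" function
    f_prime = np.cos(x / eps**2) / eps            # its exact derivative: "large"
    print(f"eps = {eps:5.2f}:  max|f| = {np.max(np.abs(f)):.3f},  "
          f"max|f'| = {np.max(np.abs(f_prime)):.1f}")
```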
Eigenvalues and Eigenvectors
- In linear algebra, you ask whether, for a linear operator $L$, there exist special $v_j\in V$ and $\lambda_j\in \Bbb{C}$ such that
$$
Lv_j = \lambda_jv_j
$$ You can do the same in functional analysis, with a few extra technical details (for instance, instead of finitely many eigenvalues or even countably infinitely many, you can have an entire interval of eigenvalues). Eigenvalues are still called eigenvalues, but eigenvectors are usually called eigenfunctions. A great example is the differentiation operator: if $L = \frac{d}{dx}$, without getting too technical, you can think of any exponential function $v(x) = \exp(\lambda x)$ as an eigenfunction of $L$ with eigenvalue $\lambda$, since
$$
(Lv)(x) = \lambda v(x)
$$ Furthermore, the spectral decomposition of a Hermitian matrix (Hermitian means $A = A^\dagger$), usually written $A = Q\Lambda Q^\dagger$, where $\Lambda$ is a diagonal matrix and $Q$ is a unitary matrix, turns into something called the spectral theorem for self-adjoint operators, which states that any bounded "self-adjoint" operator $L$ (self-adjointness is also a symmetry condition) will have a decomposition of the form $L = U T_\lambda U^\dagger$, where $U$ is a unitary operator and $T_\lambda$ is a multiplication operator (the generalization of a diagonal matrix). An example of this is the relationship between convolution and the Fourier transform. Say we define a linear operator $L_k$ (convolution with a fixed function $k$) as follows:
$$
(L_kf)(x) = \int_{-\infty}^\infty k(x-y)f(y)dy
$$ Then, with some conditions on $k$, one can prove that this operator is bounded and self-adjoint, and furthermore, we have a "spectral decomposition" of $L_k$ of the form
$$
L_k = \mathcal{F}^*T_\lambda \mathcal{F}
$$ where $\mathcal{F}$ is the (unitary) Fourier transform:
$$
(\mathcal{F}f)(\xi) = \int_{-\infty}^\infty f(x)\exp(-2\pi i \xi x) dx
$$ This is sometimes written as
$$
(L_kf)(x) = \mathcal{F}^{-1} \big[\hat{k}\,\hat{f}\,\big](x)
$$ In other words, the eigenvalues of $L_k$ are given by $\hat{k}(\xi)$ (a continuum of them!) and the eigenfunctions of $L_k$ are the complex exponentials.
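A hedged, discrete analogue of this decomposition in Python (using the FFT, i.e. a circular convolution on a periodic grid rather than the continuous transform above; the kernel and input are arbitrary choices): applying the convolution operator directly agrees with multiplying by $\hat k$ in the Fourier domain and transforming back, which is the content of $L_k = \mathcal{F}^* T_\lambda \mathcal{F}$.

```python
import numpy as np

n = 512
x = np.arange(n) / n                                        # periodic grid on [0, 1)

k = np.exp(-((x - 0.5) / 0.05) ** 2)                        # an example convolution kernel
f = np.sin(2 * np.pi * x) + 0.3 * np.cos(6 * np.pi * x)     # an example input

# Direct application of the convolution operator (discrete circular convolution).
Lf_direct = np.array([np.sum(k[(i - np.arange(n)) % n] * f) / n for i in range(n)])

# "Diagonalized" application: Fourier transform, multiply by k-hat, transform back.
Lf_spectral = np.fft.ifft(np.fft.fft(k) / n * np.fft.fft(f)).real

print("max difference between the two:", np.max(np.abs(Lf_direct - Lf_spectral)))
```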
Here are two reasons why Sturm-Liouville theory is useful:
Physics reason: I noticed that you tagged this question as "physics". Almost every problem in quantum physics is a Sturm-Liouville eigenfunction problem: namely, solving Schrodinger's equation!
For example:
Take the infinite square well. Here, Schrodinger's equation is of the form
$ \mathcal L u + \lambda u = 0$, where $\mathcal L = \frac {d^2}{dx^2}$ (up to a constant factor). The boundary conditions are $u(0) = u(\pi) = 0$. This is a Sturm-Liouville eigenfunction problem. The eigenfunctions $u_n = \sin(n x)$ are the wavefunctions of the various states, and the eigenvalues $\lambda_n = n^2$ are their energies. (The allowed values of $n$ are $n = 1,2,3,\dots$; see the numerical sketch after these examples.)
For the simple harmonic oscillator, Schrodinger's equation can be converted to the Hermite equation, which is also of Sturm-Liouville form. The eigenfunctions and eigenvalues are (closely related to) the wavefunctions and energies of the simple harmonic oscillator states. (This example is a "singular" Sturm-Liouville system, because the domain extends to infinity.)
When you solve Schrodinger's equation for the hydrogen atom by separation of variables, the radial equation can be converted into the associated Laguerre's equation, which is another (singular) Sturm-Liouville system...
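Here is the numerical sketch promised above for the infinite square well (Python; the finite-difference grid is purely an illustration): discretizing $-\frac{d^2}{dx^2}$ with $u(0)=u(\pi)=0$ and diagonalizing the resulting matrix recovers eigenvalues close to $\lambda_n = n^2$, with eigenvectors close to samples of $\sin(nx)$.

```python
import numpy as np

N = 500
x = np.linspace(0.0, np.pi, N + 2)[1:-1]     # interior grid points; u(0) = u(pi) = 0
h = x[1] - x[0]

# Finite-difference matrix for -d^2/dx^2 (so that u'' + lambda*u = 0 becomes L u = lambda u).
L = (np.diag(2.0 * np.ones(N))
     - np.diag(np.ones(N - 1), 1)
     - np.diag(np.ones(N - 1), -1)) / h**2

evals, evecs = np.linalg.eigh(L)
print("first eigenvalues:", np.round(evals[:5], 3))   # close to 1, 4, 9, 16, 25
```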
Sturm-Liouville theory tells us (with certain caveats relating to dimensionality, infinite domains and singularities, and with subtleties depending on how closely-related the Sturm-Liouville problem is to the original Schrodinger problem) that:
There is a discrete sequence of energy levels, labelled by a quantum number $n$. This discreteness is why quantum physics is called "quantum". The energy values of the states form an ascending sequence $\lambda_1 \leq \lambda_2 \leq \lambda_3 \leq \dots$, which tends to infinity. So in particular, there is a well-defined notion of a "ground state" of minimal energy: this is the $n = 1$ state.
The wavefunctions of two energy levels with distinct energy values are orthogonal. This is an easy consequence of the self-adjointness property of the Hamiltonian operator. If we rescale the wavefunctions (and choose bases carefully if certain energy levels are degenerate), then we can make the wavefunctions orthonormal: i.e. $\int u_{n_1} u_{n_2} = \delta_{n_1, n_2}$.
Any general wavefunction $u$ can be written as a linear combination $ u = \sum_n c_n u_{n} $ of the pure energy-level wavefunctions $u_{n}$. This is the "completeness" statement in Sturm-Liouville theory. In physics language, the state $u$ is then a "superposition" of the pure energy-level states.
Moreover, it is easy to work out the coefficients $c_n$ in the linear decomposition $ u = \sum_n c_n u_{n} $. Since we have an orthogonality relation $\int u_{n_1} u_{n_2} = \delta_{n_1 , n_2}$, the $c_n$ coefficient is simply $c_n = \int u_{n} u$.
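To sketch the last two points numerically for the square well (Python; the wavefunction $u(x)=x(\pi-x)$ is an arbitrary choice): with the normalized eigenfunctions $u_n(x)=\sqrt{2/\pi}\sin(nx)$, the coefficients $c_n=\int_0^\pi u_n u\,dx$ let you rebuild $u$ as a superposition.

```python
import numpy as np

x = np.linspace(0.0, np.pi, 4000)
dx = x[1] - x[0]
u = x * (np.pi - x)                           # an arbitrary wavefunction with u(0) = u(pi) = 0

def u_n(n, x):
    """Normalized square-well eigenfunctions, orthonormal in L^2([0, pi])."""
    return np.sqrt(2.0 / np.pi) * np.sin(n * x)

N = 20
c = np.array([np.sum(u_n(n, x) * u) * dx for n in range(1, N + 1)])   # c_n = ∫ u_n u dx
u_rebuilt = sum(c[n - 1] * u_n(n, x) for n in range(1, N + 1))

print("max reconstruction error with 20 modes:", np.max(np.abs(u - u_rebuilt)))
```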
Maths reason: Suppose we wish to solve an equation of the form
$$ \mathcal L u = f,$$
where $\mathcal L$ is a second-order differential operator of Sturm-Liouville type, and $f$ is some given function. (For example, this could be a dynamical system, where $f$ represents a forcing term.)
A common strategy for solving an equation of this form is to start by finding a set of eigenvalues $\lambda_n$ and eigenfunctions $u_n$ satisfying the equation,
$$ \mathcal L u_n + \lambda_n w u_n = 0,$$ where $w$ is the weight function appearing in the Sturm-Liouville form of the operator.
You're right in saying that Sturm-Liouville theory doesn't actually help you to find these eigenvalues and eigenfunctions. Usually, people determine them using other methods, such as power series. [Many of these power series solutions to Sturm-Liouville eigenfunction problems are well known - perhaps you can read up on Legendre polynomials, Hermite polynomials, Laguerre polynomials, Chebyshev polynomials, and so on.]
But once we have found the $\lambda_n$'s and $u_n$ satisfying the eigenfunction equation, we can use these eigenvalues and eigenfunctions, together with results from Sturm-Liouville theory, to build solutions to the original equation $\mathcal L u = f$. Here is how we do it:
First, Sturm-Liouville theory tells us that the $u_n$'s are complete. So we can write the solution $u$ of our original equation as a linear combination of the $u_n$'s:
$$ u = \sum_n c_n u_n.$$
If we now substitute this ansatz into the equation $\mathcal L u = f$, we get
$$- \sum_n c_n \lambda_n w u_n = f$$
Sturm-Liouville theory also tells us that the $u_n$'s are orthonormal (after rescaling): $$\int w u_{n_1} u_{n_2} = \delta_{n_1, n_2}.$$
[By a standard integration-by-parts argument, this orthonormality property follows from the fact that a differential operator of the form $\mathcal L = \frac d {dx} p \frac d {dx} + q$ is self-adjoint.]
So if we multiply both sides by $u_k $ and integrate, we get
$$ c_k = - \tfrac 1 {\lambda_k}\int f u_k .$$
We have now solved the equation! The answer is
$$ u = \sum_n \left( - \tfrac 1 {\lambda_n}\int f u_n \right) u_n.$$
[If you like, you can think of $-\sum_n \tfrac 1 {\lambda_n }u_n(x')u_n(x)$ as the Green's function for the operator $\mathcal L$.]
To summarise, Sturm-Liouville theory doesn't tell us how to find the eigenfunctions and eigenvalues that satisfy $\mathcal L u + \lambda w u = 0$; this is done usually using power series methods. Instead, Sturm-Liouville theory tells us information about these eigenfunctions and eigenvalues that enable us to use them as building blocks for building solutions to general problems of the form $\mathcal L u = f$.
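Here is a minimal Python sketch of this recipe for the concrete operator $\mathcal L = \frac{d^2}{dx^2}$ on $[0,\pi]$ with $u(0)=u(\pi)=0$ (so $u_n = \sqrt{2/\pi}\sin(nx)$, $\lambda_n = n^2$, and $w\equiv 1$); the right-hand side $f(x)=x$ is an arbitrary choice for which the exact solution is easy to write down for comparison.

```python
import numpy as np

x = np.linspace(0.0, np.pi, 4000)
dx = x[1] - x[0]

f = x.copy()                                   # example right-hand side: f(x) = x
u_exact = x * (x**2 - np.pi**2) / 6.0          # exact solution of u'' = x with u(0) = u(pi) = 0

def u_n(n, x):
    return np.sqrt(2.0 / np.pi) * np.sin(n * x)    # orthonormal eigenfunctions

N = 200
u = np.zeros_like(x)
for n in range(1, N + 1):
    lam = n**2                                 # eigenvalue: u_n'' + lam * u_n = 0
    c = -np.sum(f * u_n(n, x)) * dx / lam      # c_n = -(1 / lambda_n) * ∫ f u_n dx
    u += c * u_n(n, x)

print("max error of the 200-term series:", np.max(np.abs(u - u_exact)))
```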
[This technique can be generalised for equations of the form
$ \mathcal L u + \mu w u = f,$
where $\mu$ is a constant. As long as $\mu$ is not equal to any $\lambda_n$, one can use the same method to show that the solution is
$ u = \sum_n \left( \tfrac 1 {\mu - \lambda_n}\int f u_n \right) u_n.$
Since Sturm-Liouville theory tells us that the $\lambda_n$'s form a discrete set (more specifically, an increasing sequence tending to infinity), it follows that this solution method is valid for "almost" every choice of $\mu$.]
The classic book on this subject is E. C. Titchmarsh's 1942 text, Eigenfunction Expansions Associated with Second-Order Differential Equations, Part I. This was written by a master who studied under G. H. Hardy. Titchmarsh was the "Savilian Professor of Geometry in the University of Oxford."
Titchmarsh starts from basic Complex Analysis and Advanced Calculus, and by page 13 he has stated the completeness theorem for regular expansions. In the next few pages, he proves that the regular expansions are complete by showing that they are asymptotically the same as ordinary Fourier expansions, with specific bounds. He makes this topic look easy, which it is not. He also deals with the singular problems, the spectrum, and spectral measures. The heart of Titchmarsh's analysis relies on Complex Analysis, of which he was a master. Much of what is found in this text is his original research. Titchmarsh is still referenced by those in the field.