[Math] Can Sturm-Liouville theory actually solve ODEs

calculus, linear algebra, ordinary differential equations, physics, sturm-liouville

My teacher talked about Sturm-Liouville theory, and we learned that any second-order linear differential equation can be put into self-adjoint form. What is this? Well, the book says it is something of the form:

$$\frac{d}{dx}\left[p(x)\frac{du(x)}{dx}\right] + q(x)u(x) \tag {1}$$

The book also says that we can put any second order linear ODE into the self-adjoint form:

Consider the second order linear ODE:

$$Lu(x) = p_0(x)\frac{d^2}{dx^2}u(x) + p_1(x)\frac{d}{dx}u(x) + p_2(x)u(x)$$

Multiply it by:

$$\frac{1}{p_0(x)}\exp\left[\int^x\frac{p_1(t)}{p_0(t)}\ dt\right]$$

to obtain:

$$\frac{1}{p_0(x)}\exp\left[\int^x\frac{p_1(t)}{p_0(t)}\ dt\right]Lu(x) = \frac{d}{dx}\left\{\exp\left[\int^x\frac{p_1(t)}{p_0(t)}\ dt\right]\frac{du(x)}{dx}\right\} + \frac{p_2(x)}{p_0(x)}\cdot \exp\left[\int^x\frac{p_1(t)}{p_0(t)} \ dt\right]u$$

Which is in the self-adjoint form, whatever that means. We can see by comparing with eq $(1)$ and taking $p(x) = \exp\left[\int^x\frac{p_1(t)}{p_0(t)}\ dt\right]$ and $q(x) = \frac{p_2(x)}{p_0(x)}\cdot \exp\left[\int^x\frac{p_1(t)}{p_0(t)} \ dt\right]$.

Then the book, and also my teacher, says that we're interested in finding eigenfunctions and eigenvalues for:

$$\frac{d}{dx}\left[p(x)\frac{du(x)}{dx}\right] + q(x)u(x) + \lambda w(x) u(x) = \\ Lu(x) + \lambda w(x) u(x) $$

for some density function $w(x)$ (what), where $\lambda$ is an eigenvalue associated with the eigenfunction $u(x)$, which the book calls $u_{\lambda}$ because it depends on $\lambda$.

The book and my teacher then talk about completeness of eigenvalues and eigenfunctions, and that's it. Nothing more is said about Sturm-Liouville theory; we then jumped to other things, so I'm asking what all this is used for.

It also mentions boundary conditions. What are boundary conditions? Are they simply the initial conditions for some ODE? Because the boundary conditions given in the book are of the form

$$a_1 u(a) + a_2 u'(a) = 0\\ b_1 u(b) + b_2 u'(b) = 0$$

but aren't initial conditions things in the form $u'(a) = b, u''(a) = c$?

When I look at the examples on Wikipedia, they all assume we already know that some eigenfunction $u(x)$ is a solution with eigenvalue $\lambda$, and in these slides I also see no examples of how to actually find the eigenvalues and eigenfunctions…

So what is the usefulness of the self-adjoint formula and of the sturm-liouville theory in general?

Best Answer

Here are two reasons why Sturm-Liouville theory is useful:

Physics reason: I noticed that you tagged this question as "physics". Almost every problem in quantum physics is a Sturm-Liouville eigenfunction problem: namely, solving Schrodinger's equation!

For example:

  • Take the infinite square well. Here, Schrodinger's equation is of the form $ \mathcal L u + \lambda u = 0$, where $\mathcal L = \frac {d^2}{dx^2}$ (up to a constant factor). The boundary conditions are $u(0) = u(\pi) = 0$. This is a Sturm-Liouville eigenfunction problem. The eigenfunctions $u_n = \sin(n x)$ are the wavefunctions of the various states, and the eigenvalues $\lambda_n = n^2$ are their energies. (The allowed values of $n$ are $n = 1,2,3,\dots$)

  • For the simple harmonic oscillator, Schrodinger's equation can be converted to the Hermite equation, which is also of Sturm-Liouville form. The eigenfunctions and eigenvalues are (closely related to) the wavefunctions and energies of the simple harmonic oscillator states. (This example is a "singular" Sturm-Liouville system, because the domain extends to infinity.)

  • When you solve Schrodinger's equation for the hydrogen atom by separation of variables, the radial equation can be converted into the associated Laguerre's equation, which is another (singular) Sturm-Liouville system...
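As a worked instance of the conversion mentioned in the harmonic oscillator bullet, here is a sketch for the Hermite equation, written with the question's $p_0$, $p_1$ notation (the identification $\lambda = 2n$ is just the labelling chosen here). The Hermite equation is

$$u'' - 2x u' + 2n u = 0,$$

so $p_0 = 1$ and $p_1 = -2x$, and the multiplying factor is $\exp\left[\int^x (-2t)\, dt\right] = e^{-x^2}$. Multiplying through gives

$$e^{-x^2}u'' - 2x e^{-x^2}u' + 2n e^{-x^2}u = \frac{d}{dx}\left[e^{-x^2}\frac{du}{dx}\right] + 2n\, e^{-x^2} u = 0.$$

Comparing with the Sturm-Liouville form, $p(x) = e^{-x^2}$, $q(x) = 0$, $w(x) = e^{-x^2}$ and $\lambda = 2n$. This is why the Hermite polynomials are orthogonal with respect to the weight $e^{-x^2}$.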

Sturm-Liouville theory tells us (with certain caveats relating to dimensionality, infinite domains and singularities, and with subtleties depending on how closely-related the Sturm-Liouville problem is to the original Schrodinger problem) that:

  • There is a discrete sequence of energy levels, labelled by a quantum number $n$. This discreteness is why quantum physics is called "quantum". The energy values of the states form an ascending sequence $\lambda_1 \leq \lambda_2 \leq \lambda_3 \leq \dots$, which tends to infinity. So in particular, there is a well-defined notion of a "ground state" of minimal energy: this is the $n = 1$ state.

  • The wavefunctions of two energy levels with distinct energy values are orthogonal. This is an easy consequence of the self-adjointness property of the Hamiltonian operator. If we rescale the wavefunctions (and choose bases carefully if certain energy levels are degenerate), then we can make the wavefunctions orthonormal: i.e. $\int u_{n_1} u_{n_2} = \delta_{n_1, n_2}$.

  • Any general wavefunction $u$ can be written as a linear combination $ u = \sum_n c_n u_{n} $ of the pure energy-level wavefunctions $u_{n}$. This is the "completeness" statement in Sturm-Liouville theory. In physics language, the state $u$ is then a "superposition" of the pure energy-level states.

Moreover, it is easy to work out the coefficients $c_n$ in the linear decomposition $ u = \sum_n c_n u_{n} $. Since we have an orthogonality relation $\int u_{n_1} u_{n_2} = \delta_{n_1 , n_2}$, the $c_n$ coefficient is simply $c_n = \int u_{n} u$.
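To make the orthonormality and the coefficient formula concrete, here is a minimal numerical sketch for the infinite square well on $[0, \pi]$. It assumes the rescaled eigenfunctions $u_n = \sqrt{2/\pi}\,\sin(nx)$; the state $x(\pi - x)$, the helper names and the cutoff `N` are hypothetical choices made only for illustration.

```python
import math

def integrate(g, a, b, n=2000):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

def u_n(n):
    """Rescaled square-well eigenfunction sqrt(2/pi) * sin(n x) on [0, pi]."""
    return lambda x: math.sqrt(2 / math.pi) * math.sin(n * x)

def target(x):
    # Hypothetical state to decompose; it satisfies u(0) = u(pi) = 0.
    return x * (math.pi - x)

# Orthonormality check: the integral of u_m * u_n should be the Kronecker delta.
gram = [[integrate(lambda x: u_n(m)(x) * u_n(n)(x), 0.0, math.pi)
         for n in range(1, 4)] for m in range(1, 4)]

# Coefficients c_n = integral of u_n * u, for n = 1, ..., N.
N = 15
coeffs = [integrate(lambda x, n=n: u_n(n)(x) * target(x), 0.0, math.pi)
          for n in range(1, N + 1)]

def partial_sum(x):
    """Truncated expansion sum_n c_n u_n(x)."""
    return sum(c * u_n(n)(x) for n, c in enumerate(coeffs, start=1))

print(gram)
print(partial_sum(math.pi / 2), target(math.pi / 2))
```

The matrix `gram` should come out close to the identity, and the truncated sum should approach the original function as `N` grows.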

Maths reason: Suppose we wish to solve an equation of the form $$ \mathcal L u = f,$$ where $\mathcal L$ is a second-order differential operator of Sturm-Liouville type, and $f$ is some given function. (For example, this could be a dynamical system, where $f$ represents a forcing term.)

A common strategy for solving an equation of this form is to start by finding a set of eigenvalues $\lambda_n$ and eigenfunctions $u_n$ satisfying the equation $$ \mathcal L u_n + \lambda_n w u_n = 0.$$ You're right in saying that Sturm-Liouville theory doesn't actually help you to find these eigenvalues and eigenfunctions. Usually, people determine them using other methods, such as power series. [Many of these power series solutions to Sturm-Liouville eigenfunction problems are well known - perhaps you can read up on Legendre polynomials, Hermite polynomials, Laguerre polynomials, Chebyshev polynomials, and so on.]
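If no closed form is available, the eigenvalues can also be found numerically. The sketch below uses a shooting method, which is a different technique from the power series mentioned above: integrate $u'' = -\lambda u$ from $u(0) = 0$, $u'(0) = 1$ and adjust $\lambda$ until the far boundary condition $u(\pi) = 0$ holds. It is written for the square-well operator $\mathcal L = \frac{d^2}{dx^2}$ with $w = 1$; the function names, the step count and the hand-picked brackets around each root are assumptions made for illustration.

```python
import math

def shoot(lam, steps=2000):
    """Integrate u'' = -lam * u on [0, pi] with u(0) = 0, u'(0) = 1 using RK4.

    Returns u(pi); lam is an eigenvalue exactly when this vanishes.
    """
    h = math.pi / steps
    u, v = 0.0, 1.0  # v stands for u'
    for _ in range(steps):
        # First-order system: u' = v, v' = -lam * u
        k1u, k1v = v, -lam * u
        k2u, k2v = v + 0.5 * h * k1v, -lam * (u + 0.5 * h * k1u)
        k3u, k3v = v + 0.5 * h * k2v, -lam * (u + 0.5 * h * k2u)
        k4u, k4v = v + h * k3v, -lam * (u + h * k3u)
        u += h * (k1u + 2 * k2u + 2 * k3u + k4u) / 6
        v += h * (k1v + 2 * k2v + 2 * k3v + k4v) / 6
    return u

def find_eigenvalue(lo, hi, tol=1e-10):
    """Bisect on lam in [lo, hi] until shoot(lam) changes sign within tol."""
    f_lo = shoot(lo)
    if f_lo * shoot(hi) > 0:
        raise ValueError("bracket does not contain a sign change")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = shoot(mid)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)

# Brackets chosen by hand around the first three roots.
eigenvalues = [find_eigenvalue(n * n - 0.5, n * n + 0.5) for n in (1, 2, 3)]
print(eigenvalues)
```

For this operator the results should land near $n^2$, matching the square-well eigenvalues quoted earlier. For a general $p$, $q$, $w$ only the right-hand side of the first-order system would change.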

But once we have found the $\lambda_n$'s and $u_n$ satisfying the eigenfunction equation, we can use these eigenvalues and eigenfunctions, together with results from Sturm-Liouville theory, to build solutions to the original equation $\mathcal L u = f$. Here is how we do it:

First, Sturm-Liouville theory tells us that the $u_n$'s are complete. So we can write the solution $u$ of our original equation as a linear combination of the $u_n$'s: $$ u = \sum_n c_n u_n.$$

If we now substitute this ansatz into the equation $\mathcal L u = f$, we get $$- \sum_n c_n \lambda_n w u_n = f$$

Sturm-Liouville theory also tells us that the $u_n$'s are orthonormal (after rescaling): $$\int w u_{n_1} u_{n_2} = \delta_{n_1, n_2}.$$ [By a standard integration-by-parts argument, this orthonormality property follows from the fact that a differential operator of the form $\mathcal L = \frac d {dx} p \frac d {dx} + q$ is self-adjoint.]
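Here is a sketch of that integration-by-parts argument, assuming $u_m$ and $u_n$ both satisfy the homogeneous boundary conditions at the endpoints $a$ and $b$ (the labels $m$, $n$, $a$, $b$ are just notation chosen here). Since $\mathcal L u_n = -\lambda_n w u_n$ and $\mathcal L u_m = -\lambda_m w u_m$,

$$(\lambda_m - \lambda_n)\int_a^b w\, u_m u_n \, dx = \int_a^b \left( u_m \mathcal L u_n - u_n \mathcal L u_m \right) dx = \Big[\, p \left( u_m u_n' - u_n u_m' \right) \Big]_a^b = 0.$$

The $q$ terms cancel in the middle integrand, what remains is the total derivative of $p\,(u_m u_n' - u_n u_m')$, and the boundary term vanishes because both functions satisfy the same boundary condition at each endpoint. So whenever $\lambda_m \neq \lambda_n$, the weighted integral $\int_a^b w\, u_m u_n \, dx$ must be zero.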

So if we multiply both sides by $u_k $ and integrate, we get $$ c_k = - \tfrac 1 {\lambda_k}\int f u_k .$$

We have now solved the equation! The answer is $$ u = \sum_n \left( - \tfrac 1 {\lambda_n}\int f u_n \right) u_n.$$ [If you like, you can think of $-\sum_n \tfrac 1 {\lambda_n }u_n(x')u_n(x)$ as the Green's function for the operator $\mathcal L$.]
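As a sanity check of this formula, here is a minimal sketch for the simplest case: $\mathcal L = \frac{d^2}{dx^2}$ on $[0, \pi]$ with $u(0) = u(\pi) = 0$, $w = 1$, eigenfunctions $\sqrt{2/\pi}\,\sin(nx)$ and $\lambda_n = n^2$. The forcing $f = 1$, the cutoff `N` and the helper names are hypothetical choices; for this $f$ the exact solution of $u'' = 1$ is $u = \tfrac12 x(x - \pi)$, which gives something to compare against.

```python
import math

def integrate(g, a, b, n=2000):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

def eigenfunction(n, x):
    """Orthonormal eigenfunction of d^2/dx^2 on [0, pi] with zero boundary values."""
    return math.sqrt(2 / math.pi) * math.sin(n * x)

def forcing(x):
    # Hypothetical right-hand side f(x) in L u = f.
    return 1.0

# c_n = -(1 / lambda_n) * integral of f * u_n, with lambda_n = n^2.
N = 40
coeffs = []
for n in range(1, N + 1):
    overlap = integrate(lambda x: forcing(x) * eigenfunction(n, x), 0.0, math.pi)
    coeffs.append(-overlap / (n * n))

def solution(x):
    """Truncated eigenfunction expansion of the solution of u'' = f."""
    return sum(c * eigenfunction(n, x) for n, c in enumerate(coeffs, start=1))

def exact(x):
    """Closed-form solution of u'' = 1 with u(0) = u(pi) = 0."""
    return 0.5 * x * (x - math.pi)

for x in (0.5, 1.0, math.pi / 2, 2.5):
    print(x, solution(x), exact(x))
```

The two columns should agree closely, with the small difference coming from truncating the sum at `N` terms.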

To summarise, Sturm-Liouville theory doesn't tell us how to find the eigenfunctions and eigenvalues that satisfy $\mathcal L u + \lambda w u = 0$; this is usually done using power series methods. Instead, Sturm-Liouville theory gives us information about these eigenfunctions and eigenvalues that enables us to use them as building blocks for solutions to general problems of the form $\mathcal L u = f$.

[This technique can be generalised to equations of the form $ \mathcal L u + \mu w u = f,$ where $\mu$ is a constant number. As long as $\mu$ is not equal to any $\lambda_n$, one can use the same method to show that the solution is $ u = \sum_n \left( \tfrac 1 {\mu - \lambda_n}\int f u_n \right) u_n.$ Since Sturm-Liouville theory tells us that the $\lambda_n$'s form a discrete set (more specifically, a positive, strictly-increasing sequence), it follows that this solution method is valid for "almost" every choice of $\mu$.]
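For completeness, the extra step in the generalised case is short. Substituting $u = \sum_n c_n u_n$ into $\mathcal L u + \mu w u = f$ and using $\mathcal L u_n = -\lambda_n w u_n$ gives

$$\sum_n c_n \left( \mu - \lambda_n \right) w u_n = f.$$

Multiplying by $u_k$, integrating, and using $\int w u_n u_k = \delta_{n,k}$ leaves

$$c_k \left( \mu - \lambda_k \right) = \int f u_k .$$

Dividing by $\mu - \lambda_k$ is only possible when $\mu \neq \lambda_k$, which is where the restriction on $\mu$ comes from.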
