It may be helpful to understand the similar fact about left and right eigenvectors of a matrix, in order to appreciate the role being played here by self-adjointness of the operator $L$.
In any case the computation is straightforward, applying what we know in the case that $u,v$ are eigenfunctions (satisfying the unspecified but requisite homogeneous boundary conditions) corresponding to distinct eigenvalues $\lambda_1 \neq \lambda_2$ respectively:
$$ L(u) = -\lambda_1 \sigma u $$
$$ L(v) = -\lambda_2 \sigma v $$
where the weight function $\sigma(x)$ on $[a,b]$ must be assumed positive (a.e.) there if we want the natural bilinear form to be an inner product, with the corresponding sense of "orthogonality":
$$ 0 = \int_a^b \left[\, u L(v) - v L(u) \,\right] \mathrm{d} x
= (\lambda_1 - \lambda_2) \int_a^b u v \; \sigma(x) \, \mathrm{d} x$$
Since the eigenvalues are unequal, the last integral must vanish, and thus $u,v$ are orthogonal with respect to this weighted inner product on $[a,b]$.
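This weighted orthogonality is easy to check numerically. Here is a sketch using the Chebyshev polynomials $T_n$, which satisfy a Sturm-Liouville problem on $[-1,1]$ with weight $\sigma(x) = 1/\sqrt{1-x^2}$ (the particular degrees 2 and 3 are arbitrary choices for illustration):

```python
# Numerical sketch of weighted orthogonality, using Chebyshev polynomials
# T_n, which solve a Sturm-Liouville problem on [-1, 1] with weight
# sigma(x) = 1/sqrt(1 - x^2).
import numpy as np
from scipy.integrate import quad

def T(n, x):
    # Chebyshev polynomial of the first kind, trigonometric form
    return np.cos(n * np.arccos(x))

def weighted_inner(m, n):
    # int_{-1}^{1} T_m(x) T_n(x) sigma(x) dx; the algebraic endpoint
    # singularity of sigma is handled by quad's 'alg' weight option
    val, _ = quad(lambda x: T(m, x) * T(n, x), -1.0, 1.0,
                  weight='alg', wvar=(-0.5, -0.5))
    return val

print(weighted_inner(2, 3))  # distinct eigenvalues: ~0
print(weighted_inner(3, 3))  # equal eigenvalues: ~pi/2
```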
Here are two reasons why Sturm-Liouville theory is useful:
Physics reason: I noticed that you tagged this question as "physics". Almost every problem in quantum physics is a Sturm-Liouville eigenfunction problem: namely, solving Schrodinger's equation!
For example:
Take the infinite square well. Here, Schrodinger's equation is of the form
$ \mathcal L u + \lambda u = 0$, where $\mathcal L = \frac {d^2}{dx^2}$ (up to a constant factor). The boundary conditions are $u(0) = u(\pi) = 0$. This is a Sturm-Liouville eigenfunction problem. The eigenfunctions $u_n = \sin(n x)$ are the wavefunctions of the various states, and the eigenvalues $\lambda_n = n^2$ are their energies. (The allowed values of $n$ are $n = 1,2,3,\dots$)
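These square-well eigenvalues can be checked numerically; here is a minimal finite-difference sketch (the grid size $N$ is an arbitrary choice):

```python
# Finite-difference sketch of the infinite square well: discretise
# -u'' = lambda u on (0, pi) with u(0) = u(pi) = 0 and check that the
# lowest eigenvalues approach lambda_n = n^2.
import numpy as np
from scipy.linalg import eigh_tridiagonal

N = 1000                            # interior grid points (arbitrary)
h = np.pi / (N + 1)
main = np.full(N, 2.0) / h**2       # diagonal of -d^2/dx^2
off = np.full(N - 1, -1.0) / h**2   # off-diagonal
evals = eigh_tridiagonal(main, off, eigvals_only=True)
print(evals[:4])  # approximately [1, 4, 9, 16]
```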
For the simple harmonic oscillator, Schrodinger's equation can be converted to the Hermite equation, which is also of Sturm-Liouville form. The eigenfunctions and eigenvalues are (closely related to) the wavefunctions and energies of the simple harmonic oscillator states. (This example is a "singular" Sturm-Liouville system, because the domain extends to infinity.)
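The orthogonality in this singular case can also be verified directly. A sketch (the degrees 1 and 3 are arbitrary, chosen so that the product $H_1 H_3$ is an even function and the vanishing is genuine orthogonality rather than a parity effect):

```python
# Check (sketch) that Hermite polynomials H_1 and H_3 are orthogonal with
# respect to the weight exp(-x^2) on (-inf, inf), as Sturm-Liouville theory
# predicts for this singular problem.
import numpy as np
from numpy.polynomial.hermite import Hermite
from scipy.integrate import quad

H1 = Hermite.basis(1)    # physicists' Hermite polynomials
H3 = Hermite.basis(3)
off_diag, _ = quad(lambda x: H1(x) * H3(x) * np.exp(-x**2), -np.inf, np.inf)
norm3, _ = quad(lambda x: H3(x)**2 * np.exp(-x**2), -np.inf, np.inf)
print(off_diag)  # ~0
print(norm3)     # 2^3 * 3! * sqrt(pi) ~ 85.08
```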
When you solve Schrodinger's equation for the hydrogen atom by separation of variables, the radial equation can be converted into the associated Laguerre equation, which is another (singular) Sturm-Liouville system...
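The same kind of check works here: the associated Laguerre polynomials are orthogonal on $(0,\infty)$ with weight $x^{\alpha} e^{-x}$. A sketch (the choice $\alpha = 1$ and the degrees 2 and 3 are arbitrary):

```python
# Sketch: associated Laguerre polynomials L_n^{(1)} are orthogonal on
# (0, inf) with weight x * exp(-x), another singular Sturm-Liouville system.
import numpy as np
from scipy.integrate import quad
from scipy.special import genlaguerre

L2 = genlaguerre(2, 1)
L3 = genlaguerre(3, 1)
off_diag, _ = quad(lambda x: L2(x) * L3(x) * x * np.exp(-x), 0, np.inf)
norm3, _ = quad(lambda x: L3(x)**2 * x * np.exp(-x), 0, np.inf)
print(off_diag)  # ~0
print(norm3)     # Gamma(3 + 1 + 1) / 3! = 4
```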
Sturm-Liouville theory tells us (with certain caveats relating to dimensionality, infinite domains and singularities, and with subtleties depending on how closely-related the Sturm-Liouville problem is to the original Schrodinger problem) that:
There is a discrete sequence of energy levels, labelled by a quantum number $n$. This discreteness is why quantum physics is called "quantum". The energy values of the states form an ascending sequence $\lambda_1 \leq \lambda_2 \leq \lambda_3 \leq \dots$, which tends to infinity. So in particular, there is a well-defined notion of a "ground state" of minimal energy: this is the $n = 1$ state.
The wavefunctions of two energy levels with distinct energy values are orthogonal. This is an easy consequence of the self-adjointness property of the Hamiltonian operator. If we rescale the wavefunctions (and choose bases carefully if certain energy levels are degenerate), then we can make the wavefunctions orthonormal: i.e. $\int u_{n_1} u_{n_2} = \delta_{n_1, n_2}$.
Any general wavefunction $u$ can be written as a linear combination $ u = \sum_n c_n u_{n} $ of the pure energy-level wavefunctions $u_{n}$. This is the "completeness" statement in Sturm-Liouville theory. In physics language, the state $u$ is then a "superposition" of the pure energy-level states.
Moreover, it is easy to work out the coefficients $c_n$ in the linear decomposition $ u = \sum_n c_n u_{n} $. Since we have an orthogonality relation $\int u_{n_1} u_{n_2} = \delta_{n_1 , n_2}$, the $c_n$ coefficient is simply $c_n = \int u_{n} u$.
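For the square well, this recipe is just the Fourier sine series. A small sketch (the test function $u(x)=x(\pi-x)$ and the truncation at 39 terms are arbitrary choices):

```python
# Sketch: expand u(x) = x(pi - x) in the orthonormal square-well
# eigenfunctions u_n(x) = sqrt(2/pi) sin(n x), with c_n = int u_n u.
import numpy as np
from scipy.integrate import quad

u = lambda x: x * (np.pi - x)
un = lambda n, x: np.sqrt(2.0 / np.pi) * np.sin(n * x)

c = [quad(lambda x, n=n: un(n, x) * u(x), 0, np.pi)[0] for n in range(1, 40)]
x0 = 1.0
approx = sum(cn * un(n, x0) for n, cn in enumerate(c, start=1))
print(approx, u(x0))  # the truncated series should be close to u(1.0)
```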
Maths reason: Suppose we wish to solve an equation of the form
$$ \mathcal L u = f,$$
where $\mathcal L$ is a second-order differential operator of Sturm-Liouville type, and $f$ is some given function. (For example, this could be a dynamical system, where $f$ represents a forcing term.)
A common strategy for solving an equation of this form is to start by finding a set of eigenvalues $\lambda_n$ and eigenfunctions $u_n$ satisfying the equation,
$$ \mathcal L u_n + \lambda_n w u_n = 0.$$
You're right in saying that Sturm-Liouville theory doesn't actually help you to find these eigenvalues and eigenfunctions. Usually, people determine them using other methods, such as power series. [Many of these power series solutions to Sturm-Liouville eigenfunction problems are well known - perhaps you can read up on Legendre polynomials, Hermite polynomials, Laguerre polynomials, Chebyshev polynomials, and so on.]
But once we have found the $\lambda_n$'s and $u_n$ satisfying the eigenfunction equation, we can use these eigenvalues and eigenfunctions, together with results from Sturm-Liouville theory, to build solutions to the original equation $\mathcal L u = f$. Here is how we do it:
First, Sturm-Liouville theory tells us that the $u_n$'s are complete. So we can write the solution $u$ of our original equation as a linear combination of the $u_n$'s:
$$ u = \sum_n c_n u_n.$$
If we now substitute this ansatz into the equation $\mathcal L u = f$, we get
$$- \sum_n c_n \lambda_n w u_n = f$$
Sturm-Liouville theory also tells us that the $u_n$'s are orthonormal (after rescaling): $$\int w u_{n_1} u_{n_2} = \delta_{n_1, n_2}.$$
[By a standard integration-by-parts argument, this orthonormality property follows from the fact that a differential operator of the form $\mathcal L = \frac d {dx} p \frac d {dx} + q$ is self-adjoint.]
So if we multiply both sides by $u_k $ and integrate, we get
$$ c_k = - \tfrac 1 {\lambda_k}\int f u_k .$$
We have now solved the equation! The answer is
$$ u = \sum_n \left( - \tfrac 1 {\lambda_n}\int f u_n \right) u_n.$$
[If you like, you can think of $-\sum_n \tfrac 1 {\lambda_n }u_n(x')u_n(x)$ as the Green's function for the operator $\mathcal L$.]
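Here is a small numerical sketch of this recipe for $\mathcal L = \frac{d^2}{dx^2}$ on $(0,\pi)$ with $u(0)=u(\pi)=0$ and $w=1$. The choice $f(x)=x$ and the truncation are arbitrary; the exact solution $u(x)=x(x^2-\pi^2)/6$ is easy to verify by differentiating twice:

```python
# Sketch: solve u'' = f with f(x) = x and u(0) = u(pi) = 0 by eigenfunction
# expansion, using c_n = -(1/lambda_n) int f u_n, and compare with the
# exact solution u(x) = x (x^2 - pi^2) / 6.
import numpy as np
from scipy.integrate import quad

un = lambda n, x: np.sqrt(2.0 / np.pi) * np.sin(n * x)   # orthonormal u_n
f = lambda x: x

c = [-(1.0 / n**2) * quad(lambda x, n=n: f(x) * un(n, x), 0, np.pi)[0]
     for n in range(1, 200)]
x0 = 1.5
series = sum(cn * un(n, x0) for n, cn in enumerate(c, start=1))
exact = x0 * (x0**2 - np.pi**2) / 6.0
print(series, exact)  # should agree to a few decimal places
```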
To summarise, Sturm-Liouville theory doesn't tell us how to find the eigenfunctions and eigenvalues that satisfy $\mathcal L u + \lambda w u = 0$; this is done usually using power series methods. Instead, Sturm-Liouville theory tells us information about these eigenfunctions and eigenvalues that enable us to use them as building blocks for building solutions to general problems of the form $\mathcal L u = f$.
[This technique can be generalised for equations of the form
$ \mathcal L u + \mu w u = f,$
where $\mu$ is a constant. As long as $\mu$ is not equal to any $\lambda_n$, one can use the same method to show that the solution is
$ u = \sum_n \left( \tfrac 1 {\mu - \lambda_n}\int f u_n \right) u_n.$
Since Sturm-Liouville theory tells us that the $\lambda_n$'s form a discrete set (more specifically, an increasing sequence tending to infinity), it follows that this solution method is valid for "almost" every choice of $\mu$.]
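A sketch of this shifted version, again for $\mathcal L = \frac{d^2}{dx^2}$ on $(0,\pi)$ with $w = 1$. The values $\mu = 2.5$ and $f(x) = x$ are arbitrary; the closed-form comparison solution follows from elementary ODE methods:

```python
# Sketch: solve u'' + mu u = x with u(0) = u(pi) = 0, mu not an eigenvalue,
# via c_n = (1/(mu - lambda_n)) int f u_n, and compare with the closed form
# u(x) = x/mu - (pi / (mu sin(sqrt(mu) pi))) sin(sqrt(mu) x).
import numpy as np
from scipy.integrate import quad

mu = 2.5                       # avoids every eigenvalue n^2
un = lambda n, x: np.sqrt(2.0 / np.pi) * np.sin(n * x)
f = lambda x: x

c = [quad(lambda x, n=n: f(x) * un(n, x), 0, np.pi)[0] / (mu - n**2)
     for n in range(1, 400)]
x0 = 2.0
series = sum(cn * un(n, x0) for n, cn in enumerate(c, start=1))
r = np.sqrt(mu)
exact = x0 / mu - (np.pi / (mu * np.sin(r * np.pi))) * np.sin(r * x0)
print(series, exact)
```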
Best Answer
Suppose you are working on a bounded interval $[a,b]$ with Sturm-Liouville operator $L$ given by $$ Lf = \frac{1}{w(x)}\left[-\frac{d}{dx}\left(p(x)\frac{df}{dx}\right)+q(x)f\right]. $$ Assume that $p,q,w$ are real functions, that $p$ is continuously differentiable, and that $q$, $w$ are continuous on $[a,b]$ with $w > 0$ and $p > 0$ on $[a,b]$. The Sturm-Liouville eigenvalue problem for $L$ is posed with the domain $\mathcal{D}(L)$ restricted to twice continuously differentiable functions $f$ on $[a,b]$ that satisfy the endpoint conditions $$ \cos\alpha\, f(a)+\sin\alpha\, f'(a) =0,\\ \cos\beta\, f(b) + \sin\beta\, f'(b) = 0. $$

The natural space in which to consider this problem is $L^2_{w}[a,b]$, with inner product $$ (f,g)_w = \int_{a}^{b}f(x)\overline{g(x)}w(x)\,dx. $$ For $f,g\in\mathcal{D}(L)$ (meaning both satisfy the endpoint conditions), the Lagrange identity gives $$ (Lf,g)_w = (f,Lg)_w. $$

Any time you have a symmetric operator on a complex inner product space, the eigenvalues must be real. To see why, suppose $Lf = \lambda f$ with $f \ne 0$. Then $$ \lambda (f,f)_w = (Lf,f)_w = (f,Lf)_w = (f,\lambda f)_w = \overline{\lambda}(f,f)_w. $$ So either (a) $\lambda=\overline{\lambda}$ or (b) $(f,f)_w=0$, which would mean $f=0$; since an eigenfunction is nonzero, we conclude $\lambda$ is real.

The operator $L$ is also real, meaning that $\overline{Lf}=L\overline{f}$. Therefore, if $Lf=\lambda f$, then, because $\lambda$ is real, $$ L\overline{f} = \overline{Lf} = \overline{\lambda f}=\lambda\overline{f}. $$ In other words, if $f$ is an eigenfunction, then so is $\overline{f}$, and hence so are $\Re f$ and $\Im f$; moreover $f$ and $\overline{f}$ satisfy the same endpoint conditions, because those conditions are formulated with real coefficients.

To see that $\{ \Re f,\Im f\}$ must be a linearly dependent set of solutions, start by noticing that if $g$ is a real eigenfunction with $g(a)=g'(a)=0$, then $g$ must be zero by classical uniqueness results for second-order ODEs.
Therefore, if $\Re f$ is not identically zero, it must satisfy $$ (\Re f)(a) = -C\sin\alpha,\;\; (\Re f)'(a)=C\cos\alpha $$ for some non-zero scalar $C$, and the same must be true of $\Im f$ (when it is not identically zero), with a constant $D \ne 0$. It follows that $g=D\,\Re f - C\,\Im f$ is a real eigenfunction with $g(a)=g'(a)=0$, which forces $g\equiv 0$. So $\{ \Re f,\Im f\}$ is a linearly dependent set of functions. The final conclusion is that $f= \rho\,\Re f$ or $f=\rho\,\Im f$ for some non-zero complex number $\rho$. So you can always assume the eigenfunctions are real, after multiplying by a suitable scalar.
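A discrete analogue of these facts is easy to check numerically. The sketch below (with arbitrarily chosen smooth coefficients $p, q, w$) discretises $-(pf')' + qf = \lambda w f$ with Dirichlet conditions as a symmetric generalized eigenproblem $Av = \lambda Wv$, whose eigenvalues come out real and increasing, and whose eigenvectors are orthogonal in the $w$-weighted inner product:

```python
# Discrete sanity check (a sketch): a symmetric finite-difference version of
# L f = (1/w) [ -(p f')' + q f ] with Dirichlet conditions.  The generalized
# eigenproblem A v = lambda W v, with A symmetric and W positive diagonal,
# has real eigenvalues and W-orthogonal eigenvectors, mirroring the argument.
import numpy as np
from scipy.linalg import eigh

# arbitrarily chosen smooth coefficients on [0, 1]
p = lambda x: 1.0 + x**2
q = lambda x: np.cos(x)
w = lambda x: 1.0 + 0.5 * x

N = 400
h = 1.0 / (N + 1)
x = np.linspace(h, 1.0 - h, N)      # interior grid points
pm = p(x - h / 2)                    # p at left half-grid points
pp = p(x + h / 2)                    # p at right half-grid points

A = (np.diag((pm + pp) / h**2 + q(x))
     - np.diag(pp[:-1] / h**2, 1)
     - np.diag(pm[1:] / h**2, -1))  # symmetric discretisation of -(p f')' + q f
W = np.diag(w(x))

lam, V = eigh(A, W)                  # generalized symmetric eigenproblem
print(lam[:3])                       # real, increasing eigenvalues
print(V[:, 0] @ W @ V[:, 1])         # ~0: w-weighted orthogonality
```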