Suppose you are working on a bounded interval $[a,b]$ with Sturm-Liouville operator $L$ given by
$$
Lf = \frac{1}{w(x)}\left[-\frac{d}{dx}\left(p(x)\frac{df}{dx}\right)+q(x)f\right].
$$
Assume that $p,q,w$ are real functions, that $p$ is continuously differentiable, and that $q$, $w$ are continuous on $[a,b]$ with $w > 0$ and $p > 0$ on $[a,b]$. The Sturm-Liouville eigenvalue problem for $L$ is posed with the domain $\mathcal{D}(L)$ restricted to twice continuously differentiable funtions $f$ on $[a,b]$ that satisfy endpoint conditions
$$
\cos\alpha f(a)+\sin\alpha f'(a) =0,\\
\cos\beta f(b) + \sin\beta f'(b) = 0.
$$
The natural space to consider this problem is in $L^2_{w}[a,b]$, with inner product
$$
(f,g)_w = \int_{a}^{b}f(x)\overline{g(x)}w(x)dx.
$$
For $f,g\in\mathcal{D}(L)$ (meaning both satisfy the endpoint conditions,) the Lagrange identity gives
$$
(Lf,g)_w = (f,Lg)_w.
$$
Any time you have a symmetric operator on a complex inner product space, the eigenvalues must be real. To see why, suppose $Lf = \lambda f$. Then
$$
\lambda (f,f)_w = (Lf,f)_w = (f,Lf)_w = (f,\lambda f)_w = \overline{\lambda}(f,f)_w.
$$
So, either (a) $\lambda=\overline{\lambda}$ or (b) $(f,f)_w=0$, which means $f=0$. The operator $L$ is real, meaning that $\overline{Lf}=L\overline{f}$. Therefore, if $Lf=\lambda f$, then $\lambda$ must be real, and
$$
L\overline{f} = \overline{Lf} = \overline{\lambda f}=\lambda\overline{f}.
$$
In other words, if $f$ is an eigenfunction, then so is $\overline{f}$, which means that $\Re f$ and $\Im f$ are eigenfunctions, and both $f$, $\overline{f}$ satisfy the same endpoint conditions because the endpoint conditions are formulated with real coefficients. To see that $\{ \Re f,\Im f\}$ must be a linearly dependent set of solutions, start by noticing that if $g$ is a real eigenfunction with $g(a)=g'(a)=0$, then $g$ must be zero by classical uniqueness results for second order ODE's. Therefore, $\Re f$ must satisfy the following for some non-zero scalar $C$:
$$
(\Re f)(a) = -C\sin\alpha,\;\; (\Re f)'(a)=C\cos\alpha
$$
And the same must be true of $\Im f$, with a different constant $D \ne 0$. It follows that $g=D\Re f - C\Im f$ is a real eigenfunction with $g(a)=g'(a)=0$, which forces $g\equiv 0$. So $\{ \Re f,\Im f\}$ must be a linearly dependent set of functions. The final conclusion is that $f= \rho\Re f$ where $\rho$ is a non-zero complex number, or $f=\rho\Im f$ where $\rho$ is a non-zero complex number. So you can always assume the eigenfunctions are real after multiplying by the correct scalar.
It comes from the Lagrange identity. If you have solutions $y_1$, $y_2$, with eigenvalues $\lambda_1$, $\lambda_2$, respectively, then
$$
((py_1')'+(q+\lambda_1r)y_1)y_2 = 0 \\
y_1((py_2')'+(q+\lambda_2r)y_2)=0
$$
So, if you subtract the second from the first, you get
$$
(py_1')'y_2-y_1(py_2')'=(\lambda_2-\lambda_1)ry_1y_2 \\
\frac{d}{dx}(py_1'y_2-y_1py_2')=(\lambda_2-\lambda_1)ry_1y_2
$$
The last equation is the Lagrange identity. Then, integrating over $[0,1]$ gives
$$
(\lambda_2-\lambda_1)\int_{0}^{1}y_1y_2rdx = p(y_2y_1'-y_2'y_1)|_{0}^{1}.
$$
So, if you have endpoint conditions of the following form for both $y_1,y_2$, then the evaluation terms vanish:
$$
A y_j(a)+B y_j'(a) = 0,\;\; A^2+B^2 \ne 0\\
C y_j(b)+Dy_j'(b) = 0,\;\; C^2+D^2 \ne 0.
$$
And that gives orthogonality of $y_1,y_2$ with respect to the weight function $r$ provided that $\lambda_1\ne\lambda_2$. This is the history of how "orthogonality" with respect to the weight was formulated. This discovery eventually led to the general notion of "inner product" about 70 years later.
Best Answer
Sturm-Liouville equations originally arose out Fourier's separation of variables method of solving partial differential equations. The separation parameter $\lambda$ serves the role of an eigenvalue for an eigenfunction problem. In fact, eigenvector/eigenvalue analysis of matrices came out of studying Sturm-Liouville equations using Fourier's method. Sturm and Liouville isolate a general type of equation that would arise out of Fourier separation problems, and they set out of study this equation, and to show that every function $f$ on the interval of interest $[a,b]$ could be expanded in a Fourier series of eigenfunctions of the Sturm-Liouville problem: $$ -\frac{d}{dx}\left(p(x)\frac{df}{dx}\right)+q(x)f(x)= \lambda w(x)f(x),\\ A_1 f(a)+A_2 f'(a) = 0,\;\;\; B_1 f(b)+B_2 f'(b)=0. $$ For a regular problem where $p$ does not vanish at $a$ or $b$, there are only a discrete number of values $\lambda_1,\lambda_2,\lambda_3,\cdots$ for which there are corresponding solutions $f_n$ that are not identically $0$ and satisfy both endpoint conditions. These are the eigenvalues, and $f_n$ is the eigenfunction that corresponds to $\lambda_n$. For these solutions, it is possible to write any smooth function $g$ on $[a,b]$ in terms of these functions in a Fourier series $g=\sum_{n=1}^{\infty}c_n f_n(x)$ where $c_n$ are constants. The constants are determined by the orthogonality conditions that arise from the equation:
$$ \int_{a}^{b}f_n(x)f_m(x)w(x)dx = 0,\;\;\; n\ne m. $$
In fact, assuming $g=\sum_{n=1}^{\infty}c_n f_n(x)$, you multiply by $f_m(x)w(x)$, integrate, and use the above to obtain the unknown constants $c_m$:
$$ \int_a^b g(x)f_m(x)w(x)dx = \sum_{n=1}^{\infty}c_n \int_{a}^{b}f_m(x)f_n(x)w(x)dx \\ \int_a^b g(x)f_m(x)w(x)dx = c_m\int_a^b f_m(x)^2w(x)dx \\ c_m = \frac{\int_a^b g(x)f_m(x) w(x)dx}{\int_a^b f_m(x)^2w(x)dx} $$ And, in a fairly general sense, you can expand a function $f$ in a Fourier series of the eigenfunctions that has the form $$ g(x) = \sum_{m=1}^{\infty}\frac{\int_a^b g(y)f_m(y) w(y)dy}{\int_a^b f(y)_m^2w(y)dy} f_m(x). $$ Finding the eignevalues, the eigenfunctions, and expanding in such a Fourier series is the goal of Sturm-Liouville Theory. Using this, you can solve various PDEs using Fourier's separation of variables technique.