[Math] How was the Runge-Kutta method derived

numerical-methods, ordinary-differential-equations, runge-kutta-methods

By K values, I mean the values described here:

https://en.wikipedia.org/wiki/Runge%E2%80%93Kutta_methods#Explicit_Runge.E2.80.93Kutta_methods

I know how the K values in the Runge-Kutta method can be proven correct, by comparing their Taylor expansion with the Taylor expansion of the function to be approximated, but how were they originally figured out?

I think I understand the Runge-Kutta derivation when the derivative is a function of one variable, f'(t): it seems to be a direct consequence of Simpson's rule and its higher-order equivalents. But when it is some form of first-order differential equation (i.e. f'(t, y(t))), I am still lost. Is there an equivalent of Simpson's rule for multivariable functions?

Best Answer

I figured out the answer to my question, with the help of Peter in the comments of this question. I decided to post what I found here in case it helps other people (because I can't seem to find a good explanation of this anywhere else online).

First of all, for those who do not know, the Runge-Kutta method is a method of solving first order differential equations numerically. Explicitly, our task is this:

Given a derivative $\frac{\mathrm{dy}}{\mathrm{dx}} = y'(x, y)$ and an initial value for its antiderivative $y(x_n) = y_n$, approximate $y(x_{n+1}) = y_{n+1}$ where the approximation gets better as $x_{n+1} - x_n$ goes to $0$.

I've found that looking at the more general problem first makes this difficult to understand, so instead we will look at the slightly simpler case where the derivative is a function of $x$ only and not of $y$. In other words, $\frac{\mathrm{dy}}{\mathrm{dx}} = y'(x)$.

The most basic and straightforward way of doing this is called Euler's method. Essentially, it works by moving in the direction of the slope of $y$ at $x_n$. Symbolically:

$$ y_{n+1} \approx y_n + (x_{n+1} - x_n)y'(x_n) $$

(Figure: Euler's method follows the tangent line at $x_n$ to approximate $y_{n+1}$.)

As you can see from the image, this is okay at approximating the next point, but it isn't great. We'd like to somehow find a better approximation.
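As a concrete sketch, here is one Euler step in code. The test problem $y'(x) = \cos(x)$ with $y(0) = 0$ (exact solution $\sin(x)$) is my own choice for illustration:

```python
# A minimal sketch of Euler's method for dy/dx = y'(x).
# Test problem (my own choice): y'(x) = cos(x), y(0) = 0, so y(x) = sin(x).
import math

def euler_step(yp, x_n, y_n, h):
    """One Euler step: y_{n+1} = y_n + h * y'(x_n)."""
    return y_n + h * yp(x_n)

# Integrate from x = 0 to x = 1 in 1000 small steps.
h = 0.001
y = 0.0
for i in range(1000):
    y = euler_step(math.cos, i * h, y, h)

print(abs(y - math.sin(1.0)))  # the error is small, and shrinks linearly with h
```

Halving the step size $h$ roughly halves the final error, which is exactly the linear convergence discussed later in this answer.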

So here is where I was getting confused. Remember how I said that Euler's method works by moving in the direction of the tangent line? That is true, but it isn't very helpful, because it doesn't suggest how to make the approximation better: we are given the function that produces the tangent line at any $x$ value exactly, so there is no room for improvement other than making $x_{n+1} - x_n$ smaller. Instead, we should notice that if we rearrange our approximation a bit, we get the following:

$$ y_{n+1} - y_n \approx (x_{n+1} - x_n)y'(x_n) $$

Why is this important? Well, $y_{n+1} - y_n$ is exactly the integral of $y'(x)$ from $x_n$ to $x_{n+1}$. So in reality, Euler's method is just a direct result of the left Riemann sum approximation of the integral!

$$ \int_{x_n}^{x_{n+1}} y'(x) \mathrm{dx} \approx (x_{n+1} - x_n)y'(x_n) $$

From this, we can generalize Euler's method as a direct result of the Fundamental Theorem of Calculus. In other words,

$$ y_{n+1} = y_n + \int_{x_n}^{x_{n+1}} y'(x) \mathrm{dx} $$

And better approximations of the integral lead to better approximations of $y_{n+1}$! This is the basis of the Runge-Kutta method.

How do we get better approximations of the integral? Well, a Riemann sum uses the area under a constant function that passes through one point of the integrand to approximate the integral.

Left Riemann Sum

But a better approximation would be the Trapezoidal method which uses the area under a straight line that goes through two points.

Trapezoidal Method

An even better approximation would be Simpson's method, which uses a parabola that goes through three points.

(Figure: Simpson's method fits a parabola through three points of the integrand.)

The pattern here is that we can use a polynomial of degree $n-1$ through $n$ points to get better and better approximations of the integral as $n$ increases (the fact that these approximations improve is a direct result of Taylor's Theorem). So what makes this better than using a smaller step size in Euler's method? Because this is a result of Taylor's Theorem: with Euler's method the approximation gets better only linearly as we decrease the step size, while with a method of higher order (say, order $n$), if we decrease the step size by a factor of $\delta$, the error shrinks by approximately a factor of $\delta^n$.
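To see the three quadrature rules above side by side, here is a sketch that applies each to the same integral; the integrand $e^x$ on $[0, 1]$ is my own choice (exact value $e - 1$):

```python
# Comparing the three quadrature rules from the text on one interval.
# Test integrand (my own choice): y'(x) = e^x on [0, 1]; exact integral is e - 1.
import math

def left_riemann(f, a, b):
    # Constant through the left endpoint.
    return (b - a) * f(a)

def trapezoid(f, a, b):
    # Straight line through both endpoints.
    return (b - a) * (f(a) + f(b)) / 2

def simpson(f, a, b):
    # Parabola through both endpoints and the midpoint.
    m = (a + b) / 2
    return (b - a) * (f(a) + 4 * f(m) + f(b)) / 6

exact = math.e - 1
for rule in (left_riemann, trapezoid, simpson):
    err = abs(rule(math.exp, 0.0, 1.0) - exact)
    print(f"{rule.__name__:12s} error = {err:.2e}")
```

Each successive rule gives a noticeably smaller error on the same interval, which is the pattern the text describes.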

However, when we try to apply these methods to the more general case of $\frac{\mathrm{dy}}{\mathrm{dx}} = y'(x, y)$, we run into a slight problem. We are only given one value of $y$ (namely $y(x_0)$), but we need $n$ points on $y'$ for an $n^{\text{th}}$-order Runge-Kutta method. So what do we do?

Unfortunately, we have to approximate the other values of $y$ with lower-order methods before we can use the higher-order method. We lose a bit of accuracy here, but as it turns out, this is generally not too big of a deal, because the approximation still improves like $\delta^n$.
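This bootstrapping idea can be sketched in the second-order case: Heun's method applies the trapezoidal rule, with the unknown right-hand point of $y$ predicted by a lower-order Euler step. The test problem $y' = y$, $y(0) = 1$ (exact solution $e^x$) is my own choice:

```python
# Heun's method (a 2nd-order Runge-Kutta scheme): the trapezoidal rule,
# where the unknown right endpoint y(x + h) is predicted by an Euler step.
# Test problem (my own choice): y' = y, y(0) = 1, exact solution e^x.
import math

def heun_step(f, x, y, h):
    k1 = f(x, y)                  # slope at the left endpoint
    y_pred = y + h * k1           # Euler (lower-order) prediction of y(x + h)
    k2 = f(x + h, y_pred)         # slope at the predicted right endpoint
    return y + h * (k1 + k2) / 2  # trapezoidal average of the two slopes

x, y, h = 0.0, 1.0, 0.1
for _ in range(10):
    y = heun_step(lambda x, y: y, x, y, h)
    x += h

print(abs(y - math.e))  # much smaller than Euler's error at the same h
```

Even though the right-hand point is only an Euler-quality guess, the step as a whole is second-order accurate, which is the "not too big of a deal" claim above.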

So let's say $R_n(y', y_{n-1})$ is the function that produces the next value, $y_n$, using the $n^{\text{th}}$-order Runge-Kutta method. Then the following conditions hold:

$$R_n(y', y_{n-1}) \approx R_n(y', R_{n-1}(y', y_{n-2}))$$
$$R_1(y', y_0) = y_0 + (x_1 - x_0) y'(x_0, y_0)$$

And this gives rise to the "K" values mentioned in the question.
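For reference, here is how this plays out in the classical fourth-order scheme from the linked Wikipedia page: each $k_i$ is a slope estimate, and the later ones are evaluated at $y$ values predicted using the earlier, lower-order estimates. The test problem $y' = y$ is again my own choice:

```python
# The classical 4th-order Runge-Kutta step, showing the K values from the
# question: each k is a slope estimate, and the later ones use y values
# predicted from the earlier estimates, as described in the answer.
import math

def rk4_step(f, x, y, h):
    k1 = f(x, y)                       # slope at the left endpoint
    k2 = f(x + h / 2, y + h * k1 / 2)  # midpoint slope, y predicted via k1
    k3 = f(x + h / 2, y + h * k2 / 2)  # midpoint slope again, refined via k2
    k4 = f(x + h, y + h * k3)          # right-endpoint slope, predicted via k3
    # Simpson-like weighting of the four slopes:
    return y + h * (k1 + 2 * k2 + 2 * k3 + k4) / 6

# Test problem (my own choice): y' = y, y(0) = 1; the exact answer at x = 1 is e.
x, y, h = 0.0, 1.0, 0.1
for _ in range(10):
    y = rk4_step(lambda x, y: y, x, y, h)
    x += h

print(abs(y - math.e))  # tiny: the error scales like h^4
```

Note how the weighting $\frac{1}{6}(k_1 + 2k_2 + 2k_3 + k_4)$ mirrors Simpson's rule, with the midpoint sampled twice because $y$ there is only known approximately.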
