The truncation error does not satisfy that equation; the equation is just its definition.
Consider the following two problems:
- The first is an ODE.
$$
y'(t) = f(t, y(t))\\
y(0) = a.
$$
Its solution is some smooth function $y(t)$.
- The second is a difference equation
$$
\frac{z_{i+1} - z_i}{h} = f(t_i, z_i)\\
z_0 = a.
$$
Its solution is some discrete function $z_i$.
I've intentionally used different letters to denote the two solutions. They are quite different: the former is a smooth function while the latter is a discrete one. One needs to be careful even to compare the two. Usually a third function is introduced, defined as the restriction of the smooth $y(t)$ to the grid $t_i$ on which the discrete function $z_i$ lives. Let's denote the restriction by $w_i$:
$$
w_i \equiv y(t_i).
$$
The function $w_i$ is discrete, just like $z_i$, and coincides with $y(t)$ at the grid points. Now that $w_i$ and $z_i$ are functions of the same class, we can easily compare them:
$$
e_i = w_i - z_i \equiv y(t_i) - z_i.
$$
So, roughly speaking, the global error shows how close $y(t)$ and $z_i$ are (after restricting the former to the grid). When someone solves a problem numerically, the global error is what they are interested in. However, direct computation of the global error is usually impossible, since we simply do not have the exact values of $w_i = y(t_i)$ (in contrast to $z_i$, which we can compute easily).
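To make the distinction concrete, here is a minimal sketch (Python, using the illustrative model problem $y' = -y$, $y(0) = 1$, which is my own choice and not part of the text above) that computes $z_i$ from the difference equation and measures the global error against $w_i = y(t_i) = e^{-t_i}$:

    from math import exp

    # Illustrative model problem: y' = -y, y(0) = 1, exact solution y(t) = exp(-t).
    def f(t, y):
        return -y

    a, T, N = 1.0, 1.0, 10
    h = T/N

    z, e_max = a, 0.0
    for i in range(N):
        z = z + h*f(i*h, z)              # z_{i+1} = z_i + h f(t_i, z_i)
        w = exp(-(i+1)*h)                # w_{i+1} = y(t_{i+1}), known for this model
        e_max = max(e_max, abs(w - z))   # global error e_{i+1} = w_{i+1} - z_{i+1}

    print(h, e_max)   # halving h roughly halves e_max for this first-order scheme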
This is where the concept of the local truncation error comes to the rescue. Note that previously we compared the solutions. Now we're going to compare the problems. Take $z_i$. It is the solution to the second problem, so plugging $z_i$ into it yields a valid identity:
$$
\frac{z_{i+1} - z_i}{h} = f(t_i, z_i)\\
z_0 = a.
$$
But if we now take $w_i$ and plug it into the difference scheme, we won't get an identity. Instead we'll get a residual:
$$
\frac{w_{i+1} - w_i}{h} = f(t_i, w_i) \color{red}{{}+ d_i}\\
w_0 = a \color{red}{{} + d_0}.
$$
If we are very lucky, some residuals may vanish, like $d_0$ here, but often this is not the case.
So why is $d_i$ interesting, given that it too is defined in terms of $w_i$ (the unknown solution of the original problem)? It turns out that we can estimate $d_i$ without knowing the exact values of $w_i$, using only the original problem:
$$
\begin{aligned}
d_i &= \frac{w_{i+1} - w_i}{h} - f(t_i, w_i) \equiv
\frac{y(t_{i+1}) - y(t_i)}{h} - f(t_i, y(t_i)) \\
&= y'(t_i) + h \frac{y''(t_i)}{2} + O(h^2) - f(t_i, y(t_i)) \\
&= \color{blue}{\left[y'(t_i) - f(t_i, y(t_i))\right]} + \color{red}{h \frac{y''(t_i)}{2} + O(h^2)}.
\end{aligned}
$$
The blue term in brackets is exactly the residual of the original ODE, and $y(t)$ is exactly its solution, so that term equals zero:
$$
d_i = h \frac{y''(t_i)}{2} + O(h^2).
$$
A similar result may be obtained using a different form of Taylor's formula (with the Lagrange remainder):
$$
d_i = h \frac{y''(\xi_i)}{2}, \qquad \xi_i \in [t_{i}, t_{i+1}].
$$
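This estimate is easy to check numerically. A small sketch, again for the illustrative problem $y' = -y$ with $y(t) = e^{-t}$ (so $y''(t) = e^{-t}$), computes the residual $d_i$ of the exact solution and compares it with the leading term $h\,y''(t_i)/2$:

    from math import exp

    h = 0.1
    for i in range(3):
        t = i*h
        d = (exp(-(t+h)) - exp(-t))/h + exp(-t)   # d_i = (w_{i+1}-w_i)/h - f(t_i, w_i)
        lead = h*exp(-t)/2                        # leading term h*y''(t_i)/2
        print(t, d, lead)                         # the two agree up to O(h^2)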
So now we can estimate the local truncation error, but we are interested in the global error. To relate the two we need to introduce one more concept: stability. Consider the two discrete problems
$$
\begin{aligned}
&\frac{z_{i+1} - z_i}{h} = f(t_i, z_i)\\
&z_0 = a
\end{aligned}
\qquad\text{and}\qquad
\begin{aligned}
&\frac{w_{i+1} - w_i}{h} = f(t_i, w_i) \color{green}{{} + d_i}\\
&w_0 = a \color{green}{{} + d_0}
\end{aligned}.
$$
Pretend that we know $d_i$ and view the second problem as a perturbation of the first one. That's reasonable, since $d_i$ is a small value of $O(h)$ magnitude. A difference problem is called stable if such small perturbations result in small changes of the solution. In our case this means that the difference $z_i - w_i$ will also be small. Precisely,
$$
\max_i |z_i - w_i| \leq C \max_i |d_i|
$$
where $C$ is called the stability constant of the method. For the explicit Euler method it can be shown that for Lipschitz-continuous $f$
$$
C \leq \frac{e^{LT} - 1}{L},
$$
with $L$ being the Lipschitz constant of $f$ and $T = \max_i t_i$ the total integration time.
Finally, we can relate the global error and the local truncation error by
$$
|e_i| \leq C \max_i |d_i|.
$$
If the local truncation error tends to zero as the mesh is refined, the numerical method is called consistent. The Lax equivalence theorem states that a stable and consistent method converges, in the sense that $e_i \to 0$ as the mesh is refined.
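The whole chain can be verified numerically. A sketch for the same illustrative problem ($f(t,y) = -y$, so $L = 1$, and $T = 1$) checks that $\max_i |z_i - w_i| \leq C \max_i |d_i|$ and that both sides shrink linearly with $h$:

    from math import exp

    L, T = 1.0, 1.0
    C = (exp(L*T) - 1)/L                             # stability constant bound

    for N in [10, 100, 1000]:
        h = T/N
        z, e_max, d_max = 1.0, 0.0, 0.0
        for i in range(N):
            t = i*h
            z = z - h*z                              # Euler step for f(t, y) = -y
            d = (exp(-(t+h)) - exp(-t))/h + exp(-t)  # local truncation error d_i
            d_max = max(d_max, abs(d))
            e_max = max(e_max, abs(exp(-(t+h)) - z)) # global error at t_{i+1}
        print(N, e_max, C*d_max, e_max <= C*d_max)   # bound holds; both sides O(h)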
Your ODE is $$y'=xe^{2x}-4y$$ with the exact solution of the form $$y(x)=(Ax+B)e^{2x}+Ce^{-4x}.$$ Inserting this into the ODE gives the coefficient conditions
$$
y'+4y=(A+6(Ax+B))e^{2x}\implies A=\frac16, B=-\frac1{36}.
$$
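These coefficients are easy to verify symbolically, for instance with sympy (a sketch, assuming the library is available):

    import sympy as sp

    x, C = sp.symbols('x C')          # C is the free constant of the homogeneous part
    A, B = sp.Rational(1, 6), sp.Rational(-1, 36)
    y = (A*x + B)*sp.exp(2*x) + C*sp.exp(-4*x)

    # residual of y' = x*exp(2x) - 4y; simplifies to 0 for these A, B
    print(sp.simplify(sp.diff(y, x) - (x*sp.exp(2*x) - 4*y)))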
For the Taylor method we use the ODE to compute the higher derivatives of $y$. From the extended Leibniz product rule $$(xf(x))^{(k)}=xf^{(k)}(x)+kf^{(k-1)}(x)$$ we get $(xe^{2x})^{(k)}=2^{k-1}(2x+k)e^{2x}$ and thus in general $$y^{(k+1)}=2^{k-1}(2x+k)e^{2x}-4y^{(k)},$$ which for up to 4th order gives
\begin{align}
y''&=(2x+1)e^{2x}-4y'\\
y'''&=(4x+4)e^{2x}-4y''\\
y^{(4)}&=(8x+12)e^{2x}-4y'''
\end{align}
This we then use to compute the Taylor step (in Python; `exp` needs to be imported):

    from math import exp

    def Taylor(y0, N):
        h = 2.0/N                 # step size for the interval [0, 2]
        x, y = 0.0, y0
        for k in range(N):
            # derivatives y', y'', y''', y'''' from the recurrence above
            y1 = x*exp(2*x) - 4*y
            y2 = (2*x+1)*exp(2*x) - 4*y1
            y3 = (4*x+4)*exp(2*x) - 4*y2
            y4 = (8*x+12)*exp(2*x) - 4*y3
            # degree-4 Taylor step
            x, y = x+h, y + h*y1 + h**2/2*y2 + h**3/6*y3 + h**4/24*y4
        return y
Add some code to test this method:

    def y_exact(x, y0):
        return ((6*x-1)*exp(2*x) + (36*y0+1)*exp(-4*x))/36

    y0 = 1
    y2 = y_exact(2, y0)
    for N in [5, 10, 100, 500, 750, 1000]:
        yT1 = Taylor(y0, N)
        yT2 = Taylor(y0, 2*N)
        print(N, yT1, yT1 - y2)
        print(2*N, yT2, yT2 - y2, (yT1 - y2)/(yT2 - y2))
This yields the following values, errors, and ratios:

       N    y_Taylor(2)    error               ratio
       5    16.626489487   -0.056623359684
      10    16.6798692947  -0.00324355204858   17.4572070483
      10    16.6798692947  -0.00324355204858
      20    16.6829256052  -0.000187241566536  17.3228205071
     100    16.6831125714  -2.75283085216e-07
     200    16.6831128297  -1.70001754896e-08  16.1929555013
     500    16.6831128463  -4.31953139923e-10
    1000    16.6831128467  -2.68691735528e-11  16.0761602539
     750    16.6831128466  -8.58904058987e-11
    1500    16.6831128467  -3.74456021746e-12  22.9373814042
    1000    16.6831128467  -2.68691735528e-11
    2000    16.6831128467  -5.60618218515e-12   4.7927756654
This table confirms the general situation. There is a working range of $h$ where doubling the step size increases the error 16-fold, as expected for a 4th-order method. With $h$ above that working range, the non-linearities of the ODE make the second and further terms of the error expansion noticeable, perturbing the error ratio of 16. Below that working range, the accumulated floating-point noise over the larger number of steps becomes comparable to or larger than the method error, which makes the error ratio over a step-size doubling essentially a random number.
In summary, I do not see the problem you report. This is due either to a radically different initial condition or to some error in implementing the Taylor method.
Best Answer
Yes, your objection is valid. In the error-order computation one has to assume that $h$ is small enough. More precisely, one needs the region where the Lipschitz constant and the bound on the second derivative are taken to contain both the exact solution and the numerical solutions for the whole range of $h\in[0,\bar h]$ under consideration.
If $\bar h$ increases, the numerical solution will deviate more wildly from the exact solution, as you said, and thus the region for the constants has to become larger. This in turn increases the constants. As long as no division by zero is encountered, the error bound for the large step size $h=\bar h$ will still be correct, but it can become horribly pessimistic for the smaller step sizes.
For the statements in the second half of the post, note that the claims in the first half and the considerations above are for a fixed integration interval of finite length. For large step sizes the error loses its almost-linear behavior and shows growth that is polynomial in $h$. It will be rare to get actual divergence in this situation.
The divergence due to missing stability occurs on an unbounded integration interval. This divergence will be exponential, with the growth behavior of a geometric series. Mathematically this is slow and controlled; numerically such a divergence can rapidly exhaust the range of the number type.
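As a small illustration of this effect (my own example, not from the post): explicit Euler applied to $y'=-10y$, $y(0)=1$ with a step size beyond the stability limit $h=0.2$ shows exactly this geometric growth:

    # Step factor is 1 + h*(-10) = -2 for h = 0.3, so |y_i| = 2**i grows
    # geometrically even though the exact solution exp(-10*t) decays to zero.
    lam, h, y = -10.0, 0.3, 1.0
    for i in range(20):
        y = y + h*lam*y
    print(y)   # (-2)**20 = 1048576.0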