You are missing something in establishing your system. In extracting the rational constants, the powers of $h$ are still bound to the derivatives, thus the system has to be
\begin{align*}
\begin{bmatrix}
1 & -2 & 2 & -\frac{4}{3} \\
1 & -1 & \frac{1}{2} & -\frac{1}{6} \\
1 & 1 & \frac{1}{2} & \frac{1}{6} \\
1 & 2 & 2 & \frac{4}{3} \\
\end{bmatrix}
\begin{bmatrix}
f(x_0) \\
f'(x_0)h \\
f''(x_0)h^2 \\
f'''(x_0)h^3 \\
\end{bmatrix}
&\approx
\begin{bmatrix}
f(x_0 - 2h) \\
f(x_0 - h) \\
f(x_0 + h) \\
f(x_0 + 2h) \\
\end{bmatrix} \\
\end{align*}
with the solution formula
\begin{align*}
\begin{bmatrix}
-\frac{1}{6} & \frac{2}{3} & \frac{2}{3} & -\frac{1}{6} \\
\frac{1}{12} & -\frac{2}{3} & \frac{2}{3} & -\frac{1}{12} \\
\frac{1}{3} & -\frac{1}{3} & -\frac{1}{3} & \frac{1}{3} \\
-\frac{1}{2} & 1 & -1 & \frac{1}{2} \\
\end{bmatrix}
\begin{bmatrix}
f(x_0 - 2h) \\
f(x_0 - h) \\
f(x_0 + h) \\
f(x_0 + 2h) \\
\end{bmatrix}
&\approx
\begin{bmatrix}
f(x_0) \\
f'(x_0)h \\
f''(x_0)h^2 \\
f'''(x_0)h^3 \\
\end{bmatrix} \\
\end{align*}
You should also recognize that because of the symmetry, the coefficients of the 4th derivative cancel so that the error term for $f'''(x_0)h^3$ is $O(h^5)$. As you have to divide by $h^3$, the error term for $f'''(x_0)$ is $O(h^2)$.
You can greatly reduce the computations by recognizing the symmetry of the situation and start with the Taylor expansions of
$$
f(x_0+kh)-f(x_0-kh)=2f'(x_0)kh+\frac13f'''(x_0)k^3h^3+\frac1{60}f^{(5)}(x_0)k^5h^5+O(h^7)
$$
so that you get for the coefficients $$2A_2+A_1=0\\8A_2+A_1=3$$
Modern Computer Arithmetic suggests using an arithmetic-geometric mean algorithm. I'm not sure if this approach is meant for the low amount of precision one typically works with or if its meant for calculation in very high precision.
Another approach is to observe that the Taylor series for $\ln(x)$ is efficient if $x$ is very close to $1$. We can use algebraic identities to reduce the general case to this special case.
One method is to use the identity
$$ \ln(x) = 2 \ln(\sqrt{x})$$
to reduce the calculation of $\ln(x)$ to that of an argument closer to 1. We could use a similar identity for more general radicals if we can compute those efficiently.
By iteratively taking roots until we get an argument very close to $1$, we can reduce to
$$ \ln(x) = m \ln(\sqrt[m]{x})$$
which can be computed by the Taylor series.
If you store numbers in mantissa-exponent form in base 10, an easy identity to exploit is
$$ \ln(m \cdot 10^e) = e \ln(10) + \ln(m)$$
so the plan is to precompute the value of $\ln(10)$, and then use another method to obtain $\ln(m)$, where $m$ is not large or small.
A similar identity holds in base 2, which a computer is likely to use.
A way to use lookup tables to accelerate the calculation of $\ln(x)$ when $x$ is not large or small is to observe that
$$ \ln(x) = \ln(k) + \ln(x/k) $$
The idea here is that you store a table of $\ln(k)$ for enough values of $k$ so that you can choose the $k$ nearest $x$ to make $x/k$ very near $1$, and then all that's left is to compute $\ln(x/k)$.
Best Answer
First, note that $$f(x+h) \simeq h*f'(x) + f(x) = f(x) + f'(x)h$$ is an approximation, not an equality.
The error of the estimate can be written explicitely by the Lagrange Remainder, a more general version of the Mean Value Theorem. This result is exactly what they use here.