You got an upper estimate that is $O(h)$. Nothing is preventing the left side to be of order $O(h^2)$, except that then the standard proof of the accuracy order would be highly inefficient.
Next you got provided an example where the left side is actually $O(h)$. This shows that there can be no general upper estimate of order $O(h^2)$, even if in special cases, $y'=const$, the method is exact.
In the general philosophy that things that go wrong (under cleanly formulated assumptions) do so in a generic manner, examples where the actual accuracy is better than $O(h)$ are the exception.
There is a theorem which states the following.
Let $u:I⊂ℝ→ℝ$ be given by
\begin{align*}u^{(n)} &= F(t,u,u',…,u^{(n-1)})\\
u^{(k)}(t_0)&=u_k^0, \qquad k=0,…,n-1.
\end{align*}
This is called an explicit, scalar initial value problem (IVP) of order $n$. Then $u$ is also given as solution to the system of first order:
\begin{align*}
u'_{n-1}(t)&=F(t,u_0(t),u_1(t),…,u_{n-1}(t)), & u_{n-1}(t_0)&=u_{n-1}^0 \\
u'_{n-2}(t)&=u_{n-1}(t), & u_{n-2}(t_0)&=u_{n-2}^0 \\
\vdots& \\
u'_{0}(t)&=u_{1}(t),& u_{0}(t_0)&=u_{0}^0
\end{align*}
using the notation $u_0(t)=u(t)$ and $u_k=\frac{d^k}{dt^k}u$.
What this theorem states, is that you can rewrite a IVP of higher order to a system of ODEs of order one. Written as a matrix system, if $F$ is linear, you will get
$$u' = Au + b.$$
Note, that your ODE is in fact a linear ODE.
The reason why this theorem is important, is that all theories that are developed for order one IVPs, $u'=f(t,u)$, can be applied to such systems of order one. Especially you can use all numerical solvers like the Explicit Euler or RK-methods.
In your case (using your first result $(*)$) that equation is:
$$\begin{bmatrix} u_1' \\ u_2'\\u_3' \end{bmatrix} = \begin{bmatrix}0 & 1 & 0\\ 0& 0 & 1 \\ -1/x & -x &-1 \end{bmatrix} \begin{bmatrix}u_1 \\ u_2 \\ u_3 \end{bmatrix} +\begin{bmatrix}0\\0\\x\end{bmatrix}$$
Note, that you have to also include the equation like $u_1' = u_2$ in that system. Also note, that you can't have any $y$'s in there anymore.
That system could be solved by any ODE solver, e.g. Explicit Euler:
$$u^{(t+1)} = u^{(t)} + h(Au^{(t)}+b).$$
Is it clear how to set up your iteration from here?
Now for your other case $(**)$. You will get the following system, modulo my calculation errors:
$$\begin{bmatrix}1 & 0 & 0\\ 0& 1 & 0 \\ x^2 & x & x \end{bmatrix}\begin{bmatrix} u_1' \\ u_2'\\u_3' \end{bmatrix} = \begin{bmatrix}0 & 1 & 0\\ 0& 0 & 1 \\ -1 & 0 &0 \end{bmatrix} \begin{bmatrix}u_1 \\ u_2 \\ u_3 \end{bmatrix} +\begin{bmatrix}0\\0\\1\end{bmatrix}$$
So this is something like $$Mu'=\tilde{A}u+\tilde{b}.$$ If you would multiply by $M^{-1}$, you get the same system as above.
For numerical methods, the second system has the disadvantage, that you can't directly apply most ODE-Solver to that system. So I don't see any reason to write it in this form.
But we can just try to create an Explicit Euler Method for these kind of matrix systems:
Replace $$u_i' \approx \frac{u_i(t+h) - u(t)}{h} = \frac{u_i^{(t+1)} - u^{t}}{h} $$
yields
\begin{align*}
u_1^{(t+1)} &= u_1^t + hF_1 \\
u_2^{(t+1)} &= u_2^t + hF_2 \\
x^2u_1^{(t+1)}+xu_2^{(t+1)}+xu_3^{(t+1)} &= x^2u_1^{(t)}+xu_2^{(t)}+xu_3^{(t)} + hF_3
\end{align*}
And that translates to:
$$Mu^{(t+1)}=Mu^{(t)}+h(\tilde{A}u^{(t)}+\tilde{b}).$$
Again, if you would mutlipy by $M^{-1}$ you would get the Explicit Euler Method for the system $u'=Au+b$.
The reason why this works, is that we assumed $F$ to be a linear function, and the difference quotient is also a linear operator in $f$.
Best Answer
Yes, one can do that, this is called the Taylor series method. Most efficiently you do this with methods of algorithmic differentiation, that is, you capture the right side as computational graph, equip each node with space for the Taylor series coefficients of that node, and populate these coefficients with increasing degree. Straight evaluation gives $y'$, augmenting the input by the linear term allows to augment all nodes, giving $y''$ in the end, then augment the input to a quadratic polynomial etc.
One rather old package that does this efficiently is the TADIFF part of the FADBAD, now both FADBAD++ package with applications to certified error bounds of ODE solutions
Local examples for the Taylor series method are https://math.stackexchange.com/a/2953714/115115 and https://math.stackexchange.com/a/3663073/115115, while https://math.stackexchange.com/a/2499684/115115 explores the Taylor arithmetic aspect of a simple example.
More frequently you will encounter the first derivative of the ODE function, as part of a Newton-like method in implicit methods, in determining the split in splitting methods, which leads to Rosenbrock and similar methods. The latter can be seen as an amalgamation of the Taylor series and Runge-Kutta approaches.