Substituting the second equation for $y$ into the first equation, we get
$$(x^3+1)^3 = x+2 \iff (x^3+1)^3 - x - 2 = 0.$$
Now consider the polynomial $f(x) = (x^3+1)^3 - x - 2$, for which any root yields a solution to the original system. Thus we want to analyze $f$ in such a way that we can prove there is only one real root. We can see that for any $x$ with $|x| \leq \frac{1}{2}$ we have that
$$ |(x^3+1)^3 - x| \leq (|x|^3+|1|)^3 + |x| \leq (|.5|^3+1)^3 + |.5| < 2,$$
and so $f$ is negative on the interval $\left[-\frac{1}{2},\frac{1}{2}\right]$. Additionally, if $-1 < x < -\frac{1}{2}$ then we have that
$$(x^3+1)^3 - x < ((-.5)^3+1)^3 + 1 < 2$$
and if $-2 < x < -1$ we have that
$$(x^3+1)^3 - x < ((-1)^3+1)^3 + 2 = 2.$$
Thus $f$ is negative on the interval $\left(-2,\frac{1}{2}\right]$.
Now take the derivative to get
$$ f'(x) = 9x^2(x^3+1)^2 - 1,$$
and notice that $f'(x) > 0$ for every $x \in (-\infty,-2]\cup \left(\frac{1}{2},\infty\right)$. Therefore $f$ is monotonically increasing in this region. All together, we see that $f$ is negative for $x \leq \frac{1}{2}$ and monotonically increasing for all $x > \frac{1}{2}$; thus $f$ has exactly 1 real root.
So you have correctly identified the purpose as,
transforming a nth order DE into a system of n first order DE's, and writing that system as a single matrix equation
and this is not insignificant. It greatly simplifies (1) the statement of any theorems regarding a control system as we need only consider first-order DEs and (2) the statement of any algorithm and procedure as we need only consider 1st-order DEs. Everything we can claim to do with a first-order [vector] ODE we can apply generally to higher-order scalar ODE. As a very simple example, consider the following theorem,
Theorem: Let $x(t)\in\mathbb{R}^n.$ Consider the system $$\dot{x}(t) = f(x(t))$$ and suppose $f(0) = 0,$ $f$ is continuously differentiable and define, $$A = \left.\frac{\partial f}{\partial x}\right|_{x = 0}.$$ If $A$ is Hurwitz then the system is locally exponentially stable at $0.$
This tells us that we can ascertain the stability of a first-order nonlinear system near an equilibrium by just looking at the Jacobian at the equilibrium point. Since this is written in state-space form and applies to the vector equivalent, it means we have a way to ascertain stability of any higher-order nonlinear ODE: just write it in state-space form and use this result. Imagine, for a moment: how would you write this theorem if you had to only work with the original, possibly higher-order, nonlinear ODE? How would it be described? (It is in principle doable)
It gets even more interesting when we want to do control design. A very commonly studied class of nonlinear systems are those of the control-affine variant, i.e.
$$\dot{x}(t) = f(x(t)) + g(x(t))\,u(t).$$
How do we control these systems? One method is that of feedback linearization, which asks whether or not we can apply feedback to cancel out all the nonlinearities and make the system behave exactly like a linear one (in possibly different coordinates $z(t) = \Phi(x(t))$). How do we know when its possible? There are conditions on $f$ and $g$ that tell us when this problem is solvable. Since these are conditions on the state-space formulation and apply even when $x$ is a vector, the results apply equally well to higher-order nonlinear control systems. We just rewrite them in state-space form.
There are, of course, exceptions. If you have a singular ODE then it may not have a state-space formulation. But, most systems are not singular in practice, so they are often not studied nor presented in most literature on control theory (although there are those that study such systems).
The upshot here is that by analyzing the state-space formulations, we can describe techniques that apply equally well to higher-order systems. It is to make our lives easier by giving a standard presentation that covers both lower and higher order systems equally. By the way, it isn't just state-space formulations that are used in the nonlinear case. We also work with input-output formulations for nonlinear systems; for linear systems the most commonly used input-output formulation is the transfer function! Systems are almost always rewritten in the input-output (transfer functions, operators, etc.) or state-space formulation.
As a side note, try to design a controller meeting specific settling-time requirements for a 4th-order linear ODE with some zero dynamics. Do not go into Laplace domain nor into state-space form. I think you'll find yourself having a hard time. I sure would. The difficulty only ramps up when you have multiple-inputs and multiple-outputs. It is doable but I think the procedures are more cleanly presented in state-space/input-output than in the original ODE. That itself to me makes it worth the while. I wouldn't even bother working with the nonlinear ODE without transforming it into an input-output representation (operator of some kind) or state-space representation.
Best Answer
We can solve the equation even algebraically. The solution is given by $$ x=\frac{1}{3}(4y^2( 4 - y^2)), $$ where $y$ satisfies the equation $$ 4y^6 - 16y^4 + 9=0. $$ This is a cubic equation in $z=y^2$, where we have a formula. We obtain four real solutions and two complex ones.