$$
Ay''-By^3+Cy=0
$$
This is an autonomous ODE. The method to solve it is :
$$
2A^2y''y'-2ABy^3y'+2ACyy'=0\quad\to\quad A^2(y')^2-\frac{AB}{2}y^4+ACy^2=c_1
$$
$$
Ay'=\pm\sqrt{\frac{AB}{2}y^4-ACy^2+c_1}
$$
$$
dx=A\frac{dy}{\sqrt{\frac{AB}{2}y^4-ACy^2+c_1}}\quad\to\quad x=A\int \frac{dy}{\sqrt{\frac{AB}{2}y^4-ACy^2+c_1}}
$$
$$
x=\sqrt{\frac{2A}{B}}\int \frac{dy}{\sqrt{(y^2-y_1^2)(y^2-y_2^2)}}=
\sqrt{\frac{2A}{B}} \frac{1}{y_2}\:\text{F}\bigg(\sin^{-1}(\frac{y}{y_1})\:,\:\frac{y_1}{y_2} \bigg)
$$
where $y_1^2$ and $y_2^2$ are the roots of $\quad \frac{AB}{2}Y^2-ACY+c_1=0$
F$(\:\:,\:\:)$ is the elliptic integral of first kind.
with the related inverse function :
$$
y=y_1\:\text{sn}\left(\sqrt{\frac{B}{2A}} \:y_2\: x \:\:,\:\:\frac{y_1}{y_2} \right)
$$
sn$(\:\:,\:\:)$ is the Jacobi elliptic sn function.
DAE vs. ODE
Almost any DAE system can be reduced to an ODE system. As this requires derivatives of the equation, the equations themselves have to be differentiable to the required order.
$\newcommand{\pd}[2]{\frac{\partial#1}{\partial#2}}$
In your example, you could, as per comment, solve the second equation for $y$ and insert into the first one. This is the same as taking the derivative of the second equation to get a differential equation for $y$,
$$
\pd{g}{t}(x,y,t)+\pd{g}{x}(x,y,t)\cdot f(x,y,t)+\pd{g}{y}(x,y,t)\cdot \dot y=0.
$$
As is visible, and also demanded by the implicit function theorem, this only works if $\pd{g}{y}$ is invertible. If that is not the case, further derivatives of the equations may give rise to a complete ODE system, the maximal number of necessary differentiations of any equation is the index of the DAE.
Consequently, any ODE system is an index-0 DAE system.
This process towards an ODE may fail, either because the equations are not smooth enough like in $x_1'=x_2,~ x_1=q$, when $q$ is not differentiable. But also the process of index determination can fail to stop, that is, there is no differentiation order at which one can extract explicit equations for the highest order derivatives. In other words, there may not be any consistent system state, consistent with all equations and their derivatives.
Usefulness of DAE
Especially physical systems can be encoded more closely to the physical description, the first principles, using DAE systems. This enables software like modelica where large systems are constructed from basic building blocks having an inner dynamic of their state and pins/variables connecting to the outside and other building blocks.
For instance, consider the pendulum as mechanical system restrained to a circle,
\begin{align}
\ddot x+\lambda x&=0
\\
\ddot y + g/m + \lambda y &= 0
\\
x^2+y^2-r^2&=0
\end{align}
or the corresponding first order system. While the algebraic equation is solvable for one of the variables, this will not give a dynamical equation for the Lagranian variable $\lambda$, one needs 2 derivatives to eliminate $\lambda$ and $3$ derivatives of the equations for an ODE for $\lambda$.
This system directly expresses the physical situation in Cartesian coordinates, containing the gravity force as gradient of the potential and the gradient of the surface with its multiplier as virtual force. While mathematically simpler, the transformation to polar coordinates as in the reduced pendulum equation loses this direct physical context.
Best Answer
The flow is an important idea in the sense that it allows us to examine the behavior of solutions to an ODE at a "larger scale." What I mean by this is that instead of tracking a single solution, emanating from a point, we can track an ensemble of solutions, emanating from a set of points, all at the same time, and we can use this to study how the ODE distorts these points.
This is particularly important in applications of ODEs in PDE, and especially important in continuum mechanics. For instance, let's consider the following transport equation: $$ \partial_t u(t,x) + a(x) \cdot \nabla u(t,x) =0 $$ where $a: \mathbb{R}^n \to \mathbb{R}^n$ is Lipschitz. If we let $\varphi : \mathbb{R} \times \mathbb{R}^n \to \mathbb{R}^n$ denote the flow map associated to $a$ then we have that $$ \frac{d}{dt} u(t,\varphi(t,x)) = \partial_t u(t,\varphi(t,x)) + \partial_t \varphi(x,t) \cdot \nabla u(t,\varphi(t,x)) \\ = \partial_t u(t,\varphi(t,x)) + a(\varphi(t,x)) \cdot \nabla u(t,\varphi(t,x)) =0 $$ which tells us that $u$ is "constant along the flow." In particular, if we specify the initial condition $u(0,x) = g(x)$, then $$ u(t,\varphi(t,x)) = u(0,\varphi(0,x)) = u(0,x) = g(x) $$ and hence we can solve for $u$ via $$ u(t,x) = g(\varphi^{-1}(t,x)) = g(\varphi(-t,x)). $$ This establishes a key relationship between the flow of a vector field and the "transport" operator / "convective derivative" induced by the vector field $a$. This plays an essential role in continuum mechanics in going back and forth between the material (Lagrangian) and laboratory (Eulerian) coordinate frames.
Since you already know that the flow map induces a $1-$parameter family of diffeomorphisms, I would suggest studying flows by trying to prove another one of the main theorems about them, Liouville's theorem. It says that $$ \det D \varphi(t,x) = \exp\left( \int_0^t \text{div}f(\varphi(s,x)) ds \right) $$ whenever $\varphi$ is the flow associated to $f$. This is another extremely important result in continuum mechanics, as it builds a link between incompressibility (the divergence free condition) and the local preservation of volume. Indeed, if $\text{div} f=0$, then Liouville shows that $$ \det D \varphi(t,x) =1 \text{ for all }t,x, $$ and so if $U \subseteq \mathbb{R}^n$ is a measurable set, then $$ |\varphi(t,U)| = \int_{U} |\det D \varphi(t,x)| dx = \int_U dx = |U|, $$
which tells us that the volume, or Lebesgue measure, of $U$ is preserved along the flow for all measurable sets.