Let me work with $n$ dimensions: you want to study the vector field
$$
X=\sum_{1\le j\le n} a_j(x)\frac{\partial}{\partial x_j},
\tag {1}$$
and in particular find the so-called first integrals of $X$, i.e. the functions $f$ such that $Xf=0$. You introduce the system of ODEs:
$$
\dot x(t,y)=a(x(t,y)),\quad x(0,y)=y.
\tag {2}$$
The solutions $t\mapsto x(t,y)$ are the integral curves of $X$.
You realize easily that a function is a first integral iff it is constant along the integral curves of $X$: just compute
$$
\frac{d}{dt}\bigl(f(x(t,y))\bigr)=\sum_{1\le j\le n} \frac{\partial f}{\partial x_j}(x(t,y))\,a_j(x(t,y))=(Xf)(x(t,y)).
$$
It means that solving the PDE $Xf=0$ is somehow equivalent to solving the ODE system (2).
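As a numerical illustration (my own example, not from the text): for the rotation field $X=-y\,\partial_x+x\,\partial_y$, the function $f(x,y)=x^2+y^2$ satisfies $Xf=0$, so $f$ should stay constant along the integral curves of (2).

```python
# Sketch: check that a first integral is constant along integral curves of (2).
import numpy as np
from scipy.integrate import solve_ivp

def a(t, x):
    # coefficients a_j of the vector field X = -y d/dx + x d/dy
    return [-x[1], x[0]]

def f(x):
    # candidate first integral: Xf = (-y)(2x) + (x)(2y) = 0
    return x[0]**2 + x[1]**2

y0 = [1.0, 0.5]                            # initial point y in (2)
sol = solve_ivp(a, (0.0, 10.0), y0, dense_output=True, rtol=1e-10, atol=1e-12)
values = [f(sol.sol(t)) for t in np.linspace(0.0, 10.0, 50)]
print(max(values) - min(values))           # ~0: f is constant along the curve
```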
Now the notational business. It is tempting to write (2), which is $
\frac{dx_j}{dt}=a_j(x),\quad 1\le j\le n,
$
symbolically as
$$
\frac{dx_1}
{a_1(x)}=\dots=\frac{dx_n}
{a_n(x)}
$$
since all these ratios are equal to $dt$! Just take this as a symbolic notation which eliminates the parameter $t$.
Now the Cauchy problem for this autonomous vector field $X$: find a hypersurface $\Sigma$ to which $X$ is transverse, i.e. $X$ is nowhere tangent to $\Sigma$. Then the Cauchy problem
$$
\begin{cases}
Xu=f,\\
u_{\vert \Sigma}=g
\end{cases}
$$
has locally a unique solution: this problem is equivalent to the scalar ODE
$$
\frac{d}{dt}\bigl( u(x(t,y))\bigr)=f(x(t,y)),\quad u(x(0,y))=u(y)=g(y) \text{ for $y\in \Sigma$},
$$
so that
$$
u(x(t,y))= u(y)+\int_0^tf(x(s,y)) ds\quad \text{ for $y\in \Sigma$}.
\tag{3}$$
Note that $y$ moves on $\Sigma$ ($n-1$ degrees of freedom) and $t$ in $\mathbb R$, so that $(y,t)\in \Sigma\times\mathbb R$ is a nice choice of coordinates.
There are variants of this when the vector field is not autonomous, i.e. is of type
$$\frac{\partial}{\partial t}+
\sum_{1\le j\le n} a_j(t,x)\frac{\partial}{\partial x_j}.
$$
More comments on the quasi-linear case and the general method of characteristics:
the quasi-linear Cauchy problem
$$
\frac{\partial u}{\partial t}+\sum_{1\le j\le n} a_j(t,x, u)\frac{\partial u}{\partial x_j}=b(t,x,u),\quad u(0,x)=u_0(x)
\tag{4}$$
has a linear companion
$$
\frac{\partial F}{\partial t}+\sum_{1\le j\le n} a_j(t,x, v)\frac{\partial F}{\partial x_j}+b(t,x,v)\frac{\partial F}{\partial v}=0,\quad F(0,x,v)=v-u_0(x)
\tag{5}$$
where $t,x,v$ are independent variables. It is not difficult to solve (5) using the linear method of characteristics outlined above. Then, since $\partial F/\partial v=1$ at $t=0$, the equation
$
F(t,x,v)=0
$
determines implicitly $v=u(t,x)$ near $t=0$, and expressing the derivatives of $u$ in terms of derivatives of $F$, e.g.
$
\partial u/\partial x_j=-\frac{\partial F/\partial x_j}{\partial F/\partial v},
$
shows that $u$ solves the Cauchy problem (4). Here also the notational industry is working full throttle. People would write
$$
\dot x=a(t,x,u),\quad \dot u=b(t,x,u),\quad
\text{also written }
\frac{dt}{1}=\frac{dx_j}{a_j}=\frac{du}{b},\quad 1\le j\le n.
$$
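A hedged sketch of (4)-(5) on the simplest quasi-linear example, Burgers' equation $u_t+u\,u_x=0$ (so $a=u$, $b=0$): the companion problem (5) is solved by $F(t,x,v)=v-u_0(x-tv)$, and $F(t,x,v)=0$ defines $u(t,x)$ implicitly before the characteristics cross. The datum $u_0(x)=x$ is my own choice, made so the exact answer $u(t,x)=x/(1+t)$ is available for comparison.

```python
# Sketch: recover u(t, x) for Burgers' equation from the implicit
# equation F(t, x, v) = v - u0(x - t v) = 0, via root-finding in v.
from scipy.optimize import brentq

def u0(x):              # initial datum u(0, x) = x (chosen for an exact check)
    return x

def F(t, x, v):
    return v - u0(x - t * v)

def u(t, x):
    # solve F(t, x, v) = 0 in v by bisection (valid before any shock forms)
    return brentq(lambda v: F(t, x, v), -100.0, 100.0)

# for u0(x) = x the exact solution is u(t, x) = x / (1 + t)
print(u(1.0, 3.0))      # 1.5
```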
For simplicity, consider solutions where $ u$ does not depend on $x, y$:
$A u_t + B(t) u = 0$. If $y^T A = 0$, that says $y^T B(t) u = 0$, so $u$ is
restricted to belong to a certain (possibly $t$-dependent) subspace.
Thus for your example $$A = \begin{pmatrix}1 & 0 & 0 & 0\\ 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0\\ 0 & 0 & 0 & -1\end{pmatrix},$$
taking $y=e_2$ and $y=e_3$ gives $(B(t) u)_2 = (B(t) u)_3 = 0$. If the appropriate $2 \times 2$ submatrix of $B(t)$ is invertible, this lets you express $u_2$ and $u_3$ in terms of $u_1$ and $u_4$, and you get a periodic linear system for $u_1$ and $u_4$.
The solutions
are usually not periodic in $t$. Rather, the monodromy operator $u(0) \mapsto u(2\pi)$ will have eigenvalues $\lambda$ corresponding to solutions with $u(2\pi) = \lambda u(0)$ (see Floquet theory).
Somewhat more generally, solutions of the form $u(x,y,t) = \exp(\alpha x + \beta y) v(t)$
lead to the same type of system (with $\alpha \cos(t) + \beta \sin(t)$ added to $B$).
EDIT: If the submatrix of $B(t)$ is not invertible for some $t$, you may find that some or all of the nontrivial solutions have singularities at those $t$.
Best Answer
Chapter VII of É. Goursat's book, Leçons sur l'intégration des équations aux dérivées partielles du premier ordre, presents the method and ends with 14 examples of its application (pp. 168-169).