Very briefly, a classical theory is a gauge theory if its field variables $\varphi^i(\vec{x},t)$ have a non-trivial local gauge transformation that leaves the action $S[\varphi]$ invariant. Usually, a gauge transformation is required to be a continuous transformation.
[Gauge theory is a huge subject, and I only have time to give a brief explanation here; for a more complete answer I defer to, e.g., the book "Quantization of Gauge Systems" by M. Henneaux and C. Teitelboim. By the word local is meant that the gauge transformations at different space-time points can be performed independently, without affecting each other (as opposed to a global transformation). By the word non-trivial is meant that the gauge transformation does not vanish identically on-shell. Note that an infinitesimal gauge transformation does not have to be of the form
$$\delta_{\varepsilon}A_{\mu}(\vec{x},t) = D_{\mu}\varepsilon(\vec{x},t),$$
nor does it have to involve an $A_{\mu}$ field. More generally, an infinitesimal gauge transformation is of the form
$$\delta_{\varepsilon}\varphi^i(x) = \int d^d y \ R^i{}_a (x,y)\varepsilon^a(y),$$
where $R^i{}_a (x,y)$ are Lagrangian gauge generators and $\varepsilon^a$ are infinitesimal gauge parameters. The generators form a gauge algebra, which may be open and reducible. Besides gauge transformations that are continuously connected to the identity transformation, there may be so-called large gauge transformations, which are not continuously connected to the identity, and the action is not always invariant under those. Ultimately, physicists want to quantize classical gauge theories using, e.g., the Batalin-Vilkovisky formalism, but let's leave quantization for a separate question. Various subtleties arise at the quantum level, as pointed out, e.g., in the comments below. Moreover, some quantum theories do not have classical counterparts.]
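As a small illustration of the general form (a sketch for Maxwell theory, assuming the only field is $A_{\mu}$ and there is a single gauge parameter $\varepsilon$), taking the generator to be a derivative of a delta function reproduces the familiar transformation:
$$R_{\mu}(x,y) = \frac{\partial}{\partial x^{\mu}}\delta^{d}(x-y) \quad\Rightarrow\quad \delta_{\varepsilon}A_{\mu}(x) = \int d^d y \ R_{\mu}(x,y)\varepsilon(y) = \partial_{\mu}\varepsilon(x).$$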
Yang-Mills theory is just one example out of many of a gauge theory, although the most important one. To name a few other examples: Chern-Simons theory and BF theory are gauge theories. Gravity can be viewed as a gauge theory.
Yang-Mills theory without matter is called pure Yang-Mills theory.
The Lagrangian of Yang-Mills theory coupled to scalars/fermions, etc. takes the form
$$
{\cal L}_{YM} = - \frac{1}{2} \text{Tr} F_{\mu\nu} F^{\mu\nu} + (D_\mu \phi) (D^\mu \phi)^* + i {\bar \psi} \gamma^\mu D_\mu \psi + \cdots
$$
where the $\cdots$ represents other interaction terms that might be present. Let me explain the notation in the above expression.
$\phi_i$ and $\psi_i$ are multiplets in some representation $R$ of the gauge group $G$. Here, $i = 1, \cdots, \dim R$.
The generators in representation $R$ are denoted as $T^a$. These are normalized to satisfy
$$
[T^a, T^b] = i f^{ab}{}_c T^c,~~~~ \text{Tr} ( T^a T^b )= \frac{1}{2}\delta^{ab}
$$
In other words, the $T_{ij}^a$'s are just some set of matrices satisfying the above properties.
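As a sanity check of these two conditions, here is a minimal sketch in plain Python. It assumes a specific example that the text does not fix: $G = SU(2)$ in the fundamental representation, with $T^a = \sigma^a/2$ and $f^{abc} = \epsilon^{abc}$. The helper names are made up for the illustration.

```python
# Assumption: G = SU(2), fundamental representation, T^a = sigma^a / 2,
# structure constants f^{abc} = epsilon^{abc}. Helper names are illustrative.

SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
T = [[[x / 2 for x in row] for row in s] for s in SIGMA]


def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def commutator(a, b):
    """Matrix commutator [a, b] = ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[ab[i][j] - ba[i][j] for j in range(n)] for i in range(n)]


def trace(a):
    return sum(a[i][i] for i in range(len(a)))


def epsilon(a, b, c):
    """Levi-Civita symbol for indices taking values 0, 1, 2."""
    return (a - b) * (b - c) * (c - a) // 2


def algebra_residual():
    """Largest entry of [T^a, T^b] - i eps^{abc} T^c over all a, b."""
    worst = 0.0
    for a in range(3):
        for b in range(3):
            lhs = commutator(T[a], T[b])
            for i in range(2):
                for j in range(2):
                    rhs = sum(1j * epsilon(a, b, c) * T[c][i][j]
                              for c in range(3))
                    worst = max(worst, abs(lhs[i][j] - rhs))
    return worst


def trace_residual():
    """Largest deviation of Tr(T^a T^b) from delta^{ab} / 2."""
    worst = 0.0
    for a in range(3):
        for b in range(3):
            expected = 0.5 if a == b else 0.0
            worst = max(worst, abs(trace(matmul(T[a], T[b])) - expected))
    return worst


print(algebra_residual(), trace_residual())
```

Both residuals should vanish up to rounding if the generators satisfy the commutation relation and the normalization $\text{Tr}(T^a T^b) = \frac{1}{2}\delta^{ab}$.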
The covariant derivatives acting on the fields are
\begin{align}
(D_\mu \phi)_i &= \partial_\mu \phi_i - i g T^a_{ij} A_\mu^a \phi_j \\
(D_\mu \psi)_i &= \partial_\mu \psi_i - i g T^a_{ij} A_\mu^a \psi_j
\end{align}
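where $(A_\mu)_{ij} = A_\mu^a T_{ij}^a$. The field strength is defined through $[D_\mu, D_\nu] = - i g F_{\mu\nu}$, which gives
$$
F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu - i g [A_\mu, A_\nu ]
$$
Explicitly, writing $F_{\mu\nu} = F_{\mu\nu}^a T^a$ and using the commutation relations above,
$$
F_{\mu\nu}^a = \partial_\mu A^a_\nu - \partial_\nu A_\mu^a + g f_{bc}{}^a A^b_\mu A_\nu^c
$$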
This completely specifies the Lagrangian of Yang-Mills theory. You can now use the variational principle to determine the equations of motion.
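As a sketch of that last step, consider pure Yang-Mills theory (matter fields dropped, which is an assumption made here to keep the example short). Writing $F_{\mu\nu} = F^a_{\mu\nu} T^a$ and using the trace normalization above, the gauge kinetic term becomes
$$
- \frac{1}{2} \text{Tr} F_{\mu\nu} F^{\mu\nu} = - \frac{1}{4} F^a_{\mu\nu} F^{a\mu\nu}
$$
Varying with respect to $A_\nu$, with the covariant derivative acting on the matrix-valued $F^{\mu\nu}$ through a commutator, should then give
$$
D_\mu F^{\mu\nu} \equiv \partial_\mu F^{\mu\nu} - i g [A_\mu, F^{\mu\nu}] = 0
$$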
Best Answer
A single field does not transform in all representations. A particular field transforms in a particular representation and you can have more than one field, each transforming in their own representations.
A fundamental field $\phi_i$ transforms as $$ \phi_i \to \phi_i - i \theta^a (T^a_{fund})_{ij} \phi_j + O(\theta^2). $$ An adjoint field $\phi^a$ transforms as $$ \phi^a \to \phi^a - i \theta^b (T^b_{adj})^{ac} \phi^c + O(\theta^2) = \phi^a + f^{abc} \theta^b \phi^c + O(\theta^2), $$ where $(T^b_{adj})^{ac} = - i f^{bac}$, and similarly for fields in other representations.
As a side note, it is often said that a gauge field transforms in the adjoint, but its transformation is not the same as above. This is because the gauge field is a connection. It transforms as $$ A_\mu^a \to A_\mu^a + f^{abc} \theta^b A_\mu^c + \partial_\mu \theta^a + O(\theta^2). $$ Notice that this transformation is basically the same as that of $\phi^a$, except for the extra inhomogeneous term $\partial_\mu \theta^a$ (whose sign and normalization depend on the conventions chosen for the coupling).
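The homogeneous part shared by both transformations, $f^{abc} \theta^b \phi^c$, can be checked against the generator form in a concrete case. The following is a minimal sketch in plain Python. It assumes $G = SU(2)$ with $f^{abc} = \epsilon^{abc}$ and the common convention $(T^b_{adj})^{ac} = - i f^{bac}$ for the adjoint generators. The function names are hypothetical.

```python
# Assumption: G = SU(2), f^{abc} = epsilon^{abc}, and adjoint generators
# defined by (T^b)^{ac} = -i f^{bac}. Function names are illustrative.

def epsilon(a, b, c):
    """Levi-Civita symbol for indices taking values 0, 1, 2."""
    return (a - b) * (b - c) * (c - a) // 2


# T_ADJ[b][a][c] holds the matrix element (T^b)^{ac}.
T_ADJ = [[[-1j * epsilon(b, a, c) for c in range(3)] for a in range(3)]
         for b in range(3)]


def delta_from_generators(theta, phi):
    """First-order change -i theta^b (T^b)^{ac} phi^c."""
    return [sum(-1j * theta[b] * T_ADJ[b][a][c] * phi[c]
                for b in range(3) for c in range(3))
            for a in range(3)]


def delta_from_structure_constants(theta, phi):
    """First-order change f^{abc} theta^b phi^c."""
    return [sum(epsilon(a, b, c) * theta[b] * phi[c]
                for b in range(3) for c in range(3))
            for a in range(3)]


theta = [0.1, 0.2, 0.3]
phi = [1.0, -2.0, 0.5]
d_gen = delta_from_generators(theta, phi)
d_f = delta_from_structure_constants(theta, phi)
print(max(abs(x - y) for x, y in zip(d_gen, d_f)))
```

For SU(2) the result is just the cross product $\vec{\theta} \times \vec{\phi}$, so an infinitesimal adjoint transformation acts as a rotation of the three-component field.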