I think that the problematic part here is the notion of what demands what. For example, you state

> This is the standard argument by which Lorentz invariance is found to demand gauge invariance for massless particles.

but I am not completely sure that I agree with this, or at least with the interpretation you attach to it.
Taking a look at Weinberg's book on QFT [1], Section 5.9, I see a different way of looking at this issue. Weinberg notices that there is no consistent way of constructing an operator for the spin-1 massless field. The equivalent of your statement is that, if we insist on constructing that operator anyway, it can only be defined up to the addition of a derivative of some other operator.
Now, my interpretation (and I don't think that, at the time of this reply, there is a unanimous and unambiguous interpretation of this) is that the statement of gauge invariance is that a single massless spin-1 field is only defined up to the addition of a gradient. The important part is that this statement is true even in the absence of matter coupled to the field.
My understanding of your question leads me to think that you are putting two distinct notions on the same footing:
1. The intrinsic gauge invariance of a single massless spin-1 field, which is required to make sense of the corresponding operator.
2. The gauge invariance of an interacting theory with matter coupled to a massless spin-1 field.
I point out this distinction because point 2. is the one that forces you to put a collection of massless spin-1 fields in the adjoint representation, so that the kinetic term of matter with covariant derivatives is invariant under a gauge transformation of the matter fields, whereas point 1. tells you that each of the massless spin-1 fields has an intrinsic gauge invariance, where each is identified with itself plus a gradient.
I therefore propose an interpretation that would resolve this issue, and it goes like this:
- a) We know that in order to make sense of spin-1 massless fields, their operators have to enjoy a gauge invariance of the form
$$\hat A_\mu \to \hat A_\mu + \partial_\mu \hat \Omega $$
so we allow the classical counterpart to enjoy this freedom from the outset (point 1.).
- b) Due to the previous point, we notice that theories of local $U(1)$ symmetries of the matter fields
$$ \psi \to e^{i \theta(x)} \psi $$
with the matter coupled to the field $A_\mu$ through
$$\mathcal L = i \overline \psi \gamma \cdot \partial \psi + \overline \psi \gamma \cdot A \psi $$
are immediately invariant, since the new term arising from the derivative, proportional to the gradient of the phase $\theta$, can be absorbed into the field $A_\mu$ by a gauge transformation.
- c) For non-abelian theories, while it is possible to construct them without interactions with matter, their motivation comes from empirical evidence in which they do interact with matter. In such formulations we know that the theory is invariant if there is a collection of massless spin-1 fields that transforms in the adjoint representation (this is point 2.).
- d) Furthermore, because of the presence of derivatives, there will still be a gradient of a Lie-algebra-valued quantity (the phases), which can be absorbed by the collection of massless spin-1 fields thanks to their intrinsic gauge invariance (point 1.).
In some sense you can read point c) as what some people associate with the global (take this term with many grains of salt) part of the gauge transformation, and d) as the local part. In the end the theory works because the requirement for an interacting gauge theory plays along nicely with the quantum mechanical requirements for writing down the corresponding massless spin-1 fields!
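As a quick check of point b), here is the computation spelled out. This is only a sketch: it assumes the signs and normalization of the Lagrangian written above and identifies $\hat\Omega$ with the phase $\theta$ (that identification is my assumption; with other coupling conventions a sign or a factor of the charge changes). Under $\psi \to e^{i\theta(x)}\psi$ and $A_\mu \to A_\mu + \partial_\mu \theta$,
$$
\begin{aligned}
i \overline \psi \gamma \cdot \partial \psi &\to i \overline \psi \gamma \cdot \partial \psi - \overline \psi \, (\gamma \cdot \partial \theta) \, \psi, \\
\overline \psi \gamma \cdot A \psi &\to \overline \psi \gamma \cdot A \psi + \overline \psi \, (\gamma \cdot \partial \theta) \, \psi,
\end{aligned}
$$
so the two extra terms cancel and $\mathcal L$ is unchanged.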
Would this help you? I have thought about similar issues (I wrote, for example, [2], even though it's not directly related to your issue) and I've come to believe that the interpretation I constructed and provided above solves a lot of conceptual problems in gauge theories (namely those of the simple Lie group type).
So to answer your questions explicitly:
- Each field operator (meaning each of the $a$, using your index notation) in the non-abelian case transforms just as in the abelian case under Lorentz transformations, that is, with an inhomogeneous derivative term.
- The collection of fields (all the $a$) transforms in accordance with Coleman-Mandula, as it transforms under the adjoint representation, and in addition each $a$ has the "freedom" to be defined up to a derivative. This freedom plays nicely in the presence of matter, where derivatives of phases appear, and in the absence of matter it can be used to write each operator in a convenient gauge.
References
[1] Quantum Theory of Fields - Vol 1, S. Weinberg
[2] Conceptual Challenges of Gauge Symmetry, Miguel Crispim Romão
https://www.academia.edu/5659587/Conceptual_Challenges_of_Gauge_Symmetry
Classically, the gauge field strength is a curvature of a connection, the same way that the Riemann tensor is. Since $F^a_{\mu\nu}$ lives in the adjoint representation of the gauge group, you can define a 4-index object very analogous to the Riemann tensor:
$$\mathcal{F}^a{}_{b\mu\nu} \equiv F^c_{\mu\nu} f_c{}^a{}_b,$$
where $f_c{}^a{}_b$ are the structure constants defined via
$$[t_c, t_b] = f_c{}^a{}_b \, t_a,$$
modulo factors of $i$ if you care to insert them. In any case, the object $\mathcal{F}^a{}_{b\mu\nu}$ contains information about parallel transport around infinitesimal loops, in the same sense that the Riemann tensor does. But the vector being translated is not a tangent vector; instead it is a vector in gauge space (or "internal" space), whose components are defined via expansion in the generators $t_a$, as in $V = V^a \, t_a$.
So, $\mathcal{F}^a{}_{b\mu\nu}$ describes the change in $V = V^a \, t_a$ as it is parallel-transported (via the covariant derivative $D_\mu \equiv \partial_\mu + A_\mu$, again modulo factors of $i$, etc.) around a small parallelogram in the directions $\partial_\mu, \partial_\nu$.
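The index bookkeeping above can be checked numerically. The following is a minimal sketch, not part of the original answer, under stated assumptions: the gauge algebra is taken to be $su(2)$ with the anti-Hermitian basis $t_a = -i\sigma_a/2$ (so the structure constants are real), and the field strength is assumed to act on the adjoint-valued $V$ through the commutator $[F_{\mu\nu}, V]$. With the definitions above, $[F_{\mu\nu}, V] = F^c_{\mu\nu} V^b [t_c, t_b] = \mathcal{F}^a{}_{b\mu\nu} V^b \, t_a$, and the code verifies this for arbitrary sample components at one fixed pair $\mu, \nu$.

```python
import numpy as np

# Pauli matrices; the anti-Hermitian basis t_a = -i*sigma_a/2 is an assumed
# convention, chosen so that [t_c, t_b] = f_c^a_b t_a holds with real f.
sigma = np.array([
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
], dtype=complex)
t = -0.5j * sigma

# Levi-Civita symbol eps[i, j, k].
eps = np.zeros((3, 3, 3))
for i, j, k in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
    eps[i, j, k] = 1.0
    eps[j, i, k] = -1.0

# f[c, a, b] stores f_c^a_b; for su(2) in this basis it equals eps_{cba}.
f = eps.transpose(0, 2, 1)


def commutator(x, y):
    return x @ y - y @ x


# Check the defining relation [t_c, t_b] = f_c^a_b t_a for all c, b.
algebra_ok = all(
    np.allclose(commutator(t[c], t[b]), np.einsum('a,aij->ij', f[c, :, b], t))
    for c in range(3) for b in range(3)
)

# Arbitrary sample components F^c (for one fixed mu, nu pair) and V^b.
F_comp = np.array([0.3, -1.2, 0.7])
V_comp = np.array([1.0, 0.5, -2.0])

# curlyF[a, b] = F^c f_c^a_b, the 4-index object at that mu, nu pair.
curlyF = np.einsum('c,cab->ab', F_comp, f)

F_mat = np.einsum('c,cij->ij', F_comp, t)
V_mat = np.einsum('b,bij->ij', V_comp, t)

# Compare [F, V] with (curlyF^a_b V^b) t_a.
lhs = commutator(F_mat, V_mat)
rhs = np.einsum('a,aij->ij', curlyF @ V_comp, t)
action_ok = np.allclose(lhs, rhs)

print(algebra_ok, action_ok)
```

If both flags print as `True`, the matrix $\mathcal{F}^a{}_b$ acts on the component vector $V^b$ exactly as the commutator with $F$ acts on $V = V^a \, t_a$, which is the sense in which it plays the role of a curvature in internal space.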
Best Answer
TL;DR: Vanishing field strength $F=0$ does not imply that the gauge potential $A$ is pure gauge. The implication only holds locally. There could be global obstructions. In fact, topological obstructions could happen even if the gauge group $G$ is Abelian.
More details:
The starting point is a connected (but not necessarily simply connected) gauge Lie group $G$ and a globally defined gauge potential $A$ on a connected (but not necessarily simply connected) spacetime manifold $M$. In this answer the covariant derivative is by convention $\mathrm{D}=\mathrm{d}-A$, i.e. $A$ is typically an anti-Hermitian matrix-valued 1-form. A gauge transformation takes the form $$A^{\prime}~=~-U(\mathrm{D} U^{-1}), \qquad U~\in~G.\tag{1}$$
Let us warm up by reviewing the easy direction. If $A^{\prime}$ is pure gauge $A^{\prime}=-U(\mathrm{d} U^{-1})$, then there exists a gauge transformation (1) such that the new gauge potential $A=0$ vanishes identically, and hence the (new and old) field strengths $F^{\prime}=UFU^{-1}=0$ vanish identically.
Next let us return to OP's question and sketch the proof of the opposite implication in a simply connected region $\Omega\subseteq M$ containing a fiducial point $x_{0}\in M$:
For a point $x\in \Omega$ choose a path/curve $C$ from $x_0$ to $x$.
Define a group element via a Wilson line$^1$ $$ U(x,x_0)~:=~P e^{\int_{C} \!A},\tag{2}$$ where $P$ denotes path ordering.
Next use the non-Abelian Stokes' theorem to argue that this definition (2) does not depend on the curve $C$, because $F=0$.
Finally, use the group-valued section (2) to gauge transform the gauge potential $A$ to be zero.
Example: Consider the punctured plane $M=\mathbb{R}^2\backslash\{(0,0)\}$ with coordinates $$ \begin{align}x~=~&r\cos\theta, \qquad y~=~r\sin\theta, \cr \theta~\sim~&\theta+2\pi,\qquad r~>~0;\end{align}\tag{3}$$ and with Abelian gauge group $G=U(1)$.
Let the (imaginary-valued) gauge potential 1-form be $$-iA~=~\frac{x\mathrm{d}y-y\mathrm{d}x}{x^2+y^2}~=~\mathrm{d}\theta~=~-iU^{-1}\mathrm{d}U,\tag{4}$$ where $$U(x,y)~=~e^{i\theta(x,y)}~\in~G\tag{5}$$ is a globally well-defined group-valued section. The gauge potential (4) is therefore pure gauge, and its field strength $F$ vanishes. However, if we scale $A\to \lambda A$ in eq. (4) with a non-integer constant $\lambda\in\mathbb{R}\backslash\mathbb{Z}$, then $A$ will no longer be pure gauge (because the corresponding $U=e^{i\lambda\theta}$ becomes multivalued), but $F$ will still be zero.
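The example can also be checked numerically. The sketch below is an illustration only, and the helper names `a_components`, `curl` and `holonomy` are hypothetical (they are not from the original answer). It estimates the curl of the real 1-form $\mathrm{d}\theta$ away from the origin by central differences, and approximates the Wilson loop $e^{\oint \lambda A} = e^{i\lambda\oint \mathrm{d}\theta}$ around the unit circle with a midpoint rule. The curl should come out numerically zero for any $\lambda$, while the loop should be close to $1$ only for integer $\lambda$, which mirrors the multivaluedness of $U=e^{i\lambda\theta}$.

```python
import numpy as np


def a_components(x, y):
    """Components (a_x, a_y) of the real 1-form d(theta) = (x dy - y dx)/(x^2 + y^2)."""
    r2 = x**2 + y**2
    return -y / r2, x / r2


def curl(x, y, h=1e-5):
    """Central-difference estimate of d_x a_y - d_y a_x (field strength up to a constant factor)."""
    day_dx = (a_components(x + h, y)[1] - a_components(x - h, y)[1]) / (2 * h)
    dax_dy = (a_components(x, y + h)[0] - a_components(x, y - h)[0]) / (2 * h)
    return day_dx - dax_dy


def holonomy(lam, n=2000):
    """Approximate exp(i * lam * loop integral of d(theta)) around the unit circle."""
    t = np.linspace(0.0, 2.0 * np.pi, n + 1)
    x, y = np.cos(t), np.sin(t)
    # Evaluate the 1-form at segment midpoints and contract with the segment vectors.
    xm, ym = 0.5 * (x[1:] + x[:-1]), 0.5 * (y[1:] + y[:-1])
    ax, ay = a_components(xm, ym)
    integral = np.sum(ax * np.diff(x) + ay * np.diff(y))
    return np.exp(1j * lam * integral)


print(abs(curl(0.7, -0.4)))   # close to zero: F vanishes away from the origin
print(holonomy(1.0))          # close to 1: trivial loop for integer lambda
print(holonomy(0.5))          # close to -1: nontrivial loop for lambda = 1/2
```

A nontrivial loop value around the puncture is what prevents the construction via the Wilson line (2) from being path independent on the whole of $M$, even though $F=0$ everywhere on $M$.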
--
$^1$ If $G$ is not simply connected then work in the universal covering group $\tilde{G}$. We can always later project down to $G$.