I think that the problematic part here is the notion of what demands what. For example, you state
> This is the standard argument by which Lorentz invariance is found to demand gauge invariance for massless particles.
but I am not completely sure that I agree with this, or at least with the interpretation you attach to it.
Taking a look at Weinberg's book on QFT [1], Section 5.9, I see a different way of looking at this issue. Weinberg notices that there is no consistent way of constructing a (four-vector) field operator for a massless spin-1 particle. The equivalent of your statement is that if we insist on constructing that operator anyway, it is only defined up to the addition of a derivative of some other operator.
Now, my interpretation (and I don't think that, at the time of this reply, there is a unanimous and unambiguous interpretation of this) is that the statement of gauge invariance is that a single massless spin-1 field is only defined up to the addition of a gradient term. The important part is that this statement is true even in the absence of matter coupled to the field.
My understanding of your question leads me to think that you are putting on the same footing two distinct notions:
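For concreteness, the statement in Weinberg's Section 5.9 can be written schematically as follows. This is quoted from memory, so the index placement and the precise form of $\hat\Omega$ are assumptions that should be checked against the book:
$$ U(\Lambda)\, \hat A_\mu(x)\, U^{-1}(\Lambda) ~=~ \Lambda_\mu{}^{\nu}\, \hat A_\nu(\Lambda x) + \partial_\mu \hat\Omega(x,\Lambda) $$
The first term is the usual four-vector transformation law. The second, inhomogeneous term is the derivative piece referred to above, and it is present whether or not any matter is coupled to the field.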
1. The intrinsic gauge invariance of a single massless spin-1 field, which is required to make sense of the corresponding operator.
2. The gauge invariance of an interacting theory with matter coupled to a massless spin-1 field.
I point out this distinction because point 2. is the one that forces you to put a collection of massless spin-1 fields in the adjoint representation, so that the kinetic term of matter with covariant derivatives is invariant under a gauge transformation of the matter fields, whereas point 1. tells you that each of the massless spin-1 fields has an intrinsic gauge invariance: each is identified with itself plus a gradient term.
I therefore propose an interpretation that would resolve this issue. It goes like this:
- a) We know that in order to make sense of spin-1 massless fields, their operators have to enjoy a gauge invariance of the form
$$\hat A_\mu \to \hat A_\mu + \partial_\mu \hat \Omega $$
so we allow the classical counterpart to enjoy this freedom from the outset (point 1.).
- b) Due to the previous point, we notice that theories with a local $U(1)$ symmetry of the matter fields
$$ \psi \to e^{i \theta(x)} \psi $$
in which matter is coupled to the field $A_\mu$,
$$\mathcal L = i \overline \psi \gamma \cdot \partial \psi + \overline \psi \gamma \cdot A \psi $$
are immediately invariant, because the new term arising from the derivative, namely the gradient of the phase $\theta$, can be absorbed into the field $A_\mu$ through its gauge freedom.
- c) For non-abelian theories, while it is possible to construct them without coupling to matter, their motivation comes from empirical evidence, where they interact with matter. In such formulations we know that the theory is invariant if there is a collection of massless spin-1 fields that transforms in the adjoint representation (this is point 2.).
- d) Furthermore, because of the presence of derivatives, there will still be a gradient of a Lie-algebra-valued quantity (the phases), which can be absorbed by the collection of massless spin-1 fields thanks to their intrinsic gauge invariance (point 1.).
In some sense you can read point c) as what some people relate closely to the global (take this term with a lot of grains of salt) part of the gauge transformation, and d) as the local part. In the end the theory works because the requirement for an interacting gauge theory plays along nicely with the quantum mechanical requirements for writing down the corresponding massless spin-1 fields!
Does this help? I have thought about similar issues (I wrote, for example, [2], even though it's not directly related to your issue) and I've come to believe that the interpretation I constructed and provided above solves a lot of conceptual problems in gauge theories (namely those of the simple Lie group type).
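To make points b) to d) concrete, here is a short check. The coupling constant is set to one, and the non-abelian conventions ($A_\mu = A_\mu^a T^a$ with Hermitian generators obeying $[T^a,T^b]=if^{abc}T^c$) are assumptions made for this sketch, not something fixed by the discussion above. For the $U(1)$ Lagrangian of point b), the transformation $\psi \to e^{i\theta(x)}\psi$ produces one extra term from the kinetic piece,
$$ i \overline\psi\, \gamma\cdot\partial\, \psi ~\to~ i \overline\psi\, \gamma\cdot\partial\, \psi - \overline\psi\, \gamma^\mu (\partial_\mu\theta)\, \psi , $$
which is cancelled by the shift $A_\mu \to A_\mu + \partial_\mu\theta$ in $\overline\psi \gamma\cdot A \psi$, that is, by the intrinsic freedom of point a) with $\Omega=\theta$. In the non-abelian case, with $\psi \to U\psi$ and $U = e^{i\theta^a(x) T^a}$, the same form of Lagrangian is invariant if
$$ A_\mu ~\to~ U A_\mu U^{-1} - i(\partial_\mu U)U^{-1}, \qquad \delta A_\mu^c ~=~ f^{abc}A_\mu^a\theta^b + \partial_\mu\theta^c + \mathcal O(\theta^2). $$
The first term of $\delta A^c_\mu$ is the adjoint rotation of point c), and the second is the gradient term of point d), one for each field in the collection.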
So to answer your questions explicitly:
- Each field operator (meaning each of the $a$, using your index notation) in the non-abelian case transforms just as in the abelian case under Lorentz transformations, that is, with an inhomogeneous derivative term.
- The collection of fields (all the $a$) transforms in accordance with Coleman-Mandula, as it transforms under the adjoint representation, plus each $a$ has the "freedom" to be defined up to a derivative. This freedom plays nicely in the presence of matter, where derivatives of phases appear, and in the absence of matter it can be used to write each operator in a convenient gauge.
References
[1] S. Weinberg, The Quantum Theory of Fields, Vol. 1
[2] Miguel Crispim Romão, Conceptual Challenges of Gauge Symmetry, https://www.academia.edu/5659587/Conceptual_Challenges_of_Gauge_Symmetry
Best Answer
I) The gauge transformation of the real gauge field $V$ reads
$$ e^{\widetilde{V}} ~=~e^Xe^Ve^Y, \qquad X~:=~i\Omega^{\dagger}, \qquad Y~:=~-i\Omega. \tag{1}$$
We next use the following BCH formulas
$$ e^Xe^V~\stackrel{\rm BCH}{=}~e^{V+B({\rm ad} V)X+{\cal O}(X^2)}, \qquad e^Ve^Y~\stackrel{\rm BCH}{=}~e^{V+B(-{\rm ad} V)Y+{\cal O}(Y^2)}.\tag{2} $$
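The two formulas in eq. (2) can be checked with exact arithmetic in a toy setting. The sketch below is an illustration only: it assumes strictly upper-triangular $4\times 4$ rational matrices (so all exponential, logarithm, and ${\rm ad}\,V$ series terminate), and every name in it (`exp_nil`, `log_unip`, `B`, `residuals`) as well as the sample matrices are hypothetical choices, not part of the original argument. It verifies that the error of each formula scales like the square of the small parameter.

```python
from fractions import Fraction

N = 4  # strictly upper-triangular N x N matrices are nilpotent: M**N == 0

def zeros():
    return [[Fraction(0)] * N for _ in range(N)]

def identity():
    m = zeros()
    for i in range(N):
        m[i][i] = Fraction(1)
    return m

def add(a, b, s=1):
    """Return a + s*b."""
    return [[a[i][j] + s * b[i][j] for j in range(N)] for i in range(N)]

def scale(c, a):
    return [[c * a[i][j] for j in range(N)] for i in range(N)]

def mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]

def comm(a, b):
    return add(mul(a, b), mul(b, a), -1)

def exp_nil(n):
    """Exact exponential of a nilpotent matrix (the series terminates)."""
    result, term = identity(), identity()
    for k in range(1, N):
        term = scale(Fraction(1, k), mul(term, n))
        result = add(result, term)
    return result

def log_unip(u):
    """Exact logarithm of I + (nilpotent): m - m^2/2 + m^3/3."""
    m = add(u, identity(), -1)
    result, power = zeros(), identity()
    for k in range(1, N):
        power = mul(power, m)
        result = add(result, scale(Fraction((-1) ** (k + 1), k), power))
    return result

def upper(entries):
    """Strictly upper-triangular matrix built from 6 numbers."""
    m, it = zeros(), iter(entries)
    for i in range(N):
        for j in range(i + 1, N):
            m[i][j] = Fraction(next(it))
    return m

def B(V, Z, sign):
    """B(sign * ad V) Z = Z - (sign/2)[V,Z] + (1/12)[V,[V,Z]]; higher terms vanish for N = 4."""
    c1 = comm(V, Z)
    c2 = comm(V, c1)
    return add(add(Z, scale(Fraction(-sign, 2), c1)), scale(Fraction(1, 12), c2))

def residuals(eps):
    """Largest entry of (exact log) - (first-order formula) for both halves of eq. (2)."""
    V = upper([1, 2, -1, 3, 1, 2])
    X = scale(eps, upper([2, -1, 1, 1, 3, -2]))
    Y = scale(eps, upper([-1, 1, 2, -3, 1, 1]))
    left = add(log_unip(mul(exp_nil(X), exp_nil(V))), add(V, B(V, X, +1)), -1)
    right = add(log_unip(mul(exp_nil(V), exp_nil(Y))), add(V, B(V, Y, -1)), -1)
    biggest = lambda m: max(abs(e) for row in m for e in row)
    return biggest(left), biggest(right)

for eps in (Fraction(1, 1000), Fraction(1, 2000)):
    left, right = residuals(eps)
    # Dividing by eps**2 should give values that stay bounded as eps shrinks.
    print(eps, float(left / eps**2), float(right / eps**2))
```

If the linear terms in eq. (2) were wrong (for instance, if the commutator term had the opposite sign), the printed ratios would grow like $1/\epsilon$ instead of staying bounded.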
Keeping only linear orders in $\Omega$, we get
$$\begin{align}\widetilde{V}~&\stackrel{(1)+(2)}{=}~B({\rm ad} V)X+V+B(-{\rm ad} V)Y\cr &~~~\stackrel{(4)}{=}~V+\frac{1}{2}[V,Y-X]+B_+({\rm ad} V)(X+Y),\end{align}\tag{3} $$
where
$$\begin{align} B(x)&~:=~\frac{x}{e^x-1}~=~\sum_{m=0}^{\infty}\frac{B_m}{m!}x^m~=~B_+(x)-\frac{x}{2}\cr &~=~1-\frac{x}{2}+\frac{x^2}{12}-\frac{x^4}{720}+\frac{x^6}{30240}+{\cal O}(x^8)\end{align} \tag{4} $$
and
$$\begin{align} B_+(x) &~:=~\frac{B(x)+B(-x)}{2}~=~\frac{x/2}{\tanh\frac{x}{2}} \cr &~=~1+\frac{x^2}{12}-\frac{x^4}{720}+\frac{x^6}{30240}+{\cal O}(x^8) \end{align} \tag{5} $$
are generating functions of Bernoulli numbers.
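The series coefficients quoted in eqs. (4) and (5) can be reproduced from the standard recurrence for Bernoulli numbers. The following is a small sketch with hypothetical helper names (`bernoulli_numbers`, `series_coefficients`), using the convention $B_1=-\frac{1}{2}$ that matches eq. (4).

```python
from fractions import Fraction
from math import comb, factorial

def bernoulli_numbers(n_max):
    """Return [B_0, ..., B_n_max] with the convention B_1 = -1/2."""
    numbers = [Fraction(1)]
    for m in range(1, n_max + 1):
        # Recurrence: sum_{k=0}^{m} C(m+1, k) B_k = 0 for m >= 1
        s = sum(comb(m + 1, k) * numbers[k] for k in range(m))
        numbers.append(-s / (m + 1))
    return numbers

def series_coefficients(n_max):
    """Taylor coefficients B_m / m! of B(x) = x / (e^x - 1)."""
    return [b / factorial(m) for m, b in enumerate(bernoulli_numbers(n_max))]

coeffs = series_coefficients(6)
# B_+(x) is the even part of B(x): drop the odd powers.
even_part = [c if m % 2 == 0 else Fraction(0) for m, c in enumerate(coeffs)]
print("B(x)  :", coeffs)
print("B_+(x):", even_part)
```

Because the odd Bernoulli numbers beyond $B_1$ vanish, taking the even part is the same as adding $\frac{x}{2}$ to $B(x)$, in agreement with eq. (4).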
II) We would like $\widetilde{V}$ to be in WZ gauge
$$ \widetilde{V}~=~{\cal O}(\theta^2) .\tag{6} $$
For given $V$, $\widetilde{V}$, and $X-Y$, eqs. (3) and (6) together form an affine$^1$ equation in $X+Y=i\Omega^{\dagger}-i\Omega$. This formally has a solution if the operator
$$ B_+({\rm ad} V)~=~{\bf 1} + \ldots \tag{7} $$
is invertible, which is true, at least perturbatively. To finish the proof, one should write out the equation in its superfield components to check that the above affine shift mechanism really is realized at the component level. Recall e.g. that the gauge field $\widetilde{V}$ cannot be gauged away completely (= put to zero), since $\Omega$ is a chiral superfield with not enough $\theta$'s to reach all components of $\widetilde{V}$, so to speak.
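As a quick sanity check of the affine structure (a sketch that uses only eq. (1)): in the abelian case all commutators vanish, so ${\rm ad} V=0$ and $B_+({\rm ad} V)={\bf 1}$ exactly, and eq. (1) reduces to
$$ \widetilde{V}~=~V+X+Y~=~V+i(\Omega^{\dagger}-\Omega). $$
Here the gauge transformation is manifestly a pure shift of $V$ by $X+Y$, and eq. (6) is imposed by choosing $i(\Omega^{\dagger}-\Omega)$ to cancel the part of $V$ that is not ${\cal O}(\theta^2)$, to the extent that the components of the chiral superfield $\Omega$ allow. The non-abelian case differs only by the operator $B_+({\rm ad} V)$ and the commutator term in eq. (3).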
--
$^1$ An affine equation is a linear equation with an inhomogeneous term/source term.