[Physics] Casimir Operators and the Poincare Group

Tags: group-representations, lie-algebra, lorentz-symmetry, poincare-symmetry, special-relativity

I'm following along in Kaku's QFT text, where he introduces the Casimir operators (momentum squared and Pauli-Lubanski) and claims that the eigenvalues of these operators characterize the irreducible representations of the Poincare group. How exactly does this correspondence between eigenvalues and irreducible representations work?

Best Answer

Whew. This answer got super out of hand, so buckle in if you're not familiar with Lie groups, Lie algebras, and representation theory. I'll summarize the main points here, and then embark on a long and winding road to explain.

  1. Every Lie group $G$ corresponds to a Lie algebra $\frak g$ which is isomorphic to $T_eG$ - the tangent space to $G$ at the identity element.
  2. Given a linear representation $\rho$ which maps elements of $G$ to the endomorphisms (i.e. matrices, if $V$ is finite dimensional) over some vector space $V$, the Lie algebra $\frak g$ inherits a representation given by $d\rho$, the "differential map" of $\rho$.
  3. It's possible to construct a Casimir element $\Omega$ from a basis $X_i$ of $\frak g$ and a symmetric bilinear form on $\frak g$ called the Killing form. This Casimir element commutes with all elements of $\frak g$. Note that for a semisimple $\frak g$, the number of independent Casimir operators is equal to the rank of $\frak g$.
  4. We can apply our representation $d\rho$ of $\frak g$ to the Casimir element $\Omega$. The resulting matrix $d\rho(\Omega)$ commutes with all matrices $d\rho(X)$ for all $X\in \frak g$.
  5. Schur's Lemma tells us that if $\rho$ is irreducible, then $d\rho(\Omega) = \lambda_\rho \cdot \mathbb{1}$, where the number $\lambda_\rho$ depends only on the representation.
  6. Therefore, each irreducible representation carries a definite value of $\lambda_\rho$ for each Casimir element, and isomorphic representations carry the same values. Representations with different $\lambda$'s cannot be isomorphic, which makes the $\lambda$'s useful labels to distinguish meaningfully different representations.

Representations of Lie Groups

Given a Lie group $G$ and a vector space $V$, a representation $\rho : G \rightarrow GL(V)$ is a map which takes elements of $G$ to the space of invertible linear endomorphisms on $V$ subject to the constraint

$$\forall g_1,g_2\in G: \rho(g_1\cdot g_2) = \rho(g_1) \circ \rho(g_2) $$

where $\cdot$ denotes composition in the group, and $\circ$ denotes composition of endomorphisms.

If $V$ is a finite dimensional vector space of dimension $d$ over a field (say, $\mathbb{C}$), then its elements can be written as $d$-dimensional column vectors, and the endomorphisms on $V$ can be written as $d\times d$ matrices.

Given a representation $\rho$, it may be possible to find some subspace $W\subseteq V$ such that $\rho(g) w \in W$ for all $g\in G$ and $w\in W$ - such subspaces are called invariant subspaces with respect to $\rho$. It's easy to see that for every representation, $V$ itself is such a subspace, as is the space containing only the zero vector $\{0_V\}$ - these are called the trivial invariant subspaces.

If $V$ and $\{0_V\}$ are the only such subspaces (that is, if there are no non-trivial invariant subspaces w.r.t. $\rho$), and if $V$ itself is not just $\{0_V\}$, then we call $\rho$ an irreducible representation of $G$. In the finite dimensional case, a representation which splits into invariant pieces can be written as block diagonal matrices in a suitable basis, and the irreducible representations are the blocks which cannot be broken down any further.


Example 1: Consider the group $SO(2)$ - the 2D rotation group. Each group element corresponds to rotation in the plane by some angle $\theta\in[0,2\pi)$. We want to find a representation of $SO(2)$ over the vector space $\mathbb{R}^2$ (i.e. the 2D column vectors with real elements).

It's not difficult to show that $$\rho: SO(2) \rightarrow GL(\mathbb{R}^2)$$ $$ g_\theta \mapsto \pmatrix{\cos(\theta) & -\sin(\theta) \\ \sin(\theta) & \cos(\theta) }$$

constitutes a representation of $SO(2)$, and that it is irreducible.
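As a quick sanity check of Example 1, here is a minimal NumPy sketch. It assumes that composition in $SO(2)$ is addition of angles, and the names `rho`, `a` and `b` are my own. It checks the homomorphism property numerically and confirms that a generic rotation has no real eigenvector, which is why no line in $\mathbb{R}^2$ can be an invariant subspace.

```python
import numpy as np

def rho(theta: float) -> np.ndarray:
    """2x2 rotation matrix representing the SO(2) element g_theta."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

# Composition in SO(2) is addition of angles: g_a . g_b = g_{a+b}
a, b = 0.7, 2.1  # arbitrary sample angles
homomorphism_ok = np.allclose(rho(a + b), rho(a) @ rho(b))
identity_ok = np.allclose(rho(0.0), np.eye(2))

# A one-dimensional invariant subspace would be spanned by a real eigenvector.
# The eigenvalues of a generic rotation are complex, so there is none.
eigenvalues = np.linalg.eigvals(rho(a))
no_real_eigenvector = bool(np.all(np.abs(eigenvalues.imag) > 1e-12))

print(homomorphism_ok, identity_ok, no_real_eigenvector)
```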


Example 2: Everything is exactly the same, except now we consider the vector space $\mathbb{R}^3$, with the representation

$$\rho: SO(2) \rightarrow GL(\mathbb{R}^3)$$ $$ g_\theta \mapsto \pmatrix{1 & 0 & 0 \\ 0 & \cos(\theta) & -\sin(\theta) \\ 0 & \sin(\theta) & \cos(\theta) }$$

This also constitutes a representation of $SO(2)$, but notice that the subspace $W\subset \mathbb{R}^3$ defined by $$ W := \left\{ \pmatrix{0 \\ x \\ y} : x,y \in \mathbb{R}\right\}$$

is invariant under the action of $\rho(g)$ for all $g\in G$ - it therefore constitutes a non-trivial invariant subspace, and so $\rho$ is not an irreducible representation.
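The invariance of $W$ in Example 2 can also be checked numerically. This is only a sketch with names of my own choosing (`rho3`, `w`, `e1`): it applies $\rho(g_\theta)$ to a generic element of $W$ for a range of angles and confirms that the first component stays zero. It also checks that the $x$-axis is left fixed, so the representation splits into a $1\times 1$ block and a $2\times 2$ block.

```python
import numpy as np

def rho3(theta: float) -> np.ndarray:
    """3x3 matrix representing g_theta in Example 2."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[1.0, 0.0, 0.0],
                     [0.0, c, -s],
                     [0.0, s, c]])

rng = np.random.default_rng(0)
x, y = rng.normal(size=2)
w = np.array([0.0, x, y])  # a generic element of W

thetas = np.linspace(0.0, 2.0 * np.pi, 50, endpoint=False)

# rho(g) w must stay inside W, i.e. its first component must remain zero
stays_in_W = all(abs((rho3(t) @ w)[0]) < 1e-12 for t in thetas)

# The complementary subspace (the x-axis) is invariant as well
e1 = np.array([1.0, 0.0, 0.0])
x_axis_fixed = all(np.allclose(rho3(t) @ e1, e1) for t in thetas)

print(stays_in_W, x_axis_fixed)
```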


Lie Algebras <=> Lie Groups

Recall that given a manifold $M$, we can construct a tangent space $T_pM$ for each $p\in M$. The elements of such a space are called tangent vectors, which we can (somewhat naively) think of as tiny little arrows glued to the point $p$.

More specifically, given a smooth curve $\gamma:\mathbb R \rightarrow M$ such that $\gamma(0)=p\in M$ (for convenience), we can create an object called $X_{p,\gamma}$ which acts on smooth functions to produce real numbers in the following way:

$$ X_{p,\gamma} : C^\infty(M) \rightarrow \mathbb R $$ $$ f \mapsto X_{p,\gamma}f := (f \circ \gamma)'(0)$$

From $f$, we can define a differential map which takes tangent vectors to the real numbers: $$ df : T_p M \rightarrow \mathbb R$$ $$ X_p \mapsto df(X_p) := X_p f$$

This may look like notational redundancy, but note the difference - the former definition defines a map from the space of smooth functions on $M$ to the real numbers, while the latter defines a map from the space of tangent vectors (at a point $p\in M$) to the real numbers.

Here's the point. Given a Lie group $G$ (which is, in particular, a smooth manifold), we can construct the tangent space at the identity element, $T_eG$. Roughly speaking, the elements of $T_eG$ constitute a Lie algebra (note that Lie algebras, in principle, need not have anything whatsoever to do with Lie groups). Therefore, each Lie group $G$ corresponds to a Lie algebra $\frak g$, which is isomorphic to the tangent space $T_eG$.

Representations of Lie Algebras

From there, given a representation $\rho:G \rightarrow GL(V)$, we can construct a differential map

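$$d\rho : {\frak{g}} \rightarrow \mathrm{End}(V)$$

It follows that, given some representation $\rho$ of $G$, the associated Lie algebra $\frak g$ inherits a representation over the same vector space. Note that $d\rho$ lands in the space $\mathrm{End}(V)$ of all linear maps on $V$ rather than in $GL(V)$, since $d\rho(X)$ need not be invertible.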


Example 3a: Consider again the Lie group $SO(2)$. We can define a smooth curve on $SO(2)$ as follows:

$$\gamma : \mathbb{R} \rightarrow SO(2)$$ $$\theta \mapsto g_\theta$$

where $g_\theta$ represents the group element corresponding to rotation by an angle $\theta$.

Next, we define the tangent vector $X_{e,\gamma}$ to this curve at the identity element by its action on some general function $f\in C^\infty(SO(2))$: $$X_{e,\gamma} : C^\infty(SO(2)) \rightarrow \mathbb R$$ $$f \mapsto (f\circ \gamma)'(0)$$

For instance, consider the function $f: g_\omega \mapsto 3\sin(\omega)$, which is well defined and smooth on all of $SO(2)$. Then we would have that $$ X_{e,\gamma}f = (f\circ \gamma)'(0) = 3\cos(0) = 3$$

By considering all such curves, we construct all such tangent vectors, and so we have constructed the tangent space $T_eG$.
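To make the action of a tangent vector on a function concrete, here is a rough numerical sketch. It assumes the elements of $SO(2)$ are realised as $2\times 2$ rotation matrices, and the test function `f` (with $f(g_\theta) = 3\sin\theta$) and the helper `tangent_vector` are my own hypothetical choices. The derivative $(f\circ\gamma)'(0)$ is approximated by a central difference.

```python
import numpy as np

def gamma(theta: float) -> np.ndarray:
    """The curve theta -> g_theta, with g_theta realised as a rotation matrix."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

def f(g: np.ndarray) -> float:
    """Hypothetical smooth test function on SO(2): f(g_theta) = 3 sin(theta)."""
    return 3.0 * g[1, 0]

def tangent_vector(func, curve, h: float = 1e-6) -> float:
    """Central-difference approximation of (func o curve)'(0)."""
    return (func(curve(h)) - func(curve(-h))) / (2.0 * h)

X_f = tangent_vector(f, gamma)
print(X_f)  # close to 3
```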


Example 3b: Now consider the representation $\rho:SO(2) \rightarrow GL(\mathbb R^2)$ as discussed in Example 1 above. We can construct the differential map $$ d\rho : T_eG \rightarrow \mathrm{End}(\mathbb R^2)$$ $$ X_{e,\gamma} \mapsto X_{e,\gamma}\rho = (\rho \circ \gamma)'(0)$$

Notice that $$ \rho \circ \gamma : \mathbb R \rightarrow GL(\mathbb R^2)$$ $$ \theta \mapsto \pmatrix{\cos(\theta) & -\sin(\theta) \\ \sin(\theta) & \cos(\theta)}$$

and so

$$d\rho(X_{e,\gamma}) = (\rho \circ \gamma)'(0) = \pmatrix{0 & -1 \\ 1 & 0}$$

We therefore see that the Lie algebra ${\frak{so}}(2)$ inherits a representation $d\rho$ from the representation $\rho$ of the corresponding Lie group $SO(2)$.
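The derivative in Example 3b can be reproduced numerically. The sketch below uses my own function names and a central difference in place of the exact derivative. It differentiates $\rho\circ\gamma$ at $\theta=0$ and compares the result with the matrix found above.

```python
import numpy as np

def rho_of_gamma(theta: float) -> np.ndarray:
    """(rho o gamma)(theta): the rotation matrix for the angle theta."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

def d_rho(h: float = 1e-6) -> np.ndarray:
    """Central-difference approximation of (rho o gamma)'(0)."""
    return (rho_of_gamma(h) - rho_of_gamma(-h)) / (2.0 * h)

generator = d_rho()
expected = np.array([[0.0, -1.0],
                     [1.0, 0.0]])

matches = np.allclose(generator, expected, atol=1e-8)
print(generator)
print(matches)
```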


Lie Algebras and the Killing Form

Lie Algebras are called algebras because, in addition to a vector space structure, they are equipped with an antisymmetric, bilinear bracket (sometimes referred to as a commutator bracket) $$[\cdot,\cdot] : {{\frak g}} \times {{\frak g}} \rightarrow {{\frak g}}$$ such that $$[X,Y] = -[Y,X]$$ and $$[X,[Y,Z]]+[Z,[X,Y]]+[Y,[Z,X]] = 0_{{\frak g}}$$

Typically one defines this bracket by the correspondence between the tangent space $T_eG$ and the space of left-invariant vector fields on $G$. I won't get into this - one can imagine (perhaps perversely) that the Lie algebra inherits this bracket from the commutators of (one of) its faithful matrix representations (though this is obviously bass-ackwards).

Anyway, ${{\frak g}}$ is a vector space, so we can always find a nice basis set for it - we'll call this basis $X_i$, $i=1,\ldots,\dim({\frak g})$. We can write the commutation relations of the basis elements as

$$[X_i,X_j] = C^k_{\ ij} X_k$$

where the $C$'s are called the structure coefficients of $\frak g$.

Given any symmetric, nondegenerate bilinear form

$$ B:{{\frak g}} \times {{\frak g}} \rightarrow \mathbb C$$

we can identify $\frak g$ with its dual space ${\frak g}^*$, whose elements map elements of $\frak g$ to complex numbers. Given some $X \in \frak g$, we define the dual vector $\omega_X :{\frak g} \rightarrow \mathbb C$ by its action on any $Y\in\frak g$ as follows:

$$ \omega_X (Y) := B(X,Y)$$

Notice that this is precisely the prescription we use to define covectors from vectors in relativity, where the symmetric, nondegenerate form is chosen to be the metric tensor, and we (somewhat sloppily) refer to the mapping between vectors and covectors as "raising/lowering indices."

Of course, there is no clear notion of a metric tensor here, so in its place one typically chooses the Killing Form (which is nondegenerate precisely when $\frak g$ is semisimple)

$$K:{\frak g}\times{\frak g} \rightarrow \mathbb{C}$$ $$(X,Y) \mapsto Tr\big[ ad_X \circ ad_Y\big]$$

where $ad_X (Y) = [X,Y]$ is the adjoint map to $X$.

Consider a general Lie algebra $\frak g$ with commutation relations $$ [X_i,X_j]=C^k_{\ i j}X_k$$

Let $A = a^i X_i$ and $B=b^i X_i$ be elements of $\frak g$. Then because of its bilinearity, we have

$$ K(A,B) = a^i b^j K(X_i,X_j) \equiv a^i b^j K_{ij}$$ The components $K_{ij}$ can be found as follows. Let $ad_{X_i} \circ ad_{X_j}$ act on basis element $X_\alpha$:

$$\big(ad_{X_i} \circ ad_{X_j}\big)X_\alpha = ad_{X_i}\big([X_j,X_\alpha]\big) = ad_{X_i}\big(C^k_{\ j \alpha} X_k\big) = C^k_{\ j \alpha}[X_i,X_k] = C^k_{\ j \alpha} C^{\beta}_{\ i k} X_\beta$$

To compute the trace, we set $\beta=\alpha$ and sum:

$$K_{ij} = C^k_{\ j \alpha} C^\alpha_{\ i k} $$


Example 4: The Lie algebra $\frak{so}(3)$ has a canonical basis $X_i$ which has commutation relations

$$[X_i,X_j] = \epsilon^k_{\ ij}X_k$$

The components of the Killing form with respect to this basis are

$$K_{ij} = \epsilon_{\beta j \alpha}\epsilon_{\alpha i \beta} = -\epsilon_{j \alpha \beta} \epsilon_{i \alpha \beta} = -2\delta_{ij}$$
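The index gymnastics in Example 4 are easy to check by machine. In the sketch below (array layout and names are my own), the structure coefficients $C^k_{\ ij}$ are stored as `C[k, i, j]`, the formula $K_{ij} = C^k_{\ j\alpha}C^\alpha_{\ ik}$ is evaluated with `einsum`, and the result is cross-checked against the definition $Tr\big[ad_{X_i}\circ ad_{X_j}\big]$.

```python
import numpy as np

def levi_civita() -> np.ndarray:
    """Totally antisymmetric symbol epsilon_{ijk} (indices 0, 1, 2)."""
    eps = np.zeros((3, 3, 3))
    for i, j, k in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
        eps[i, j, k] = 1.0   # even permutations
        eps[i, k, j] = -1.0  # odd permutations
    return eps

# C[k, i, j] stores C^k_{ij}, defined by [X_i, X_j] = C^k_{ij} X_k
C = levi_civita()

# K_ij = C^k_{j alpha} C^alpha_{i k}
K = np.einsum('kja,aik->ij', C, C)

# Cross-check: (ad_{X_i})^beta_alpha = C^beta_{i alpha}, K_ij = Tr[ad_i ad_j]
ad = np.transpose(C, (1, 0, 2))  # ad[i] is the matrix of ad_{X_i}
K_from_trace = np.einsum('iab,jba->ij', ad, ad)

print(K)
```

Both computations give $-2\delta_{ij}$, in agreement with the result above.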


The Killing form has another interesting property - it is adjoint-invariant, by which I mean that for all $Z\in \frak g$,

$$K(ad_Z X,Y) + K(X,ad_Z Y) = 0$$

You can demonstrate this to yourself from the definition of $K$.
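In case it's helpful, here is one way the demonstration can go. The Jacobi identity is equivalent to the statement $ad_{[Z,X]} = ad_Z\circ ad_X - ad_X \circ ad_Z$, and so

$$K(ad_Z X,Y) = Tr\big[ad_Z\circ ad_X\circ ad_Y\big] - Tr\big[ad_X\circ ad_Z\circ ad_Y\big]$$ $$K(X,ad_Z Y) = Tr\big[ad_X\circ ad_Z\circ ad_Y\big] - Tr\big[ad_X\circ ad_Y\circ ad_Z\big]$$

Adding the two lines, the middle terms cancel, and what remains is

$$Tr\big[ad_Z\circ ad_X\circ ad_Y\big] - Tr\big[ad_X\circ ad_Y\circ ad_Z\big] = 0$$

by the cyclic property of the trace.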

Dual Bases and Casimir Elements

As stated above, we can use the Killing form to define a dual space to $\frak g$. Such a space has canonical basis elements $\omega^i$ defined such that

$$\omega^i(X_j) = \delta^i_j$$

That is, given the set $\{X_i\}$, there exists another set $\{Y_i=\lambda^m_i X_m\}$ such that

$$K(Y_i,X_j)= \lambda^m_i K_{mj} = \delta_{ij}$$

We then define $$\omega^i := K(Y_i,\bullet)$$


Example 5: We found in Example 4 that the Killing form on our chosen basis of ${\frak so}(3)$ has components $K_{ij}=-2\delta_{ij}$. It therefore follows that the dual basis to $X_i$ with respect to $K$ is

$$\omega^i = K(Y_i,\bullet)$$ where $$Y_i = -\frac{1}{2}X_i$$


Once we have our dual basis constructed, we can define the quadratic Casimir element:

$$\Omega := \sum_i X_i \omega^i$$

We have defined $\Omega$ in terms of a basis and an adjoint-invariant form $K$ - it is a useful exercise to demonstrate that $\Omega$ is independent of our choice of basis, but does depend on our choice of $K$.

$\Omega$ has a very special property - it commutes with every element of $\frak g$, in the following sense: Given any basis elements $X_j,X_k$,

$$[\Omega,X_j]X_k = \Omega X_j X_k - X_j \Omega X_k = X_i\omega^i X_j X_k - X_j X_i\omega^i X_k = X_jX_k - X_j X_k = 0 $$
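The manipulation above treats $\omega^i$ rather loosely. A more careful version of the argument (a sketch, assuming we identify $\omega^i$ with $Y_i$ and write $\Omega = \sum_i X_i Y_i$ as an element of the universal enveloping algebra) goes as follows. For any $Z\in\frak g$, expand

$$[Z,X_i] = \sum_j a_{ij}X_j \qquad [Z,Y_i] = \sum_j b_{ij} Y_j$$

Because $K(Y_i,X_j)=\delta_{ij}$, the coefficients are $a_{ij} = K([Z,X_i],Y_j)$ and $b_{ij} = K([Z,Y_i],X_j)$. The adjoint-invariance and symmetry of $K$ then give

$$a_{ij} = K([Z,X_i],Y_j) = -K(X_i,[Z,Y_j]) = -b_{ji}$$

and therefore

$$[Z,\Omega] = \sum_i \big([Z,X_i]Y_i + X_i[Z,Y_i]\big) = \sum_{i,j}\big(a_{ij}X_jY_i + b_{ij}X_iY_j\big) = \sum_{i,j}\big(a_{ij}+b_{ji}\big)X_jY_i = 0$$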

Representation of a Casimir Element

Given a representation $\rho:G \rightarrow GL(V)$, we constructed an inherited representation $d\rho:{\frak g} \rightarrow \mathrm{End}(V)$.

We can extend this notion to a representation of the dual basis $\omega^i$. Recall that $\omega^i(X_j)=\delta^i_j$. We define the representation of the dual basis such that

$$d\rho(\omega^i) =d\rho(Y_i)$$ where $$ \omega^i = K(Y_i,\bullet)$$

We then define the Casimir invariant corresponding to $\rho$ to be

$$d\rho(\Omega):= d\rho(X_i)\circ d\rho(\omega^i)=d\rho(X_i)\circ d\rho(Y_i)$$

This is a crucial point. A Casimir element $\Omega$ is constructed from the Lie algebra $\frak g$ and the Killing form over $\frak g$. It has nothing to do with any representation at all, and is an element of the Universal Enveloping Algebra $U({\frak g})$.

However, given a representation $\rho$ of $G$ and the corresponding representation $d\rho$ of $\frak g$, we can construct a Casimir invariant $d\rho(\Omega)$ which is an element of $\mathrm{End}(V)$. This object is representation dependent.


Example 6: A canonical representation of the Lie algebra ${\frak{so}}(3)$ over $\mathbb R^3$ is $$\rho(X_1) = \pmatrix{0&0&0\\0&0&-1\\0&1&0}, \rho(X_2) = \pmatrix{0&0&1\\0&0&0\\-1&0&0}, \rho(X_3) = \pmatrix{0&-1&0\\1&0&0\\0&0&0} $$

The corresponding dual basis is such that $Y_i = -\frac{1}{2}X_i$. Therefore,

$$\rho(X_1)\circ\rho(Y_1) = -\frac{1}{2}\pmatrix{0&0&0\\0&-1&0\\0&0&-1}$$

Continuing from here, it follows that the Casimir invariant corresponding to $\rho$ is

$$\rho(\Omega) = \pmatrix{1&0&0\\0&1&0\\0&0&1} = \mathbb{1}_V$$
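Example 6 is simple enough to verify directly. The sketch below (variable names are mine) builds the three matrices, checks one of the commutation relations, forms $\sum_i \rho(X_i)\circ\rho(Y_i)$ with $Y_i = -\frac{1}{2}X_i$, and confirms that the result is the identity and commutes with every generator.

```python
import numpy as np

# rho(X_1), rho(X_2), rho(X_3) from Example 6
X = np.array([
    [[0, 0, 0], [0, 0, -1], [0, 1, 0]],
    [[0, 0, 1], [0, 0, 0], [-1, 0, 0]],
    [[0, -1, 0], [1, 0, 0], [0, 0, 0]],
], dtype=float)

# Sanity check of the algebra: [X_1, X_2] = X_3
commutator_ok = np.allclose(X[0] @ X[1] - X[1] @ X[0], X[2])

# Dual basis with respect to the Killing form K_ij = -2 delta_ij
Y = -0.5 * X

# Casimir invariant rho(Omega) = sum_i rho(X_i) rho(Y_i)
casimir = sum(X[i] @ Y[i] for i in range(3))

is_identity = np.allclose(casimir, np.eye(3))
commutes = all(np.allclose(casimir @ X[i], X[i] @ casimir) for i in range(3))

print(casimir)
```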


End of the Story: Schur's Lemma

In Example 6, we found that the Casimir invariant $\rho(\Omega)$ was simply the identity matrix $\mathbb 1$. This is no accident. Schur's Lemma, applied to a Casimir element, goes as follows:

Let $\rho:{\frak g}\rightarrow \mathrm{End}(V)$ be a linear representation of $\frak g$ over a complex vector space $V$. If $\rho$ is irreducible and $\Omega$ is a Casimir element in $U({\frak g})$, then $\rho(\Omega) = \lambda_\rho \cdot \mathbb{1}\in \mathrm{End}(V)$ for some number $\lambda_\rho$.

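This number $\lambda$ can be computed explicitly by taking the trace of both sides, which gives $\lambda_\rho \cdot dim(V) = \sum_i Tr\big[\rho(X_i)\circ\rho(Y_i)\big]$. If the form used to build $\Omega$ coincides with the trace form $(X,Y)\mapsto Tr\big[\rho(X)\circ\rho(Y)\big]$ of the representation, then every term in the sum is equal to 1, and it turns out that

$$\lambda_\rho = \frac{dim({\frak g})}{dim(V)}$$

This is the situation in our example: the representation in Example 6 is the adjoint representation, whose trace form is the Killing form itself, and $dim({\frak g})=dim(V) = 3$. For other representations (or other choices of form), $\lambda_\rho$ picks up an additional representation-dependent factor.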
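Here is a small numerical check of the eigenvalue, obtained from $\lambda_\rho = Tr\big[\rho(\Omega)\big]/dim(V)$. Beyond Example 6, the sketch brings in a second representation which is my own addition and not part of the discussion above: the two-dimensional representation $X_i = -\frac{i}{2}\sigma_i$ built from the Pauli matrices. The helper `casimir_eigenvalue` is also a hypothetical name. The comparison suggests that the ratio $dim({\frak g})/dim(V)$ appears when $\Omega$ is built from the trace form of the representation itself, which for Example 6 happens to be the Killing form.

```python
import numpy as np

def casimir_eigenvalue(X: np.ndarray, form: np.ndarray) -> float:
    """lambda = Tr[rho(Omega)] / dim V, with Omega = sum_ij X_i (form^-1)_ij X_j."""
    form_inv = np.linalg.inv(form)
    omega = np.einsum('ij,iab,jbc->ac', form_inv, X, X)
    return float(np.real(np.trace(omega))) / X.shape[1]

# Representation of Example 6 (this is the adjoint representation of so(3))
X_adj = np.array([
    [[0, 0, 0], [0, 0, -1], [0, 1, 0]],
    [[0, 0, 1], [0, 0, 0], [-1, 0, 0]],
    [[0, -1, 0], [1, 0, 0], [0, 0, 0]],
], dtype=complex)

# Assumed extra example: X_i = -(i/2) sigma_i obeys [X_i, X_j] = eps_ijk X_k
sigma = np.array([
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
], dtype=complex)
X_half = -0.5j * sigma

killing = -2.0 * np.eye(3)  # K_ij from Example 4
trace_form_half = np.real(np.einsum('iab,jba->ij', X_half, X_half))

lam_adj = casimir_eigenvalue(X_adj, killing)              # 3/3
lam_half_killing = casimir_eigenvalue(X_half, killing)    # 3/8
lam_half_trace = casimir_eigenvalue(X_half, trace_form_half)  # 3/2

print(lam_adj, lam_half_killing, lam_half_trace)
```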