Let $\mathfrak{g}$ be a Lie algebra, and let $G$ be the (unique up to isomorphism) simply-connected group with Lie algebra $\mathfrak{g}$. Here, $\mathfrak{g}$ and $G$ should be thought of as "abstract", with the exponential map $\exp : \mathfrak{g} \rightarrow G$ defined in terms of the one-parameter subgroups of $G$. The BCH formula holds for this "abstract" exponential map, although the completely general proof is a bit lengthy: it involves deriving a formula for the derivative of the exponential map, then proving $\exp \big( [X,\cdot] \big) Y = \exp (X) \, Y \exp (-X)$, and finally deducing the BCH formula. A concise presentation can be found, e.g., in chapter 3 of Brian Hall's "Lie Groups, Lie Algebras, and Representations". This reference aims at being fairly accessible, at the price of restricting itself to those Lie groups $G$ that can be constructed as matrix groups: this covers most groups used in practice (in particular, the Heisenberg group generated by the position/momentum operators of quantum mechanics, together with the identity operator, can be realized as a group of $3 \times 3$ real matrices).
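For the Heisenberg example, the BCH series actually terminates (all brackets of depth $\geq 2$ vanish), so one can check the formula concretely with strictly upper-triangular $3 \times 3$ matrices. Here is a minimal numerical sketch; the particular generators chosen are just one convenient illustration, not canonical:

```python
import numpy as np

# Two generators of the Heisenberg Lie algebra as strictly
# upper-triangular 3x3 matrices (illustrative choice).
A = np.array([[0., 1., 0.],
              [0., 0., 0.],
              [0., 0., 0.]])
B = np.array([[0., 0., 0.],
              [0., 0., 1.],
              [0., 0., 0.]])

def expm_nilpotent(N):
    # exp(N) for N with N^3 = 0: the exponential series terminates.
    return np.eye(3) + N + N @ N / 2.0

def logm_unipotent(U):
    # log(I + M) for M with M^3 = 0: the log series terminates.
    M = U - np.eye(3)
    return M - M @ M / 2.0

comm = A @ B - B @ A                  # [A, B] (central, so BCH truncates)
lhs = logm_unipotent(expm_nilpotent(A) @ expm_nilpotent(B))
rhs = A + B + comm / 2.0              # BCH: log(e^A e^B) = A + B + [A,B]/2
print(np.allclose(lhs, rhs))          # True
```

Since $[A,[A,B]] = [B,[A,B]] = 0$ here, the degree-$2$ truncation of BCH is exact.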
Next, let $X \mapsto i\hat{X}$ be a representation of this Lie algebra by unbounded symmetric operators on a complex Hilbert space $\mathcal{H}$ (note that we do not even need to assume that $\hat{X}$ be essentially self-adjoint, just symmetric).
Since we are dealing with unbounded operators, we have to be a bit careful about what we mean by a representation, because a priori $\hat{X}\hat{Y}$, $\hat{Y}\hat{X}$, and therefore $[\hat{X}, \hat{Y}]$, may fail to be well-defined. Specifically, we demand that there exist a common invariant dense domain $\mathcal{D} \subseteq \mathcal{H}$ such that, for all $X \in \mathfrak{g}$, $\hat{X}$ is defined on $\mathcal{D}$ and stabilizes $\mathcal{D}$ (i.e. $\hat{X} (\mathcal{D}) \subseteq \mathcal{D}$). Then, we can define arbitrary products of the $\hat{X}$'s on $\mathcal{D}$, and in particular commutators.
The question is then whether this representation can be exponentiated, i.e. whether:
for all $X \in \mathfrak{g}$, $\hat{X}$ is actually essentially self-adjoint;
and there exists a unitary representation $U \mapsto \hat{U}$ of the group $G$ on $\mathcal{H}$ such that, for any $X \in \mathfrak{g}$, $\widehat{\exp X} = \widehat{\exp} (i\hat{X})$. Here, I denote by $\widehat{\exp}$ the "operator" exponential map, defined via the spectral resolution of the (closure of the) essentially self-adjoint operator $\hat{X}$: as you observed, in the case of unbounded operators, the exponential cannot be defined in terms of the exponential series (the notation $\widehat{\exp}$ is non-standard; I use it here only to prevent confusion between the various notions of exponential maps).
If this holds, the BCH formula satisfied by the "abstract" exponential map $\exp$ will be inherited by the "operator" exponential map $\widehat{\exp}$.
So, with all these preliminaries in place: when can a Lie algebra representation by symmetric unbounded operators be exponentiated? A sufficient condition (useful in practice, albeit not a necessary condition) is the Nelson criterion (lemma 9.1 of Edward Nelson, "Analytic Vectors", Annals of Mathematics, Second Series, Vol. 70, No. 3 (Nov., 1959), pp. 572-615): it demands that there exist a basis $X_1,\dots,X_n$ of $\mathfrak{g}$, a dense subdomain $\mathcal{D}_o \subseteq \mathcal{D}$, and a real $s > 0$, such that:
$$
\forall \psi \in \mathcal{D}_o,\; \sum_{m=0}^{\infty} \frac{s^m}{m!} \sum_{k_1,\dots,k_m=1}^{n} \left\| \hat{X}_{k_1} \cdots \hat{X}_{k_m} \psi \right\| < \infty.
$$
This is a fairly technical result, but the intuition behind it is that this condition is precisely what you need to define $\widehat{\exp} (i\hat{X})$ over $\mathcal{D}_o$ via the exponential series; from there, one deduces that $\hat{X}$ is indeed essentially self-adjoint (using Stone's theorem, discussed below), that this definition of $\widehat{\exp} (i\hat{X})$ coincides over $\mathcal{D}_o$ with the spectral one, and finally, that this indeed gives you a unitary representation of $G$ (matching the "abstract" BCH formula with the one that can be proven directly using the exponential series).
In the case of the position/momentum operators of quantum mechanics, one can for example take $X_1 = \text{id}$, $X_2 = q$, $X_3 = p$ and take $\mathcal{D} = \mathcal{D}_o$ to be the space of finite linear combinations of the harmonic oscillator energy eigenstates. Then, the condition can be proven using the expression of the position/momentum operators in terms of ladder operators.
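As a sanity check on the Nelson condition in this example, one can evaluate the partial sums of the series numerically in a truncated Fock basis. The truncation is exact here, because $m$ applications of $\mathrm{id}, q, p$ to the ground state never leave the first $m+1$ levels; the value $s = 0.1$ and the cutoff are arbitrary choices for this sketch:

```python
import math
import numpy as np

mmax = 8                                   # number of series terms checked
n = mmax + 2                               # truncation: exact for <= mmax ops
a = np.diag(np.sqrt(np.arange(1, n)), 1)   # lowering operator in Fock basis
q = (a + a.T) / np.sqrt(2)                 # position
p = (a - a.T) / (1j * np.sqrt(2))          # momentum
ops = [np.eye(n, dtype=complex), q.astype(complex), p]  # X_1, X_2, X_3

psi0 = np.zeros(n, dtype=complex)
psi0[0] = 1.0                              # oscillator ground state
s = 0.1                                    # arbitrary s > 0

terms, vecs = [], [psi0]                   # vecs: all X_{k_1}...X_{k_m} psi0
for m in range(mmax + 1):
    terms.append(s**m / math.factorial(m)
                 * sum(np.linalg.norm(v) for v in vecs))
    vecs = [op @ v for v in vecs for op in ops]

# the terms of the Nelson series decrease, consistent with convergence
print(all(t2 < t1 for t1, t2 in zip(terms, terms[1:])))  # True
```

The rapid decay reflects the estimate $\|X_{k_1} \cdots X_{k_m} \psi_0\| \lesssim c^m \sqrt{m!}$ obtained from the ladder-operator expressions, which makes the series converge for every $s$.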
However, coming back to your specific motivation, I honestly do not think that you need all this machinery. Instead, you can use the explicit expression of the Weyl operators to prove that $t \mapsto W(tz)$ is a strongly continuous one-parameter unitary group: i.e. $W(sz) W(tz) = W\big((s+t)z\big)$ and, for any $\psi \in \mathcal{H}$, $t \mapsto W(tz) \psi$ is continuous (with respect to the norm of $\mathcal{H}$, in this case the $L_2$-norm). Then, Stone's theorem (theorem VIII.8 of Reed and Simon, "Methods of Modern Mathematical Physics", volume 1) tells you that $W(tz) = \widehat{\exp} (it\hat{X})$ with $\hat{X}$ the self-adjoint operator defined by:
$$i \hat{X} \psi = \left. \frac{d}{dt} W(tz) \psi \right|_{t=0}$$
for any $\psi$ in the dense domain $\mathcal{D}$ where this derivative exists. Using again the explicit expression of $W(tz)$ you can then check that $\hat{X} = \sqrt{2} (y\hat{q} - x\hat{p})$.
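This can be illustrated numerically in a truncated Fock basis, where everything is bounded, so the sketch is only a caricature of the unbounded situation. It assumes the convention $W(z) = e^{z a^\dagger - \bar z a}$ (which gives exactly the generator $\hat{X} = \sqrt{2}(y\hat{q} - x\hat{p})$ for $z = x + iy$); the values of $x, y, s, t$ below are arbitrary:

```python
import numpy as np
from scipy.linalg import expm

n = 12                                    # truncated Fock-space dimension
a = np.diag(np.sqrt(np.arange(1, n)), 1)  # lowering operator
q = (a + a.T) / np.sqrt(2)                # position
p = (a - a.T) / (1j * np.sqrt(2))         # momentum
x, y = 0.3, -0.7                          # z = x + iy, arbitrary
X = np.sqrt(2) * (y * q - x * p)          # candidate generator

def W(t):                                 # one-parameter unitary group
    return expm(1j * t * X)

# group law: W(s) W(t) = W(s + t)
s, t = 0.4, 1.1
assert np.allclose(W(s) @ W(t), W(s + t))

# Stone's theorem: d/dt W(t) psi |_{t=0} = i X psi (central difference)
psi = np.zeros(n, dtype=complex)
psi[0] = 1.0                              # ground state
h = 1e-5
deriv = (W(h) @ psi - W(-h) @ psi) / (2 * h)
print(np.allclose(deriv, 1j * X @ psi, atol=1e-8))  # True
```

The finite-difference derivative of $t \mapsto W(t)\psi$ at $t = 0$ recovers $i\hat{X}\psi$, exactly as Stone's theorem prescribes.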
This answer only tries to give a pictorial view of the situation, in the hope that things already become clear from the picture. The reference cited at the end, Free Lie Algebras, is the better answer: it is a structural answer, and the structure is beautiful, so just switch to the reference and enjoy!
First, in this pictorial view, a Lie polynomial in the alphabet with letters $A,B,C,D,\dots$ is a homogeneous polynomial in the non-commutative algebra (over $\Bbb Q$) generated by the monoid generated by these letters, one that can be obtained in the following way.
First fix some letters from the alphabet (with possible repetitions) and some order, and put them in a row. For instance:
A B A C A D B A
Now decide to build a "special" tree with these nodes as leaves, going "down": decide which two neighboring letters should be Lie-condensed first, then use the result as a "new letter", and go on recursively. One picture may be:
```
A   B   A   C   A   D   B   A
 \    \ /      \    \ /   /   /
  \    *        \    *   /   /
   \  /          \  /   /   /
    *             *    /   /
     \             \  /   /
      \             *    /
       \             \  /
        \             *
         \           /
          \         /
           \       /
            \     /
             \   /
              \ /
               *
          FINAL RESULT
```

That is, this tree computes $\big[\,[A,[B,A]]\,,\,[[[C,[A,D]],B],A]\,\big]$.
Each `*` means: take the two joined nodes and apply $[\,\cdot\,,\,\cdot\,]$ to them. I hope it is clear.
Now observe that
$$
\begin{aligned}
Z
&= F(A,B)
\\
&=\log(e^Ae^B)
\\
&=\log\left(\
\left(1+\frac 1{1!}A+\frac 1{2!}A^2+\dots\right)
\left(1+\frac 1{1!}B+\frac 1{2!}B^2+\dots\right)
\ \right)
\\
&=\log\left(\
1+\sum_{(j,k)\ne (0,0)}
\frac 1{j!k!}A^jB^k
\ \right)
\\
&=
0+\underbrace{(A+B)}_{F_1(A,B)}+\dots
\end{aligned}
$$
has its $F_1$-part equal to $A+B$, a Lie polynomial; the higher homogeneous pieces are what remains to be attacked.
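The next homogeneous piece can be extracted mechanically by truncating the series. Here is a small sympy sketch (the truncation helpers are ad hoc, written just for this illustration) that recovers $F_2(A,B) = \tfrac12 [A,B]$:

```python
import sympy as sp

A, B = sp.symbols('A B', commutative=False)
N = 2                                   # truncation degree

def deg(term):
    # total degree of a noncommutative monomial in A, B
    d = 0
    for f in sp.Mul.make_args(term):
        if f in (A, B):
            d += 1
        elif f.is_Pow and f.base in (A, B):
            d += int(f.exp)
    return d

def truncate(expr):
    # drop all monomials of degree > N
    expr = sp.expand(expr)
    return sp.Add(*[t for t in sp.Add.make_args(expr) if deg(t) <= N])

expA = sum(A**j / sp.factorial(j) for j in range(N + 1))
expB = sum(B**j / sp.factorial(j) for j in range(N + 1))
M = truncate(sp.expand(expA * expB)) - 1
# log(1 + M) = M - M^2/2 + ..., truncated at degree N
logS = sum(sp.Rational((-1)**(k + 1), k) * truncate(M**k)
           for k in range(1, N + 1))
F2 = sp.expand(truncate(logS) - (A + B))      # strip the F_1-part
print(sp.expand(F2 - (A*B - B*A) / 2) == 0)   # True: F_2 = [A,B]/2
```

So the degree-2 piece is indeed the Lie polynomial $\tfrac12 (AB - BA) = \tfrac12 [A,B]$.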
Back to the question. Why is $F_i\left(A,\sum_j F_j(B,C)\right)$ inductively a Lie polynomial (for $i,j>1$)? Use new letters $D_j$ instead of $F_j(B,C)$ if this makes things simpler, and let us draw the picture of $F_i(A, \sum_j D_j)$. It is a linear combination of many terms built by tree-collapsing rules as above. Now push each sum $\sum_j D_j$ down its branch of the tree from $F_i$, until it hits a `*`, i.e. until it is involved in building a Lie bracket. The bracket is linear, so we split the sum $\sum_j D_j$ into pieces and work with an individual $D_j$.
If this $D_j$ is itself (inductively) given by such Lie-bracket tree-collapsing rules, then we are fine: formally, we "move the rule to the top".
We only have problems with $F_1$, which is not really in the range of the Lie bracket tree collapsing rules. I cannot say more.
(I could not figure out which / where the problem with the "exceptional polynomials" is, since, working only with the homogeneous part of degree $(n+1)$, for instance for $i=1$, $j=n$, and conversely, we have
$$
\begin{aligned}
F_1(F_n(A,B),C)
&=F_n(A,B)+C\ ,
\\
F_n(F_1(A,B),C)
&=F_n(A+B,C)\ ,
\end{aligned}
$$
and of course, now we have to start the proof.)
Let me now say a few words about the hidden structure; it is a wonderful structure, enjoy it!
In the book Free Lie Algebras by Christophe Reutenauer (referenced fully on the Free Lie Algebras wiki page), the author quickly introduces a structure of a Hopf algebra on $\Bbb Q\langle\langle A, B,\dots\rangle\rangle$, the free algebra on the monoid generated by the letters $A,B,\dots$. One multiplication is the usual one; another one is given by the shuffle product, for instance $A \sqcup\!\!\!\sqcup B = AB + BA$. There are two corresponding comultiplications, and using these constructions one can state structural properties; in particular, the Lie polynomials turn out to be exactly the primitive elements. Among these properties, the ones relevant for the present question are the following.
Theorem 1.4 in the book (not in the linked pdf) characterizes Lie polynomials in the following way; for a polynomial $P$, the following are equivalent:
- $P$ is a Lie polynomial,
- Define $ad(P)$ by $ad(P)(Q)=[P,Q]=PQ-QP$. Now take this definition only on the generators $A,B,\dots$ of the alphabet, so $Ad(A):=ad(A)=[A,-]$, and extend it to an algebra morphism, so for instance $Ad(AB)=Ad(A)\,Ad(B)=ad(A)\,ad(B)$. The equivalent condition on $P$ is then $ad(P)=Ad(P)$.
- $P$ is primitive, a structural property in a Hopf algebra.
- $P$ has no constant term, and the derivative of $P$ coincides with the "right bracketing" of $P$.
Theorem 3.1 in the book is a version of the above for Lie series.
Lemma 1.7 in the book: let $\alpha$ be the antipode, mapping a word $w$ to $\pm$ the reversed word, the sign being given by the parity of the length. Then for a Lie polynomial $P$ we have $\alpha(P)=-P$.
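Lemma 1.7 is easy to test by hand on small examples. The following sympy sketch implements the antipode on words (the helper is ad hoc, for illustration only) and checks it on the Lie polynomial $[A,[A,B]]$:

```python
import sympy as sp

A, B = sp.symbols('A B', commutative=False)

def antipode(expr):
    # alpha(w) = (-1)^{|w|} * (reversed word), extended linearly
    out = 0
    for term in sp.Add.make_args(sp.expand(expr)):
        coeff, word, n = sp.S.One, [], 0
        for f in sp.Mul.make_args(term):
            if f in (A, B):
                word.append(f); n += 1
            elif f.is_Pow and f.base in (A, B):
                word += [f.base] * int(f.exp); n += int(f.exp)
            else:
                coeff *= f                 # numeric coefficient
        out += coeff * (-1)**n * sp.Mul(*reversed(word))
    return sp.expand(out)

P = sp.expand(A*(A*B - B*A) - (A*B - B*A)*A)   # [A,[A,B]]
print(sp.expand(antipode(P) + P) == 0)          # True: alpha(P) = -P
```

Here $P = A^2 B - 2ABA + BA^2$; reversing each word and flipping the sign (words of odd length $3$) gives $-P$, as the lemma predicts.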
Theorem 3.2 in the book: let $S=1+\dots$ be a series (higher terms omitted); then the following are equivalent:
- $\log(S)$ is a Lie series,
- $S$ is group-like, i.e. $\delta(S)=S\otimes S$,
- the map $w\to (S,w)$ is a homomorphism from the shuffle algebra to $\Bbb Q$,
- $Ad(S)(T)=STS^{-1}$.
Corollary 3.3: the series $S=1+\dots$ such that $\log S$ is a Lie series form a group under multiplication; this is because of the group-like property.
Corollary 3.4: $\log(e^Ae^B)$ is a Lie series. This follows from the stability under multiplication above, noting that $\log e^A=A$ and $\log e^B=B$ are Lie series.
Best Answer
Start with $f(\lambda):=e^{\lambda A}Be^{-\lambda A}$ but take derivatives at $\lambda=0$: $$\begin{align} f(0) &= B \\ f'(0) &= \left( e^{\lambda A}ABe^{-\lambda A} + e^{\lambda A}B(-A)e^{-\lambda A} \right)_{\lambda=0} = \left. e^{\lambda A}[A,B]e^{-\lambda A}\right|_{\lambda=0} = [A,B] \\ f''(0) &= \left( e^{\lambda A}A[A,B]e^{-\lambda A} + e^{\lambda A}[A,B](-A)e^{-\lambda A} \right)_{\lambda=0} = \left. e^{\lambda A}[A,[A,B]]e^{-\lambda A}\right|_{\lambda=0} = [A,[A,B]] \\ \vdots\\ f^{(k)}(0) &= \underbrace{[A,[A,\cdots,[A,B]\cdots]]}_{[A,\cdot]\text{ applied $k$ times}} = [A,\cdot]^k B \end{align}$$ The last identity can be proved by induction.
This gives $$ f(\lambda) = \sum_{k=0}^{\infty} \frac{1}{k!}\lambda^k f^{(k)}(0) = \sum_{k=0}^{\infty} \frac{1}{k!}\lambda^k [A,\cdot]^k B = e^{\lambda[A,\cdot]} B . $$
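The identity $e^{\lambda A} B e^{-\lambda A} = e^{\lambda [A,\cdot]} B$ is easy to check numerically for matrices; here is a short sketch with random $4 \times 4$ matrices, summing the ad-exponential until the terms become negligible:

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
B = rng.standard_normal((4, 4))
lam = 0.7

ad = lambda X, Y: X @ Y - Y @ X              # [X, Y]
lhs = expm(lam * A) @ B @ expm(-lam * A)     # conjugation side

# e^{lam [A,.]} B: sum lam^k [A,.]^k B / k! term by term
rhs, term = np.zeros_like(B), B.copy()
for k in range(1, 60):
    rhs += term
    term = lam * ad(A, term) / k             # next series term
print(np.allclose(lhs, rhs))                 # True
```

The series converges quickly here because the terms decay like $(2\lambda\|A\|)^k/k!$.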