Mathematical Physics – Understanding the Direct Sum of Hilbert Spaces

hilbert-space, mathematical physics, mathematics, quantum mechanics

I am a physicist who is not that well-versed in mathematical rigour (a shame, I know! But I'm working on it.) In Wald's book on QFT in curved spacetimes, I found the following definition of the direct sum of Hilbert spaces. He says –

Next, we define the direct sum of Hilbert spaces. Let $\{ {\cal H}_\alpha \}$ be an arbitrary collection of Hilbert spaces, indexed by $\alpha$ (We will be interested only with the case where there are at most a countable number of Hilbert spaces, but no such restriction need be made for this construction). The elements of the Cartesian product $\times_\alpha {\cal H}_\alpha$ consist of the collection of vectors $\{\Psi_\alpha\}$ for each $\Psi_\alpha \in {\cal H}_\alpha$. Consider, now, the subset, $V \subset \times_\alpha {\cal H}_\alpha$, composed of elements for which all but finitely many of the $\Psi_\alpha$ vanish. Then $V$ has the natural structure of an inner product space. We define the direct sum Hilbert space $\bigoplus\limits_\alpha {\cal H}_\alpha$ to be the Hilbert space completion of $V$. It follows that in the case of a countable infinite collection of Hilbert spaces $\{{\cal H}_i\}$ each $\Psi \in \bigoplus\limits_i {\cal H}_i$ consists of arbitrary sequences $\{\Psi_i\}$ such that each $\Psi_i \in {\cal H}_i$ and $\sum\limits_i \left\| \Psi_i \right\|_i^2 < \infty$.

Here $\left\| ~~\right\|_i$ is the norm defined in ${\cal H}_i$. Also, Hilbert space completion of an inner product vector space $V$ is a space ${\cal H}$ such that $V \subset {\cal H}$ and ${\cal H}$ is complete in the associated norm. It is constructed from $V$ by taking equivalence classes of Cauchy sequences in $V$.

Now the questions –

1. Why does $V$ have the structure of an inner product space?

2. How does he conclude that $\sum\limits_i \left\| \Psi_i \right\|_i^2 < \infty$?

3. How does this definition of the direct sum match up with the usual things we see when looking at tensors in general relativity or in representations of Lie algebras, etc.?

PS – I also have a similar problem with Wald's definition of a Tensor Product of Hilbert spaces. I have decided to put that into a separate question. If you could answer this one, please consider checking out that one too. It can be found here. Thanks!

Best Answer

Both of the other answers here do not follow Wald's approach directly, though they are still roughly correct. Here is an answer which exactly follows his directions, and shows most of the details. Based on the fact that you're talking about QM, I assume you are using complex Hilbert spaces, though there's no substantive difference for real Hilbert spaces.

In my answer, I've assumed some familiarity with real analysis, but not functional analysis (since this is a basic construction from functional analysis). If you don't know what a Cauchy sequence in a metric space is, you'll want to familiarize yourself with that before starting on part 2. Also, this answer really has no physics; it's all functional analysis (the study of infinite dimensional vector spaces).

1. Why does $V$ have the structure of an inner product space?

He has defined $V$ as the subset of $\times_\alpha {\cal H}_\alpha$ where all but finitely many $\Psi_\alpha$ vanish. That is to say, a generic element of $V$ is a choice of $\Psi_\alpha$ for every alpha, where the only nonzero elements are $\Psi_{\alpha_1}, \Psi_{\alpha_2}, \ldots, \Psi_{\alpha_n}$ for some positive integer $n$ and some choice of indices $\alpha_1, \ldots, \alpha_n$. Note that different elements $\Psi$ and $\Psi'$ will in general correspond to different positive integers $n$, $n'$ and different indices $\alpha_1, \ldots, \alpha_n$ and $\alpha'_1, \ldots, \alpha'_{n'}$. For the rest of this section, I will take $\Psi$ and $\Psi'$ to be two generic elements of $V$ given by the above formulas.

We want to put an inner product on $V$. Normally, for finite direct sums, we'd want to do something like $\langle \Psi | \Psi' \rangle = \displaystyle \sum_\alpha \langle \Psi_\alpha | \Psi'_\alpha \rangle_{\alpha}$, where $\langle \cdot | \cdot \rangle_\alpha $ is the inner product on $\mathcal H_\alpha$. But we have an infinite sum, which is meaningless until we can find a way to interpret it.

Luckily, this isn't really an infinite sum, precisely because of the restriction we made. All but finitely many of the terms are 0. For any $\alpha$ to contribute a nonzero term, it must be a member of both the set of $\alpha_i$'s and the set of $\alpha'_j$'s. Let's label these indices by $\beta$, so that $\{\beta_k\} = \{\alpha_i\} \cap \{\alpha'_j\}$. Now we can make sense of the above sum by restricting it to only those indices that contribute nonzero terms, so that $\langle \Psi | \Psi' \rangle = \displaystyle \sum_{\beta_k} \langle \Psi_{\beta_k} | \Psi'_{\beta_k} \rangle_{\beta_k}$. You could just as well sum over either the $\alpha_i$ or $\alpha'_j$ indices, which would still be a finite sum of course, but I prefer making it manifestly (conjugate) symmetric this way.

You have to check at this point that the axioms for an inner product are satisfied by this choice of $\langle \cdot | \cdot \rangle $. It's clearly symmetric up to a complex conjugate, which is what we want for a Hermitian inner product. Proving linearity requires some playing around with symbols, though the direction of the proof should be immediately obvious if you've followed the above. Positive-definiteness is inherited from the $\mathcal H_\alpha$. Specifically, $\| \Psi \|^2 = \langle \Psi | \Psi \rangle = \displaystyle \sum_{\alpha_i} \langle \Psi_{\alpha_i} | \Psi_{\alpha_i} \rangle_{\alpha_i} = \displaystyle \sum_{\alpha_i} \| \Psi_{\alpha_i} \|^2_{\alpha_i}$, which is a finite sum of $n$ positive terms by positive-definiteness of the inner product on each of the direct summands, so $\| \Psi \|^2 > 0$ (so long as $n \ne 0$; $n=0$ is the case $\Psi = 0$, which of course has norm $0$).
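If it helps to see this concretely, here is a minimal Python sketch of the inner product on $V$. The representation is my own assumption for illustration, not anything from Wald: an element of $V$ is a dict mapping each index $\alpha$ with nonzero component to a NumPy vector standing in for $\Psi_\alpha$, and a missing key means the zero vector. The name `inner` is likewise hypothetical.

```python
import numpy as np

def inner(psi, phi):
    """Inner product on V: a finite sum over the indices where both elements are nonzero.

    psi, phi: dicts mapping an index alpha to a complex vector in H_alpha.
    Indices that are absent from a dict stand for the zero vector.
    """
    common = psi.keys() & phi.keys()  # the beta_k: indices shared by both elements
    # np.vdot conjugates its first argument, matching <psi|phi>
    return sum((np.vdot(psi[a], phi[a]) for a in common), 0j)

# Two elements of V with different (finite) sets of nonzero components
psi = {"a": np.array([1 + 1j, 2.0]), "b": np.array([0.5j])}
phi = {"b": np.array([2.0 + 0j]), "c": np.array([1.0, 0.0, -1j])}

overlap = inner(psi, phi)   # only the shared index "b" contributes
norm_sq = inner(psi, psi)   # sum of the squared norms of the components
```

Only the shared index contributes to `overlap`, swapping the arguments gives the complex conjugate, and `norm_sq` is real and positive, which are exactly the properties checked above.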

You might say "I just want countable direct sums, so can't I just take the limit of partial sums?". You can perfectly well do that, but it doesn't work for uncountable direct sums (which you may not need). You also need to prove that the way you order the $\alpha$ doesn't change the final result for any inner product, and all sorts of other technical details. In the end, those technical details amount to just about everything else we need to do anyway for the general case. If you want to do this, you'll also want to follow the note in the next section.

2. How does he conclude that $\sum\limits_i \left\| \Psi_i \right\|_i^2 < \infty$?

Let's be clear that now we're working with a countable direct sum, so the summands can be chosen as $\mathcal H_i$ for $i=1, 2, \ldots$. This is still (potentially) an infinite collection, but now we can write a general element of $\times_i {\cal H}_i$ as a sequence $\Psi = (\Psi_1, \Psi_2, \ldots)$. $\Psi \in V$ would mean that all but finitely many of the $\Psi_i$ are $0$, or equivalently, that there is some index $M$ such that for all $i > M$, $\Psi_i = 0$.

Let's also be clear on what he's saying here. He's claiming that a generic element of $\bigoplus_i \mathcal H_i$, the completion of $V$ with respect to the inner product $\langle \cdot | \cdot \rangle$, is any sequence $\Psi = (\Psi_1, \Psi_2, \ldots)$ such that $\sum\limits_i \left\| \Psi_i \right\|_i^2 < \infty$. $\Psi$ here is not, in general, an element of $V$, but of its completion as a Hilbert space.

Note: If any part of what follows confuses you, it would not be a grave error to take the definition of a countable direct sum of Hilbert spaces $\bigoplus_i \mathcal H_i$ to be the subspace of $\times_i \mathcal H_i$ consisting of all those vectors $\Psi$ which satisfy $\sum\limits_i \left\| \Psi_i \right\|_i^2 < \infty$. This is how it is defined in some texts and on Wikipedia here. Wiki has both this definition and Wald's here, where it notes without proof that they are equivalent. The full proof that this is a Hilbert space is on this ProofWiki page, though some of it is essentially duplicated below. The important thing is that we want the inner product to be finite and the space to be complete, so that what we get is actually a Hilbert space, and this condition ensures it.


I'll construct the completion of $V$ in this section. If you already understand this, or give up on understanding it, feel free to skip to the next section, which is where I get to your actual question. This is the hardest part of the answer in my opinion, and it's where you really need to know a bit of real analysis. However, there's no physics here so we'll skip a bit of the technicalities.

We want to complete $V$ with respect to the inner product $\langle \cdot | \cdot \rangle$, or more specifically with respect to the metric on $V$ defined by the inner product, which is given by $d(\Psi, \Psi') = \| \Psi - \Psi' \|= \sqrt{\langle \Psi - \Psi' | \Psi - \Psi' \rangle}$. We need to do this because we still want a Hilbert space, and Hilbert spaces come with complete inner products. In finite dimensions, any inner product is complete, but in infinite dimensions this isn't true. There are Cauchy sequences, which we would hope would converge, but they actually don't. For example, for a moment, let's let $\mathcal H_i = \mathbb C$ for each $i$. Then the vector $(1,1/2,1/3,1/4, \ldots)$ isn't in $V$ since it has infinitely many nonzero entries. However, we can write a sequence of elements of $V$ which "ought to" converge to this vector. $(1, 0, 0, 0 \ldots), (1, 1/2, 0, 0, \ldots), (1, 1/2, 1/3, 0, 0, \ldots), \ldots$. This is a Cauchy sequence, but it doesn't converge in $V$. So we need to add stuff to $V$ to make this sequence (and others like it) converge, so that what we get is a Hilbert space.
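A quick numerical sketch of this example, taking $\mathcal H_i = \mathbb C$ as above. The helper name `truncation_distance_sq` is mine. It computes $\| \Psi^j - \Psi^k \|^2$ for the truncations of $(1, 1/2, 1/3, \ldots)$ after $k$ and $j$ entries, which is just $\sum_{l=k+1}^{j} 1/l^2$.

```python
import math

def truncation_distance_sq(k, j):
    """Squared distance between the truncations of (1, 1/2, 1/3, ...) after k and j entries."""
    lo, hi = sorted((k, j))
    # The two truncations agree on the first `lo` entries, so only entries lo+1..hi survive
    return sum(1.0 / l**2 for l in range(lo + 1, hi + 1))

# Distances between truncations shrink as we go further out (the Cauchy property)
d_small = truncation_distance_sq(10, 20)
d_large = truncation_distance_sq(1000, 2000)

# The squared norms of the truncations approach pi^2 / 6,
# the squared norm of the would-be limit
norm_sq_limit = math.pi**2 / 6
norm_sq_10000 = truncation_distance_sq(0, 10_000)
```

The distances shrink, so the sequence is Cauchy, yet every term has only finitely many nonzero entries while the would-be limit does not, so the limit is not in $V$.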

The completion of $V$ follows exactly the same steps as the completion of any metric space in terms of equivalence classes of Cauchy sequences. The most familiar case of this is probably the construction of the real numbers from the rational numbers, which is also the most pathological case since we normally need to invoke the completeness of the real numbers, which is not an option when you're constructing real numbers for the first time. However, there's no such difficulty in this case. Wikipedia covers this reasonably well.

In any case, I think the most instructive thing for the special case of Hilbert spaces (as well as the least obvious) is to see why the completion of $V$ can be viewed as a subspace of $\times_i {\cal H}_i$. A vector in the completion of $V$ is an equivalence class of Cauchy sequences. Let's pick a particular representative sequence for that class. To avoid notation getting overloaded, I'll write this Cauchy sequence in $V$ as $\Psi^1, \Psi^2, \Psi^3, \ldots$. Each element $\Psi^j$ of this sequence is itself an element of $V$, so it's a sequence of the form $(\Psi^j_1, \Psi^j_2, \ldots)$, of which only finitely many terms are nonzero. The sequence is Cauchy, so for every $\epsilon > 0$, there is some $N$ so that for any choice of $j,k > N$, $\| \Psi^j - \Psi^k \| < \epsilon$. Now $\| \Psi^j_i - \Psi^k_i \|_i^2$ is just one of the nonnegative terms in the sum $\| \Psi^j - \Psi^k \|^2 = \sum_m \| \Psi^j_m - \Psi^k_m \|_m^2$. In particular, that means that for each choice of $i$, $\| \Psi^j_i - \Psi^k_i \|_i < \epsilon$, so for each $i$, the sequence $(\Psi_i^1, \Psi_i^2, \ldots)$ is a Cauchy sequence in $\mathcal H_i$.

Now, $\mathcal H_i$ is, by assumption, a Hilbert space. That means that this Cauchy sequence $(\Psi_i^1, \Psi_i^2, \ldots)$ converges to something in $\mathcal H_i$. Let's call that something $\Psi_i$. We do this for each $i$. Let's put all of these $\Psi_i$ into a vector $\Psi = (\Psi_1, \Psi_2, \ldots)$, which can only reasonably be said to live in $\times_i {\cal H}_i$ at the moment. Considering that the Cauchy sequence $\Psi^1, \Psi^2, \ldots$ is supposed to be converging in the completion of $V$, and component-wise it converges to $(\Psi_1, \Psi_2, \ldots) = \Psi$, we can naturally identify this Cauchy sequence with the vector $\Psi$. This assignment defines a function from the set of Cauchy sequences in $V$ to $\times_i {\cal H}_i$.

There are some things to be checked. These are all important to do rigorous math, but for physics purposes you may conclude that they're not so important. The intuition is all in the above work; this is just additional work to make sure that what we're doing makes sense, though it is also a decent way to check and make sure you understand what's going on. We need to check that two Cauchy sequences in the same equivalence class get sent to the same $\Psi$. This will ensure that the function can be viewed as one from the completion of $V$ to $\times_i {\cal H}_i$ (or that it factors through the completion of $V$ in more mathematical terminology). We also need to check that it's a linear map. Third, we need to check that an element of $V$ viewed as a constant Cauchy sequence gets sent back to the same element of $V$, so that the natural embedding of $V$ in the completion of $V$ and in $\times_i {\cal H}_i$ are compatible. And we need to show that the kernel of this linear map is trivial, so that no two distinct vectors in the completion of $V$ are mapped to the same thing in $\times_i {\cal H}_i$.

With all of that done, it makes sense to identify the completion of $V$ with a subset of $\times_i {\cal H}_i$ as a vector space, since we have a vector space isomorphism between the two which is compatible with the copy of $V$ that both of them contain. So now, we can say that the completion of $V$ is (essentially) a subspace of $\times_i {\cal H}_i$. There's more structure on the completion of $V$, though, namely that it's a Hilbert space, so it has an inner product. The one final math-y thing to check is that the inner product has a nice form on the completion of $V$ in $\times_i {\cal H}_i$. In fact, it's just $\langle \Psi | \Psi' \rangle = \displaystyle \sum_{i=1}^\infty \langle \Psi_i | \Psi'_i \rangle_i$, where the infinite sum of complex numbers is interpreted in the usual way as a limit of partial sums. This sum is guaranteed to converge (absolutely) to some finite value for any pair $\Psi, \Psi'$ that we pick so long as they both lie in the completion of $V$ in $\times_i {\cal H}_i$.
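For what it's worth, the absolute convergence is a two-line estimate once you know that both vectors have square-summable components (the condition that question 2 is about). Apply Cauchy-Schwarz first in each $\mathcal H_i$, and then to the two sequences of norms:

$$\sum_{i=1}^\infty \left| \langle \Psi_i | \Psi'_i \rangle_i \right| \le \sum_{i=1}^\infty \| \Psi_i \|_i \, \| \Psi'_i \|_i \le \left( \sum_{i=1}^\infty \| \Psi_i \|_i^2 \right)^{1/2} \left( \sum_{i=1}^\infty \| \Psi'_i \|_i^2 \right)^{1/2} < \infty .$$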

With all that out of the way, let's return to calling it $\bigoplus_i \mathcal H_i$, rather than the completion of $V$. After all, that's what we're talking about, but I avoided that language until now since we didn't exactly know what it meant yet.


With that now out of the way, it's not so hard to see why $\Psi = (\Psi_1, \Psi_2, \ldots)$ satisfying $\sum\limits_i \left\| \Psi_i \right\|_i^2 < \infty$ is a necessary and sufficient condition for $\Psi$ to be in $\bigoplus_i \mathcal H_i$. It is necessary because for $\Psi \in \bigoplus_i \mathcal H_i$, we need $\langle \Psi | \Psi \rangle = \displaystyle \sum_{i=1}^\infty \| \Psi_i \|_i^2$ to be defined, so $\displaystyle \sum_{i=1}^\infty \| \Psi_i \|_i^2 < \infty$ is required.

For sufficiency, we want to construct a Cauchy sequence in $V$ converging to $\Psi$ in $\bigoplus_i \mathcal H_i$ (the completion of $V$) to show that every element satisfying this inequality must be in $\bigoplus_i \mathcal H_i$. This is pretty easy. Let $\Psi^1 = (\Psi_1, 0, 0, 0 \ldots), \Psi^2 = (\Psi_1, \Psi_2, 0, 0, \ldots), \Psi^3=(\Psi_1,\Psi_2,\Psi_3,0,\ldots)$ and so on. Note that all of these $\Psi^j$ are in $V$ since each has at most $j$ nonzero elements. So long as the sequence is Cauchy, it clearly converges to $\Psi$. The statement that $\Psi^1, \Psi^2, \ldots$ is Cauchy is just that for any $\epsilon > 0$, there is some index $N$ such that for every choice of $j,k > N$, $\| \Psi^j - \Psi^k \|^2 < \epsilon^2$ (I've squared both sides for convenience). Without loss of generality, we can take $j > k$, in which case this statement simplifies to $\| \Psi_{k+1} \|_{k+1}^2 + \cdots + \| \Psi_{j} \|_{j}^2 < \epsilon^2$. Now, the left hand side is never more than the infinite tail obtained by letting $j \rightarrow \infty$ and starting the sum at $l = N$, so it suffices to find $N$ so that $\displaystyle \sum_{l=N}^\infty \| \Psi_l \|_l^2 < \epsilon^2$. Since $\epsilon$ was positive, so is $\epsilon^2$. It's a general theorem in advanced calculus that if a series of nonnegative terms satisfies $\sum_{l=1}^\infty a_l < \infty$, then for any positive constant (which we'll take to be $\epsilon^2$ here), there is some tail of the series which has sum less than that constant, i.e. there exists an $N$ so that $\displaystyle \sum_{l=N}^\infty a_l < \epsilon^2$. Applying this here with $a_l = \| \Psi_l \|^2_l$, we see that we get such an $N$, so this is a Cauchy sequence and $\Psi$ is in $\bigoplus_i \mathcal H_i$ and life is good.
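To make the "there exists an $N$" step less abstract, here is a small sketch for the particular choice $a_l = 1/l^2$, i.e. the vector $(1, 1/2, 1/3, \ldots)$ with each $\mathcal H_i = \mathbb C$. The function name `tail_index` is hypothetical. It leans on the known value $\sum_{l \ge 1} 1/l^2 = \pi^2/6$, which you would not have for a general $\Psi$; the theorem only guarantees that $N$ exists.

```python
import math

def tail_index(eps, total=math.pi**2 / 6):
    """Smallest N with sum_{l >= N} 1/l^2 < eps^2, using the known total pi^2/6."""
    partial = 0.0  # running value of sum_{l < n} 1/l^2
    n = 1
    # total - partial is the tail starting at n; advance until it drops below eps^2
    while total - partial >= eps**2:
        partial += 1.0 / n**2
        n += 1
    return n

N = tail_index(0.1)
# Tail of the series starting at N, which bounds ||Psi^j - Psi^k||^2 for all j > k > N
tail = math.pi**2 / 6 - sum(1.0 / l**2 for l in range(1, N))
```

Any block $\| \Psi_{k+1} \|^2 + \cdots + \| \Psi_j \|^2$ with $j > k > N$ is then bounded by `tail`, which is below $\epsilon^2$.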


You may be wondering why it took me many paragraphs to explain what Wald does in one short paragraph. The reason (I believe) is that Wald is implicitly assuming that for some readers, this will be familiar from functional analysis, and in any case it isn't important enough to devote a section to since it's of little physical relevance.

3. How does this definition of the direct sum match up with the usual things we see when looking at tensors in general relativity or in representations of Lie algebras, etc.?

It doesn't really match up at all. Direct sums aren't related to tensor products at all. They are much more closely related to direct products. In fact, they're the same when the product is over a finite index set, but for infinite index sets we have to start switching things around. Infinite tensor products on Hilbert spaces tend to be even uglier things that luckily we can usually avoid.

You probably do have experience with direct sums/products over finite index sets (where they are the same). For instance, in E&M and GR and other courses*, you learn that a general rank-2 tensor can be decomposed into a scalar part representing the trace, an antisymmetric part, and a symmetric traceless part. In 4 dimensions, this is decomposing a 16-dimensional vector space into the direct sum (or product) of a 1-dimensional space, a 6-dimensional space, and a 9-dimensional space. The construction Wald does puts together an infinite number of spaces, so it's more complicated.
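Here's a minimal NumPy sketch of that finite decomposition. One simplifying assumption on my part: the trace is taken with the Euclidean metric (the identity matrix), whereas in GR you would contract with the spacetime metric. The function name `decompose_rank2` is just illustrative.

```python
import numpy as np

def decompose_rank2(T):
    """Split a square matrix into trace, antisymmetric, and symmetric-traceless parts.

    Assumes a Euclidean metric, so the trace part is (tr T / n) times the identity.
    """
    n = T.shape[0]
    trace_part = (np.trace(T) / n) * np.eye(n)
    antisym = 0.5 * (T - T.T)
    sym_traceless = 0.5 * (T + T.T) - trace_part
    return trace_part, antisym, sym_traceless

rng = np.random.default_rng(0)
T = rng.normal(size=(4, 4))  # a generic rank-2 tensor in 4 dimensions
trace_part, antisym, sym_traceless = decompose_rank2(T)

# Dimensions of the three subspaces: 1 + n(n-1)/2 + (n(n+1)/2 - 1) = n^2
n = 4
dims = (1, n * (n - 1) // 2, n * (n + 1) // 2 - 1)
```

The three pieces add back up to `T` and are mutually orthogonal under the entrywise inner product, which is what makes this a direct sum rather than just a way of writing `T` as three terms.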

Another elementary example of this is when we study angular momentum addition. This is a case of finite direct sums of Hilbert spaces, exactly what we are looking for. You probably remember that, for instance, the tensor product of a spin-2 system with a spin-1 system can be decomposed as a spin-3 part, a spin-2 part, and a spin-1 part via Clebsch-Gordan coefficients. If $\bar n$ denotes the Hilbert space for a spin-$n$ system, this is just the statement that $\bar 2 \otimes \bar 1 = \bar 1 \oplus \bar 2 \oplus \bar 3$. This is a finite direct sum, but you can imagine that for an infinite number of particles you could get an infinite direct sum. This "infinite number of particles" isn't physical (though we do have many models such as the Ising model which approximate finite systems e.g. in condensed matter as infinite ones), but it does get physical when you start thinking of the "particles" as localized excitations that can be at any point in space. That's probably why this comes up in Wald's book, though I don't actually know as I have not read it.
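A quick sanity check on the spin-2 with spin-1 example is to count dimensions, since a spin-$j$ space has dimension $2j+1$. The sketch below only handles integer spins, and the helper names are my own.

```python
def spin_decomposition(j1, j2):
    """Total spins J appearing in (spin j1) tensor (spin j2), for integer spins."""
    # J runs from |j1 - j2| up to j1 + j2 in integer steps
    return list(range(abs(j1 - j2), j1 + j2 + 1))

def dim(j):
    """Dimension 2j + 1 of the spin-j Hilbert space."""
    return 2 * j + 1

spins = spin_decomposition(2, 1)        # total spins in the direct sum
lhs = dim(2) * dim(1)                   # dimension of the tensor product
rhs = sum(dim(J) for J in spins)        # dimension of the direct sum
```

Both sides come out to 15, as they must: dimensions multiply under tensor products and add under direct sums.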

The direct product construction is also common when studying Lie algebras, which you asked about. The direct product of Lie algebras $\mathfrak g$ and $\mathfrak h$ is just their direct product as vector spaces, where we impose that elements of $\mathfrak g$ commute with those of $\mathfrak h$. So it is pairs $(x,y) \in \mathfrak g \times \mathfrak h$ with $[(x_1,y_1),(x_2,y_2)] = ([x_1,x_2],[y_1,y_2])$. You could extend this product to infinitely many factors $\mathfrak g_i$ if you really wanted to. If you had a Hilbert space structure on each $\mathfrak g_i$, you could also define their direct sum $\bigoplus_i \mathfrak g_i$ as above, which would inherit both the Hilbert space structure and the Lie algebra structure, but it would not be what we typically mean when we speak of infinite direct sums of Lie algebras (see next paragraph). Such a construction isn't related to any physics I can think of at the moment.
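The bracket on pairs can be sketched with matrix Lie algebras. As stand-ins (my choice, purely for illustration) I take $\mathfrak g = \mathfrak{so}(3)$ with its usual $3 \times 3$ generators and $\mathfrak h = \mathfrak{sl}(2)$ with the standard $E, F, H$ matrices, and represent a pair $(x, y)$ as a Python tuple.

```python
import numpy as np

def comm(a, b):
    """Matrix commutator [a, b]."""
    return a @ b - b @ a

def bracket(p, q):
    """Bracket on the direct product: componentwise commutators of the pairs."""
    (x1, y1), (x2, y2) = p, q
    return comm(x1, x2), comm(y1, y2)

# so(3) generators, satisfying [Lx, Ly] = Lz
Lx = np.array([[0, 0, 0], [0, 0, -1], [0, 1, 0]], dtype=float)
Ly = np.array([[0, 0, 1], [0, 0, 0], [-1, 0, 0]], dtype=float)
Lz = np.array([[0, -1, 0], [1, 0, 0], [0, 0, 0]], dtype=float)

# sl(2) generators, satisfying [E, F] = H
E = np.array([[0, 1], [0, 0]], dtype=float)
F = np.array([[0, 0], [1, 0]], dtype=float)
H = np.array([[1, 0], [0, -1]], dtype=float)

zero3, zero2 = np.zeros((3, 3)), np.zeros((2, 2))

# An element of g, embedded as (x, 0), commutes with an element of h, embedded as (0, y)
cross_g, cross_h = bracket((Lx, zero2), (zero3, E))

# Within each factor the bracket is the original one
gx, hy = bracket((Lx, E), (Ly, F))
```

The cross bracket vanishes in both slots, which is the "elements of $\mathfrak g$ commute with those of $\mathfrak h$" condition, while `gx` and `hy` reproduce the brackets of the individual factors.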

One of the difficulties in finding a simple analogue is that it has to be Hilbert spaces. The direct sum in the category of Hilbert spaces is totally different from that in the category of vector spaces. As vector spaces (forgetting the Hilbert space structure), $\bigoplus_i \mathcal H_i = V$, not the completion of $V$. Everything in #2 about analytic stuff just stops being relevant. So, it's very likely that you haven't needed infinite direct sums of Hilbert spaces before, because in QFT on flat spacetime, as well as ordinary QM, they can almost always be avoided.


I would also not suggest getting too bothered by it. In physics, we usually try to keep our direct sums finite. When they do go infinite, we tend to ignore the analytic difficulties as much as possible, and leave the fully rigorous analysis up to the mathematicians. I suspect there are plenty of practicing theoretical physicists who could not give you a rigorous definition for the infinite direct sum without looking at anything first. Not that it's particularly hard, but the subtle differences between infinite sums and products of Hilbert spaces just don't matter much for practical theoretical physics. A lot of calculations would be done in $V$ or in $\times_i \mathcal H_i$ rather than in $\bigoplus_i \mathcal H_i$. They typically turn out all right, except in rare cases where the answers are clearly nonsensical. It should be stressed that the construction of the direct sum is literally the only possible reasonable choice for a direct sum of Hilbert spaces, so if you're worried that the analytic differences are going to lead to some physically relevant issues depending on whether you chose one way of defining it or another, in some philosophical sense that can't happen assuming what you're doing is consistent at all.

That doesn't mean that I think you should ignore it, but don't let it confuse you. When you see an infinite direct sum of Hilbert spaces, think of it as just an infinite version of a finite direct sum until you absolutely can't afford to any more. To do rigorous mathematical physics, you'll need the distinction, but for most theoretical physics you won't need to be too careful about it. Wald is pretty mathematical, which is not necessarily a bad thing, but you shouldn't confuse the mathematics for the physics, and there's no physical content here.

*In reality, some of the above about tensors in GR, E&M, etc. is a convenient lie. We're actually typically interested in tensor fields (which physicists usually just call tensors), which are (typically smooth) sections of some bundle on a spacetime manifold. So the tensor fields themselves are elements of infinite-dimensional vector spaces. We usually invoke some sort of locality principle so that we only have to deal with tensors at a point, which then form a finite-dimensional space. The space of sections isn't the direct sum of the individual spaces, but it is another way of putting them together which is sort of intuitively additive in nature. However, it isn't a Hilbert space, even if each of the individual spaces is a Hilbert space. Only if you assume the spacetime manifold is compact, or put some analytic conditions on the sorts of sections we allow, can you get a finite-valued norm by integrating the inner products over the whole manifold. This is an alternate approach to doing global constructions than the one above, more in the flavor of classical field theory.
