[Physics] Examples of density operators $\rho=\sum\limits_n p_n|\phi_n\rangle\langle\phi_n|$ in which the states $\{|\phi_n\rangle\}$ are not orthogonal

density-operator, quantum mechanics, quantum-information, quantum-statistics, statistical mechanics

The set of quantum states $\{|\phi_n\rangle\}$ in the definition of the density operator $$\rho=\sum\limits_n p_n|\phi_n\rangle\langle\phi_n|$$ need not be orthonormal, and need not form a basis. But unfortunately, in the examples that I have seen so far, the states $\{|\phi_n\rangle\}$ were both orthonormal and formed a basis.

Example 1 In the Stern-Gerlach (SG) set-up, the state of the silver atoms coming out of the oven, before they pass through the magnetic field, is imperfectly known because $S_z$ has not been measured. Therefore, on grounds of ignorance, such an ensemble is represented by $$\rho=\frac{1}{2}(|{\uparrow}\rangle\langle{\uparrow}|+|{\downarrow}\rangle \langle{\downarrow}|).\tag{1}$$ Note that, in this case, the states $|{\uparrow}\rangle$ and $|{\downarrow}\rangle$ are orthonormal and form the $S_z$-basis.

Example 2 Consider unpolarized light moving in the $z$-direction, so that its polarization must lie in the $xy$-plane. Since we do not know the state vector, it is described by the density operator $$\rho=\frac{1}{2}(|x\rangle\langle x|+|y\rangle\langle y|)\tag{2}$$ where $|x\rangle$ and $|y\rangle$ describe plane-polarized states along the $x$- and $y$-axes respectively.


Question Can someone suggest an example of a mixed ensemble where the states $\{|\phi_n\rangle\}$ are not orthonormal and do not form a basis? I'm not looking for the trivial example where the density operator describes a pure state.

Best Answer

This thread has seen a ton of incorrect statements coming from a number of sides, so it's probably a good idea to set the record straight in a bit more detail, and to provide some more examples of how expressions of this form come up in practice.

So, let's go through a brief rundown of some pertinent points.

  • A density matrix is, by definition, just an operator $\rho:\mathcal H \to \mathcal H$ that is self-adjoint and positive semidefinite (and trace class if $\dim(\mathcal H)=\infty$), and whose trace satisfies $$\mathrm{Tr}(\rho)=1.$$ More importantly, this is all that's required by the definition. Any operator that satisfies those conditions can legitimately be called a density matrix, period.

  • Because of that, all operators that can be expressed in the OP's form, $$ \rho = \sum_n p_n |\phi_n \rangle\langle \phi_n|, \tag{$*$}$$ are valid density matrices so long as the component projectors are normalized to $\langle \phi_n|\phi_n \rangle =1$ and the weights form a probability distribution, i.e. $p_n\geq 0$ and $\sum_n p_n = 1$.

  • Those two requirements are the only actual requirements. None of the conditions for density-matrix-ness ($\rho^\dagger=\rho$, $\rho\geq 0$, and $\mathrm{Tr}(\rho)=1$) are impacted if the $|\phi_n\rangle$ are not pairwise orthogonal, or if their number exceeds the state space's dimension. That means that it's perfectly fine to take non-orthogonal states in a representation of the form $(*)$.

  • Explicit examples with non-orthogonal projectors are trivial to construct. Norbert Schuch's answer contains one example, but if you go looking for them you can build them instantly by just taking any collection of unit-normalized vectors weighted by unit-normalized weights $p_n$.

    To provide one such example explicitly, consider the two-level space $\mathcal H = \mathbb C^2$, and a sequence of $N\geq 2$ vectors lying equispaced along the equator of its Bloch sphere, giving $$ \rho = \sum_{n=0}^{N-1} p_n |\varphi_n\rangle\langle \varphi_n| \quad \text{for} \quad |\varphi_n\rangle = \frac{1}{\sqrt{2}} \bigg( |0\rangle + e^{i 2\pi n/N} |1\rangle\bigg). \tag{$\star$} $$ Here the weights can be arbitrary so long as $p_n\geq 0$ and $\sum_{n=0}^{N-1} p_n=1$; one obvious choice is $p_n = 1/N$, which gives the maximally-mixed state $\rho = \frac12 \mathbb I$, but there are plenty of other possible choices.
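The equatorial example $(\star)$ can be checked numerically. What follows is only a minimal sketch in dependency-free Python; the helper names (`equator_states`, `mix`, `inner`), the choice $N=5$, and the skewed weights are all illustrative assumptions rather than anything taken from the thread.

```python
import cmath
import math

def equator_states(N):
    """N states equispaced on the Bloch-sphere equator, as in (star)."""
    s = 1 / math.sqrt(2)
    return [(complex(s), s * cmath.exp(2j * math.pi * n / N)) for n in range(N)]

def mix(weights, states):
    """Return sum_n p_n |phi_n><phi_n| as a 2x2 nested list."""
    return [[sum(p * psi[i] * psi[j].conjugate() for p, psi in zip(weights, states))
             for j in range(2)] for i in range(2)]

def inner(u, v):
    """<u|v> for two-component complex vectors."""
    return sum(a.conjugate() * b for a, b in zip(u, v))

N = 5  # illustrative choice; any N >= 2 behaves the same way
states = equator_states(N)

# Neighbouring states are not orthogonal: |<phi_0|phi_1>| = cos(pi/N) > 0.
overlap = abs(inner(states[0], states[1]))

# Equal weights: the off-diagonal terms cancel and rho = I/2.
rho_uniform = mix([1 / N] * N, states)

# Hypothetical unequal weights: still unit trace and positive semidefinite.
weights = [0.4, 0.3, 0.1, 0.1, 0.1]
rho_skewed = mix(weights, states)
trace = (rho_skewed[0][0] + rho_skewed[1][1]).real
# For a 2x2 Hermitian matrix with unit trace, det >= 0 is equivalent to rho >= 0.
det = (rho_skewed[0][0] * rho_skewed[1][1]
       - rho_skewed[0][1] * rho_skewed[1][0]).real

print("overlap:", overlap)
print("rho_uniform:", rho_uniform)
print("trace, det of skewed mixture:", trace, det)
```

With unequal weights the result is no longer proportional to the identity, but it is still a legitimate density matrix built from mutually non-orthogonal states.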

  • Representations of the form $(*)$ are not unique. Suppose, say, that you have some density matrix $\rho$ that you've managed to represent as a sum of normalized projectors in two different ways, say, $$ \rho = \sum_n p_n |\phi_n \rangle\langle \phi_n| = \sum_m q_m |\chi_m \rangle\langle \chi_m|, \tag{$**$}$$ where $\sum_n p_n = 1 = \sum_m q_m$ and $\langle \phi_n|\phi_n \rangle =1=\langle \chi_m|\chi_m \rangle$. Then there are some loose requirements on the two sets of vectors, starting with the fact that $\mathrm{span}\{|\phi_n\rangle\}$ needs to match $\mathrm{span}\{|\chi_m\rangle\}$, but in general, the layout of the $|\phi_n\rangle$ and the $|\chi_m\rangle$ within that span can be very different. This is evident in the example $(\star)$ above with equal weights, where $\rho$ is independent of the number $N$ of vectors in your collection, and it can also be represented as $\rho = \tfrac12 \left[ |0\rangle\langle 0| + |1\rangle\langle 1| \right]$.

  • Representations of the form $(*)$ are interpretations, and little more. There is some physical content in the statement $$ \rho = \sum_n p_n |\phi_n \rangle\langle \phi_n|, \tag{$*$}$$ namely, that you can produce the system state $\rho$ by producing the pure states $|\phi_n\rangle$ with probabilities $p_n$ and then forgetting which pure state you actually produced. However, the operative word there is "can": the fact that that procedure will produce $\rho$ does not say, at all, that it is the only possible procedure that will produce that state.

  • Representations do not imply that the vectors involved are eigenvectors of the resultant density matrix. That's true if the projectors are pairwise orthogonal, but that's not a requirement at all, so it is perfectly possible to construct $\rho$ as a sum of projectors that have nothing to do with the sum's eigenprojectors.

    It's probably helpful to illustrate this with an explicit example, for clarity. Consider a two-level system that's prepared in a superposition of the form $$ |\theta_\pm\rangle = \cos(\theta/2)|0\rangle \pm \sin(\theta/2)|1\rangle,$$ i.e. an angle $\theta$ down from the north pole of the Bloch sphere, except that each time we flip a fair coin to see which sign of $\theta$ (i.e. which direction on the prime meridian) we take. Then the density matrix reads \begin{align} \rho & = \frac12 \bigg( |\theta_+\rangle\langle\theta_+| +|\theta_-\rangle\langle\theta_-| \bigg) \\ & = \frac12 \bigg( \big(\cos(\theta/2)|0\rangle + \sin(\theta/2)|1\rangle \big) \big(\cos(\theta/2)\langle 0| + \sin(\theta/2)\langle 1| \big) \\ & \qquad + \big(\cos(\theta/2)|0\rangle - \sin(\theta/2)|1\rangle \big) \big(\cos(\theta/2)\langle 0| - \sin(\theta/2)\langle 1| \big) \bigg) \\ & = \cos^2(\theta/2)|0\rangle\langle 0| + \sin^2(\theta/2)|1\rangle\langle 1| \end{align} because the off-diagonal terms cancel out. In this second representation, we do have orthogonal projectors, so here $|0\rangle$ and $|1\rangle$ are indeed the unique eigenvectors of $\rho$ (unless $\theta=\pi/2$ and $\rho$ is maximally mixed). But that doesn't stop our initial representation, $\rho = \frac12 \left( |\theta_+\rangle\langle\theta_+| +|\theta_-\rangle\langle\theta_-| \right)$, with its non-orthogonal, non-eigenvector components, from also being true.
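The $|\theta_\pm\rangle$ example lends itself to a quick numerical sanity check. The sketch below is plain Python with an arbitrarily chosen angle $\theta=\pi/3$ (an assumption made purely for illustration). It builds $\rho$ from the two non-orthogonal states, confirms that the off-diagonal terms cancel, and confirms that $|\theta_+\rangle$ is not an eigenvector of the result.

```python
import math

theta = math.pi / 3  # arbitrary illustrative angle (an assumption, not from the text)
c, s = math.cos(theta / 2), math.sin(theta / 2)

theta_plus = (c, s)    # cos(theta/2)|0> + sin(theta/2)|1>
theta_minus = (c, -s)  # cos(theta/2)|0> - sin(theta/2)|1>

def projector(psi):
    """|psi><psi| for a real two-component vector (no conjugation needed)."""
    return [[a * b for b in psi] for a in psi]

P_plus, P_minus = projector(theta_plus), projector(theta_minus)

# Fair-coin mixture of the two projectors.
rho = [[0.5 * (P_plus[i][j] + P_minus[i][j]) for j in range(2)] for i in range(2)]

# The two states are not orthogonal: <theta_+|theta_-> equals cos(theta).
overlap = theta_plus[0] * theta_minus[0] + theta_plus[1] * theta_minus[1]

# If |theta_+> were an eigenvector, rho|theta_+> would be parallel to |theta_+>,
# so this 2D cross product would vanish. It does not.
rho_psi = [sum(rho[i][j] * theta_plus[j] for j in range(2)) for i in range(2)]
misalignment = rho_psi[0] * theta_plus[1] - rho_psi[1] * theta_plus[0]

print("rho:", rho)
print("overlap:", overlap)
print("misalignment:", misalignment)
```

The printed matrix is diagonal with entries $\cos^2(\theta/2)$ and $\sin^2(\theta/2)$, matching the last line of the derivation, while the non-zero `misalignment` shows that the states used to build $\rho$ are not its eigenvectors.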

  • If a state is built up using non-orthogonal projectors, then it also has a separate representation in terms of orthogonal projectors, and that's perfectly fine. Representations of the form $(*)$ are a dime a dozen if you know where to look. So, you found one that's not the canonical one: great! There are millions more where that one came from.

  • Representations of the form $(*)$ really are a dime a dozen. If you want to build one yourself, say, for a two-level system, there's a few points that are particularly relevant to the recipe:

    • The identity and the Pauli matrices are a basis for all valid density matrices, i.e. if $\rho=\rho^\dagger$ has unit trace, then it can be represented as $$ \rho = \tfrac12 \left( \mathbb I + \vec p \cdot \vec \sigma \right),$$ where $\vec p = (p_x,p_y,p_z)\in \mathbb R^3$ and $\vec \sigma =(\sigma_x, \sigma_y, \sigma_z)$ are the Pauli matrices. (Further, that relationship can be inverted via $\vec p = \mathrm{Tr}(\rho\vec\sigma)$.)
    • The positivity condition $\rho\geq 0$ translates into the condition $||\vec p||\leq 1$, i.e. $\vec p$ lives inside the unit ball or its boundary $-$ generally known as the Bloch ball and the Bloch sphere in this context.
    • If $||\vec p||=1$, i.e. $\vec p$ is on the Bloch sphere boundary, then $\rho = |\psi\rangle\langle\psi|$ is a pure state, and if you write $|\psi\rangle = \cos(\theta/2) |0\rangle + e^{i\varphi}\sin(\theta/2)|1\rangle$ (which you always can) then $\theta\in [0,\pi]$ and $\varphi\in[0,2\pi)$ are the polar and azimuthal spherical coordinates for $$ \vec p = (\sin(\theta)\cos(\varphi), \sin(\theta)\sin(\varphi), \cos(\theta)).$$
    • The relationship between $\vec p$ and $\rho$ is linear and bijective.
    • If $\rho_1$ and $\rho_2$ are valid density matrices, then any convex combination $$ \rho = q_1 \rho_1 + q_2 \rho_2$$ of the two, with non-negative weights adding to $q_1+q_2=1$, is also a valid density matrix.
    • Because the relationship between density matrices and Bloch-ball vectors is linear, any convex combination of density matrices translates directly into a convex combination of the corresponding Bloch-ball vectors. Thus, if $ \rho_1 = \tfrac12 \left( \mathbb I + \vec p_1 \cdot \vec \sigma \right),$ $ \rho_2 = \tfrac12 \left( \mathbb I + \vec p_2 \cdot \vec \sigma \right),$ and $ \rho = q_1 \rho_1 + q_2 \rho_2$, then $ \vec p= q_1 \vec p_1 + q_2 \vec p_2$ lies on the line segment that goes from $\vec p_1$ to $\vec p_2$, a fraction $q_2=1-q_1$ of the way in that direction.

    So, what does this mean for density-matrix representations? If you have a target density matrix $\rho$ that you want to represent, simply take its Bloch-ball vector $\vec p = \mathrm {Tr}(\rho\vec\sigma)$, and then pick $N$ points $\vec p_n$ on the Bloch sphere itself (the boundary) and weights $q_n$ (normalized to $\sum_n q_n=1$) such that their average $\sum_n q_n \vec p_n=\vec p$ gives you your chosen point. That will then naturally give you a representation of your density matrix as a weighted sum of $N$ pure-state projectors, and you can read off the computational-basis components directly from the spherical coordinates of your chosen extremal points.
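The recipe above can be sketched in code for the simplest case of $N=2$ pure states. This is only one possible construction, under stated assumptions: it uses the convention $\rho=\tfrac12(\mathbb I+\vec p\cdot\vec\sigma)$, for which $\vec p=\mathrm{Tr}(\rho\vec\sigma)$ and pure states have $||\vec p||=1$, and it picks the two Bloch-sphere points where a chord through $\vec p$ (along a freely chosen direction) meets the sphere. The function names, the target vector, and the chord direction are all hypothetical.

```python
import cmath
import math

def rho_from_bloch(p):
    """rho = (I + p.sigma)/2 as a 2x2 nested list."""
    x, y, z = p
    return [[0.5 * (1 + z), 0.5 * complex(x, -y)],
            [0.5 * complex(x, y), 0.5 * (1 - z)]]

def state_from_bloch(p):
    """Pure state cos(theta/2)|0> + e^{i phi} sin(theta/2)|1> for a unit vector p."""
    x, y, z = p
    theta = math.acos(max(-1.0, min(1.0, z)))  # clamp against rounding error
    phi = math.atan2(y, x)
    return (complex(math.cos(theta / 2)), cmath.exp(1j * phi) * math.sin(theta / 2))

def chord_decomposition(p, direction):
    """Weights and sphere points with q1*p1 + q2*p2 = p; requires |p| < 1."""
    norm = math.sqrt(sum(d * d for d in direction))
    n = [d / norm for d in direction]
    pn = sum(a * b for a, b in zip(p, n))
    pp = sum(a * a for a in p)
    root = math.sqrt(pn * pn + 1.0 - pp)   # from solving |p + t n|^2 = 1 for t
    t_plus, t_minus = -pn + root, -pn - root
    ends = [[a + t * b for a, b in zip(p, n)] for t in (t_plus, t_minus)]
    weights = [-t_minus / (t_plus - t_minus), t_plus / (t_plus - t_minus)]
    return weights, ends

p = [0.3, 0.0, 0.4]                       # hypothetical target Bloch vector
weights, ends = chord_decomposition(p, direction=[1.0, 1.0, 0.0])
states = [state_from_bloch(e) for e in ends]

# Rebuild rho as a weighted sum of the two pure-state projectors.
rho_mix = [[sum(q * s[i] * s[j].conjugate() for q, s in zip(weights, states))
            for j in range(2)] for i in range(2)]
rho_target = rho_from_bloch(p)
max_err = max(abs(rho_mix[i][j] - rho_target[i][j])
              for i in range(2) for j in range(2))

# The two component states are not orthogonal (the chord misses the origin).
overlap = abs(sum(a.conjugate() * b for a, b in zip(states[0], states[1])))

print("weights:", weights)
print("max deviation from target:", max_err)
print("overlap:", overlap)
```

Changing `direction` gives a different, equally valid pair of pure states and weights for the same $\rho$, which is the non-uniqueness discussed above. Only a chord through the origin of the Bloch ball yields an orthogonal pair.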
