Contravariant vectors are "standard" vectors. Covariant vectors are linear maps on contravariant vectors producing scalars.
Let us start from the former case. If you fix a pair of bases $\{e_i\}_{i=1,\ldots,n}$ and $\{e'_i\}_{i=1,\ldots,n}$ in a finite-dimensional vector space $V$ of dimension $n$, such that $e_i = \sum_j {A^j}_i e'_j$ for a set of coefficients ${A^j}_i$ forming a (necessarily) non-singular matrix $A$, you have for a given vector $v \in V$:
$$v = \sum_i v^i e_i = \sum_j v'^j e'_j$$
and thus
$$\sum_i v^i \sum_j {A^j}_i e'_j = \sum_j v'^j e'_j$$
so that:
$$\sum_j \left( \sum_i {A^j}_i v^i\right) e'_j = \sum_j v'^j e'_j\:.$$
Uniqueness of the components of $v$ with respect to $\{e'_i\}_{i=1,\ldots,n}$ eventually entails:
$$v'^j = \sum_i {A^j}_i v^i\qquad \mbox{where}\quad e_i = \sum_j {A^j}_i e'_j\tag1$$
This is nothing but the standard rule for transforming components of a given contravariant vector when one changes the decomposition basis.
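As a small numerical sketch (using NumPy, with a hypothetical $2\times 2$ matrix $A$ not taken from the text), rule (1) can be checked explicitly: writing the old basis vectors $e_i$ as the columns of $A$ in the new basis, the transformed components reproduce the same abstract vector.

```python
import numpy as np

# Hypothetical change of basis: e_i = sum_j A^j_i e'_j, so the columns
# of A hold the old basis vectors expressed in the new basis {e'_j}.
A = np.array([[2.0, 1.0],
              [1.0, 1.0]])          # non-singular by construction

v_old = np.array([3.0, -1.0])       # components v^i in the basis {e_i}
v_new = A @ v_old                   # rule (1): v'^j = sum_i A^j_i v^i

# Same abstract vector: take {e'_j} as the standard basis of R^2;
# then the e_i are the columns of A, and sum_i v^i e_i = v_new.
assert np.allclose(A @ v_old, v_new)
```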
Let us pass to covariant vectors. As I said above, a covariant vector is nothing but a linear map $\omega : V \to R$ ($R$ can be replaced by $C$ when dealing with complex vector spaces, or by the corresponding ring when considering modules). One easily proves that the set of real-valued linear maps as above forms a vector space, $V^*$, the so-called dual space of $V$. If $\{e_i\}_{i=1,\ldots,n}$ is a basis of $V$, there is an associated basis $\{e^{*i}\}_{i=1,\ldots,n}$ of $V^*$, the dual basis, defined by the requirements (in addition to linearity):
$$e^{*k}(e_i) = \delta^k_i\tag2$$
Therefore, a covariant vector $\omega \in V^*$ can always be decomposed as follows:
$$\omega = \sum_k \omega_k e^{*k}$$
and, using linearity, (2), and
$$v = \sum_i v^i e_i$$ one sees that
$$\omega(v) = \sum_k \omega_k v^k\:.$$
The RHS does not depend on the choice of the basis $\{e_i\}_{i=1,\ldots,n}$ and the corresponding $\{e^{*i}\}_{i=1,\ldots,n}$, even though the components of the covariant and contravariant vectors $\omega$ and $v$ do depend on the considered bases. Obviously, changing the basis in $V$ and passing to $\{e'_i\}_{i=1,\ldots,n}$, related to $\{e_i\}_{i=1,\ldots,n}$ through (1), the new basis corresponds to a dual basis $\{e'^{*i}\}_{i=1,\ldots,n}$. A straightforward computation based on (2) shows that
$$e^{*i} = \sum_j {B_j}^i e'^{*j}$$
where $$B= \left(A^T\right)^{-1}\:.\tag3$$
Consequently, for a covariant vector
$$\omega = \sum_i \omega_i e^{*i} = \sum_j \omega'_j e'^{*j}$$ where
$$\omega'_j = \sum_i{B_j}^i \omega_i\:.\tag4$$
This relation, together with (3) is nothing but the standard rule for transforming components of a given covariant vector when one changes the decomposition basis.
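The basis independence of the pairing $\omega(v)$ can be verified numerically as well. The following sketch (NumPy, reusing a hypothetical invertible matrix $A$ not taken from the text) applies rules (1), (3), and (4) and checks that $\sum_k \omega_k v^k$ is unchanged:

```python
import numpy as np

# Hypothetical invertible change-of-basis matrix: e_i = sum_j A^j_i e'_j.
A = np.array([[2.0, 1.0],
              [1.0, 1.0]])
B = np.linalg.inv(A.T)              # rule (3): B = (A^T)^{-1}

v = np.array([3.0, -1.0])           # contravariant components v^i
w = np.array([0.5, 2.0])            # covariant components omega_i

v_new = A @ v                       # rule (1): v'^j = sum_i A^j_i v^i
w_new = B @ w                       # rule (4): omega'_j = sum_i B_j^i omega_i

# The pairing omega(v) = sum_k omega_k v^k is basis independent,
# since (Bw).(Av) = w^T B^T A v = w^T A^{-1} A v = w^T v.
assert np.isclose(w @ v, w_new @ v_new)
```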
This structure rarely appears in classical physics, where one usually deals with orthonormal bases. The reason is that, when changing basis and passing to another orthonormal basis, the matrix $A$ relating the bases belongs to the orthogonal group, so that:
$$B= \left(A^T\right)^{-1} =A\:,$$
and one cannot distinguish, working in components, between covariant and contravariant vectors, since the transformation rules (1) and (4) are, in fact, identical. For instance, for a fixed force $F$ applied to a point with velocity $v$, the linear map associating the force with its power as a function of $v$ defines a covariant vector that we could indicate by "$F\cdot$":
$$\pi^{(F)}: v \mapsto F\cdot v$$
where $\cdot$ denotes the standard scalar product in the Euclidean rest space of a reference frame.
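As a minimal numerical sketch of this example (NumPy, with made-up force and velocity components), the covariant vector $\pi^{(F)}$ is just the linear map $v \mapsto F\cdot v$, and its linearity is the defining property:

```python
import numpy as np

# Hypothetical force components in an orthonormal basis (illustration only).
F = np.array([1.0, 0.0, -2.0])

def power(v):
    """The covariant vector pi^(F) acting on a velocity v: F . v."""
    return F @ v

v = np.array([3.0, 1.0, 0.5])
p = power(v)                        # instantaneous power F . v, here 2.0

# Linearity, the defining property of an element of V*:
u = np.array([0.0, 1.0, 1.0])
assert np.isclose(power(2.0 * v + u), 2.0 * power(v) + power(u))
```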
If the (real, finite-dimensional!) vector space $V$ is equipped with a (generally indefinite) scalar product, that is, a non-degenerate symmetric bilinear map $g : V \times V \to R$, a natural identification of $V$ and $V^*$ arises. It is nothing but the linear and bijective map associating contravariant vectors with covariant vectors:
$$V \ni v \mapsto g(v, \:\:)\in V^*$$
where, obviously, $g(v, \:\:) : V \ni u \mapsto g(v, u)\in R$ turns out to be linear and thus defines an element of $V^*$ as said.
In components, if $u= \sum_i u^i e_i$ and $s= \sum_i s^i e_i$, bilinearity of $g$ yields:
$$g(u,s) = \sum_{i,j} g_{ij} u^is^j\qquad \mbox{where}\quad g_{ij} := g(e_i,e_j)\:.$$
The matrix of elements $g_{ij}$ is symmetric and non-singular (as $g$ is symmetric and non-degenerate). With this definition, one easily sees that, if $u\in V$ is a contravariant vector, the associated covariant one $g(u,\:\:)\in V^*$ has components:
$$g(u, \:\:\:)_k= \sum_ig_{ki}u^i$$
so that the scalar product $g(u,v)$ of $u$ and $v$ can also be written:
$$g(u,v)= \sum_{ij} g_{ij}u^iv^j = \sum_i u_i v^i\:.$$
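The index-lowering map can be sketched numerically too (NumPy, with a hypothetical Minkowski-like metric chosen only for illustration): lowering the index of $u$ with $g$ and pairing with $v$ reproduces $g(u,v)$.

```python
import numpy as np

# Hypothetical indefinite metric g_ij = diag(-1, 1, 1) (symmetric,
# non-singular, hence non-degenerate).
g = np.diag([-1.0, 1.0, 1.0])

u = np.array([2.0, 1.0, 0.0])       # contravariant components u^i
u_low = g @ u                       # lowered components u_k = sum_i g_ki u^i

v = np.array([1.0, 3.0, -1.0])

# g(u, v) computed two equivalent ways:
assert np.isclose(u @ g @ v, u_low @ v)
```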
Finally, changing basis one has that:
$$g(u,s) = \sum_{i,j} g'_{lm} u'^ls'^m\qquad \mbox{where}\quad g'_{lm} := g(e'_l,e'_m)\:,$$
and
$$g'_{lm} = \sum_{ij}{B_l}^i {B_m}^j g_{ij}\:.$$
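In matrix form this transformation reads $g' = B\,g\,B^T$. A short numerical check (NumPy, again with a hypothetical invertible $A$ and a made-up indefinite metric) confirms that the scalar product is basis independent:

```python
import numpy as np

# Hypothetical change of basis and indefinite metric (illustration only).
A = np.array([[2.0, 1.0],
              [1.0, 1.0]])
B = np.linalg.inv(A.T)              # rule (3): B = (A^T)^{-1}
g = np.array([[1.0, 0.0],
              [0.0, -1.0]])

# g'_{lm} = sum_{ij} B_l^i B_m^j g_{ij}, i.e. g' = B g B^T:
g_new = B @ g @ B.T

# The scalar product g(u, s) does not depend on the basis:
u, s = np.array([1.0, 2.0]), np.array([3.0, -1.0])
u_new, s_new = A @ u, A @ s
assert np.isclose(u @ g @ s, u_new @ g_new @ s_new)
```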
Best Answer
This is all just a result of sloppy language on the part of people describing quantum mechanics. The state $$ \left\lvert \Psi \right\rangle = \frac{1}{\sqrt{2}} \left( \left\lvert \uparrow \right\rangle + \left\lvert \downarrow \right\rangle\right) \tag{1}$$ is a superposition of the two orthogonal states $\left\lvert \uparrow \right\rangle$ and $\left\lvert \downarrow \right\rangle$. The state is unlike either basis vector alone. A velocity vector $$\left\lvert v \right\rangle = a\left\lvert x \right\rangle + b\left\lvert y \right\rangle \tag{2}$$ for some values $a$ and $b$ is also a superposition of two orthogonal velocity vectors. It is unlike either basis vector alone.
Talking about $\left\lvert \Psi \right\rangle$ as "simultaneously in both states" is just plain sloppy. It's a superposition. It's not like either basis vector alone. It is, as you say, something completely distinct.
The reason for this disagreement in language comes from the fact that, in the end, quantum state vectors tell you the probabilities of experimental outcomes. It really bugs people to think of the state of a physical system as being fundamentally probabilistic. When it comes to measurement, the state $\left\lvert \Psi \right\rangle$ means that the system has a 1/2 probability of being measured spin up and a 1/2 probability of being measured spin down. People don't naturally think about the world around them in terms of superposition states whose coefficients correspond to probability amplitudes. They'd rather think about the classical states independently and try to form some notion of the system existing in combinations of classical states. Therefore, they naturally (but erroneously) say that the system is in both classical states at the same time, when really, as you said, the system is in a state that's completely different from either classical basis state.
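As a tiny numerical sketch (NumPy, representing the spin states as standard basis vectors with real amplitudes for simplicity), the probabilities encoded in state (1) are the squared magnitudes of its coefficients:

```python
import numpy as np

# Represent |up> and |down> as an orthonormal basis of R^2 (real
# amplitudes suffice for this state).
up, down = np.array([1.0, 0.0]), np.array([0.0, 1.0])
psi = (up + down) / np.sqrt(2)      # |Psi> = (|up> + |down>)/sqrt(2)

p_up = abs(up @ psi) ** 2           # |<up|Psi>|^2
p_down = abs(down @ psi) ** 2       # |<down|Psi>|^2, both approximately 1/2

# The state is normalized, so the probabilities sum to 1:
assert np.isclose(p_up + p_down, 1.0)
```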