Dirac’s definition of probability in quantum mechanics

hilbert-space, observables, operators, probability, quantum mechanics

I'm currently reading "The principles of quantum mechanics" by Dirac, and I'm having some trouble understanding some of his assumptions, because in the quantum mechanics course I'm following at university we are taking a different approach (a more modern one I guess).

I got to the point where Dirac explains why he decides to define the average value of the observable $\xi$ in the state $|x\rangle$ as $\langle x|\xi|x\rangle$.
What I don't get is the following part, where he writes that the probability $P_a$ of $\xi$ having the value $a$ is given by the average value of the Kronecker delta $\delta_{\xi a}$:

$$P_a=\langle x|\delta_{\xi a}|x\rangle$$

I don't understand if he's deciding to define the probability of $\xi$ having the value $a$ in this way, or if he's deducing this formula.

In the following similar question, "Why does the probability of obtaining a value of a measurement follow from Dirac's general assumption?", the user By Symmetry writes:

The Projection Operator $\delta_{\varepsilon\, a}$ is $1$ for some particular $|a\rangle$ and is $0$ on any orthogonal state. The definition of the expectation in classical probability theory for this operator is given by \begin{equation} \left\langle \delta_{\varepsilon\, a}\right\rangle = 1\Pr(|x\rangle = |a\rangle) + 0\Pr(|x\rangle \perp |a\rangle) = \Pr(|x\rangle = |a\rangle)\end{equation}

So, from this answer it seems to me that he's saying that $P_a=\langle x|\delta_{\xi a}|x\rangle$ by definition, but from Dirac's book I have the impression that he's deducing it from previous assumptions.

So my question is: is the formula for the probability of $\xi$ having value $a$ in the state $|x\rangle$ just a definition or can it be deduced by previous assumptions in Dirac's book? If he's deducing it, how is he doing it?

Best Answer

I'm currently reading "The principles of quantum mechanics" by Dirac...

...What I don't get is the following part, where he writes:

In the general case we cannot speak of an observable having a value for a particular state, but we can speak of its having an average value for the state. We can go further and speak of the probability of its having any specified value for the state, meaning the probability of this specified value being obtained when one makes a measurement of the observable. This probability can be obtained from the general assumption in the following way. Let the observable be $\xi$ and let the state correspond to the normalized ket $|x\rangle$. Then the general assumption tells us, not only that the average value of $\xi$ is $\langle x|\xi|x\rangle$, but also that the average value of any function of $\xi$, $f(\xi)$ say, is $\langle x|f(\xi)|x\rangle$. Take $f(\xi)$ to be that function of $\xi$ which is equal to unity when $\xi =a$, $a$ being some real number, and zero otherwise. This function of $\xi$ has a meaning according to our general theory of functions of an observable, and it may be denoted by $\delta_{\xi a}$ in conformity with the general notation of the symbol $\delta$ with two suffixes given on p. 62 (equation (17)). The average value of this function of $\xi$ is just the probability, $P_a$ say, of $\xi$ having the value $a$. Thus $P_a=\langle x|\delta_{\xi a}|x\rangle$. If $a$ is not an eigenvalue of $\xi$, $\delta_{\xi a}$ multiplied into any eigenket of $\xi$ is zero, and hence $\delta_{\xi a} = 0$ and $P_a = 0$. This agrees with a conclusion of §10, that any result of a measurement of an observable must be one of its eigenvalues.

Based on his Section 16 at Equation 16, Dirac is using the $\delta$ symbol with two suffixes (like $\delta_{ab}$ or $\delta_{\xi a}$) to denote a Kronecker delta, not a Dirac delta.

Next, note that Dirac denotes "the normalized ket" by the symbol $|x\rangle$. He seems to sometimes use $|x\rangle$ in his text for kets with a continuous label, but he also refers to this ket as "the normalized ket," so in this case he may be using the notation differently.

Anyways, instead of calling the normalized ket "$|x\rangle$," I will call it "$|\Psi\rangle$."

And, instead of calling the observable "$\xi$," I will call it "$\hat A$."

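Let $\hat A$ have orthonormal eigenvectors $\{|a\rangle\}$, each labeled by its eigenvalue $a$ (taken to be discrete and non-degenerate for simplicity), so that $\hat A|a\rangle = a|a\rangle$ and $\langle b|a\rangle = \delta_{ab}$.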

Let the expansion of $|\Psi\rangle$ in terms of the $|a\rangle$ be: $$ |\Psi\rangle = \sum_{a}\Psi_a |a\rangle\;. $$

The expectation value of $\hat A$ in the state $|\Psi\rangle$ is: $$ \langle\Psi|\hat A|\Psi\rangle = \sum_{a,b}\Psi_a\Psi_b^*\langle b|\hat A|a\rangle = \sum_{a}|\Psi_a|^2 a $$

The expectation value of $f(\hat A)$ in the state $|\Psi\rangle$ is: $$ \langle\Psi|f(\hat A)|\Psi\rangle =\sum_{a,b}\Psi_a\Psi_b^*\langle b|f(\hat A)|a\rangle = \sum_{a}|\Psi_a|^2 f(a) $$
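The middle step can be spelled out as follows. This is a sketch that assumes the $|a\rangle$ are orthonormal and that $f(\hat A)$ acts on eigenkets as $f(\hat A)|a\rangle = f(a)|a\rangle$, which is how I understand Dirac's definition of a function of an observable: $$ \langle b|f(\hat A)|a\rangle = f(a)\langle b|a\rangle = f(a)\,\delta_{ab} \quad\Longrightarrow\quad \sum_{a,b}\Psi_a\Psi_b^*\, f(a)\,\delta_{ab} = \sum_{a}|\Psi_a|^2 f(a)\;. $$ Setting $f(a)=a$ recovers the expression for $\langle\Psi|\hat A|\Psi\rangle$ given just before.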

The expectation value of the function $$ f_{a_0}(a) \equiv \delta_{aa_0} = \begin{cases} 1 & a=a_0\\ 0 & \text{else}\end{cases} $$ is $$ \sum_a |\Psi_a|^2 \delta_{aa_0} = |\Psi_{a_0}|^2\;, $$ which, as usual, is the probability to measure $a_0$ when the state is $|\Psi\rangle$.

If the $a$ are continuous, $|\Psi(a)|^2$ is a probability density rather than a probability.
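For a discrete spectrum, the result above can be checked numerically. The following is a minimal sketch in plain Python, not anything from Dirac's book: the three eigenvalues, the amplitudes, and the helper names `expectation` and `function_of_operator` are all made up for illustration, and the spectrum is assumed non-degenerate. Everything is written in the eigenbasis of $\hat A$, where $f(\hat A)$ is diagonal with entries $f(a)$.

```python
def expectation(matrix, psi):
    """Return <psi|M|psi> = sum over a, b of conj(psi_b) * M[b][a] * psi_a."""
    n = len(psi)
    return sum(psi[b].conjugate() * matrix[b][a] * psi[a]
               for a in range(n) for b in range(n))


def function_of_operator(f, eigenvalues):
    """Matrix of f(A) in the eigenbasis of A: diagonal with entries f(a)."""
    n = len(eigenvalues)
    return [[f(eigenvalues[a]) if a == b else 0.0 for a in range(n)]
            for b in range(n)]


# Hypothetical example data (assumptions, chosen only for illustration).
eigenvalues = [-1.0, 0.5, 2.0]    # eigenvalues a of A
psi = [0.5, 0.5j, 2 ** -0.5]      # amplitudes Psi_a; squared moduli sum to 1

# Kronecker-delta function of A, picking out the eigenvalue a0.
a0 = 0.5
delta = function_of_operator(lambda a: 1.0 if a == a0 else 0.0, eigenvalues)
p_a0 = expectation(delta, psi)

# Compare <Psi|delta_{A a0}|Psi> with |Psi_{a0}|^2.
print(p_a0.real, abs(psi[eigenvalues.index(a0)]) ** 2)

# If a0 is not an eigenvalue, f(A) is the zero matrix and the probability is 0.
not_eig = function_of_operator(lambda a: 1.0 if a == 3.0 else 0.0, eigenvalues)
print(abs(expectation(not_eig, psi)))
```

The double loop in `expectation` mirrors the double sum $\sum_{a,b}$ in the formulas above; because $f(\hat A)$ is diagonal in this basis, only the $a=b$ terms contribute.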


Update:

To spell out the above explanation again, maybe a bit more clearly:

  1. Dirac introduces the expectation value of $\hat A$, which is written as $\langle \Psi|\hat A|\Psi\rangle$. He does not perform the expansion in terms of eigenstates of $\hat A$, but we did so above and we see that $\langle \Psi|\hat A|\Psi\rangle = \sum_{a}|\Psi_a|^2 a$.
  2. [Dirac may have noted it previously, but I note here that $0\le|\Psi_a|^2\le 1$ and $\sum_a|\Psi_a|^2=1$, which are also properties that we desire of a set of probability values.]
  3. Dirac further notes that the expectation value of any function $f(\hat A)$ is $\langle \Psi|f(\hat A)|\Psi\rangle$, which, again, he does not expand, but we expanded to show that it is $\sum_a|\Psi_a|^2 f(a)$.
  4. Dirac considers a special function $f_{a_0}(a)$ that is $0$ for all the $a$ values except for one of them $a_0$ (where it is $1$) and he asserts (correctly, see below) that the expectation value of such a function is the probability that $\hat A$ takes on the value $a_0$. He does not show this explicitly, but we have shown explicitly above that $\langle \Psi|f_{a_0}(\hat A)|\Psi\rangle = |\Psi_{a_0}|^2$.

Regarding why the expectation value of a function $f_{a_0}(a)$ that is zero for all $a$ other than $a_0$ and is $1$ at $a_0$ is the probability of $a_0$:

Consider the general definition of an expectation value (not specific to quantum mechanics, but completely general) for a given discrete set of probability values $P(a)$: $$ \langle f \rangle = \sum_a P(a) f(a)\;. $$ Clearly $$ \langle f_{a_0}\rangle = P(a_0)\;. $$
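The same statement can be illustrated with a purely classical distribution. This is a small sketch in which the distribution `P` and the helper names `expect` and `indicator` are hypothetical, chosen only to show that the mean of an indicator function is the probability of the value it singles out.

```python
from fractions import Fraction

# Hypothetical discrete distribution P(a) over three outcomes (an assumption).
P = {1: Fraction(1, 2), 2: Fraction(1, 3), 3: Fraction(1, 6)}


def expect(f, P):
    """Classical expectation value <f> = sum over a of P(a) * f(a)."""
    return sum(p * f(a) for a, p in P.items())


def indicator(a0):
    """Return f_{a0}: equal to 1 at a == a0 and 0 everywhere else."""
    return lambda a: 1 if a == a0 else 0


mean = expect(lambda a: a, P)      # ordinary mean of the outcomes
prob_2 = expect(indicator(2), P)   # mean of the indicator for a0 = 2
prob_7 = expect(indicator(7), P)   # a0 that is not a possible outcome

print(mean, prob_2, prob_7)
```

Here `prob_2` equals `P[2]` exactly, and `prob_7` is zero because 7 never occurs, which is the classical counterpart of Dirac's remark that $P_a$ vanishes when $a$ is not an eigenvalue.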
