[Math] Failure probability formula

pr.probability, reference-request

On page 21 of *A Problem Seminar*, D. J. Newman presents a novel (at least to me) way to determine the expectation of a discrete random variable. He refers to this expression as the failure probability formula. His formula reads

$f_{0}+f_{1}+f_{2}+\ldots$

where $f_{n}$ is the probability that the experiment fails to produce the desired outcome during the first $n$ steps.
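As a concrete sanity check (my own example, not from Newman's text): for a geometric random variable where each trial succeeds with probability $p$, we have $f_n = (1-p)^n$, and the formula should recover the known expectation $1/p$. A minimal numerical sketch:

```python
# Numerical check of the failure-probability formula for a geometric
# random variable: each trial succeeds with probability p, so the
# probability of failing on all of the first n trials is f_n = (1-p)^n.
# The expected number of trials is known to be 1/p.

def expectation_via_failures(p, terms=10_000):
    """Sum f_0 + f_1 + f_2 + ..., truncated after `terms` terms."""
    return sum((1 - p) ** n for n in range(terms))

p = 0.25
print(expectation_via_failures(p))  # close to 1/p = 4.0
```

The truncation is harmless here because the tail $(1-p)^n$ decays geometrically.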

He goes on to outline two ways to establish the validity of this formula, but I'm having a hard time unraveling his second proof. I'll reproduce it here so that my inquiry is self-contained:

Since the expected amount of time spent on the $n$th trial is equal to the probability that the $n$th trial occurs, and since this equals $f_{n-1}$, we obtain the net expectation $f_{0}+f_{1}+f_{2}+\ldots$ as asserted.

Can anyone explain this Newmanian argument in greater detail? I'd also like to know whether the name failure probability formula is a standard one in the mainstream probability literature.

Thank you very much for your continued support.

Best Answer

Well, this is certainly a known idea, but I suspect it's not important enough to have its own name. For example, I believe it is used without special mention in Hammersley's "A Few Seedlings of Research" (Proc. 6th Berkeley Symp. on Math. Stat. and Prob., 1971), which has as its target audience a graduate student just beginning research in mathematical statistics.

Here's one way of thinking about the argument: imagine the countably infinite sequence of experiments all lined up and ready to go; each experiment runs if the preceding one runs but is unsuccessful. The expected number of steps is the same as the expected number of these experiments that actually run. Thus, the expected number of steps is the sum over $n$ of the probability that the $n$th experiment runs. But this probability is precisely $f_{n - 1}$ (the probability that all the experiments before it fail).


An alternative perspective: let $p_n$ be the probability that it requires exactly $n$ steps. Then summing $$\begin{array}{ccccc} p_1 & & & & \\\\ p_2 & p_2 & & & \\\\ p_3 & p_3 & p_3 & & \\\\ p_4 & p_4 & p_4 & p_4 & \\\\ \vdots & \vdots & \vdots & \vdots & \ddots \end{array}$$ first by rows gives the "usual" expression for expected value while summing first by columns gives this expression you're dealing with. (If everything converges then it converges absolutely, so no need to worry about rearranging.)
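The row-sum/column-sum equality can be verified numerically. Below is a sketch using, as an illustrative choice, the geometric distribution with success probability $p$, where $p_n = (1-p)^{n-1}p$ and the $k$th column sums to $f_{k-1} = P(N > k-1)$:

```python
# Summing the triangular array of p_n's two ways, for a geometric
# distribution with success probability p (an illustrative example).

def by_rows(p, terms=10_000):
    # Row n contributes n copies of p_n: the usual sum over n of n * p_n.
    return sum(n * (1 - p) ** (n - 1) * p for n in range(1, terms))

def by_columns(p, terms=10_000):
    # Column k sums to P(N > k) = f_k = (1 - p)^k.
    return sum((1 - p) ** k for k in range(terms))

p = 0.3
print(by_rows(p), by_columns(p))  # both approach 1/p
```

Both truncated sums agree to high precision, as the absolute convergence in the answer guarantees they must.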

Basically the same phenomenon is going on in the continuous version Michael Hardy mentions -- in that case we are calculating the area of one region in the $xy$-plane in two different ways, integrating with respect to $x$ in one and with respect to $y$ in the other.
