Summation – Proving a Harmonic Number Identity

binomial-coefficientscoupon-collectorharmonic-numbersinclusion-exclusionsummation

I've been trying to prove

$$\sum_{k=1}^{n}{(-1)^{k+1} {{n}\choose{k}}\frac{1}{k}=H_n}$$

I've tried perturbation and inversion but still nothing. I've even tried expanding the sum to try and find some pattern that could relate this to the harmonic series but this just seems surreal. I can't find any logical reasoning behind this identity. I don't understand Wikipedia's proof. Is there any proof that doesn't require the integral transform?

Best Answer

In the coupon collector's problem, the expected value of the number $N$ of coupons required for a full collection of $m$ coupon types can be determined in two different ways. In the standard treatment, the expected numbers of coupons to get each new type are added up, yielding the result $mH_m$. But we can also determine this expectation more mundanely from the distribution:

\begin{align} \def\stir#1#2{\left\{{#1\atop #2}\right\}} E[N]={}&\sum_{n=0}^\infty P(N\gt n)\\ ={}&\sum_{n=0}^\infty\left(1-\frac{m!}{m^n}\stir nm\right)\\ ={}&\sum_{n=0}^\infty\left(1-\frac{m!}{m^n}\frac1{m!}\sum_{j=0}^m(-1)^{m-j}\binom mjj^n\right)\\ ={}&\sum_{n=0}^\infty\frac1{m^n}\sum_{j=0}^{m-1}(-1)^{m-j+1}\binom mjj^n\\ ={}&\sum_{j=0}^{m-1}(-1)^{m-j+1}\binom mj\sum_{n=0}^\infty\frac{j^n}{m^n}\\ ={}&\sum_{j=1}^m(-1)^{j+1}\binom mj\sum_{n=0}^\infty\frac{(m-j)^n}{m^n}\\ ={}&\sum_{j=1}^m(-1)^{j+1}\binom mj\frac mj\\ ={}&m\sum_{j=1}^m(-1)^{j+1}\binom mj\frac1j\;. \end{align}

(Here $\stir nm$ is a Stirling number of the second kind.) Equating the two results yields our identity.

Perhaps the derivation appears slightly less opaque if we avoid the detour through the Stirling numbers and directly apply inclusion-exclusion. There are $\binom mj$ ways to select $j$ coupon types that haven't been collected yet, and the probability that $j$ particular coupon types haven't been collected after $n$ draws is $\left(1-\frac jm\right)^n$. Thus

\begin{align} E[N]={}&\sum_{n=0}^\infty P(N\gt n)\\ ={}&\sum_{n=0}^\infty\sum_{j=1}^m(-1)^{j+1}\binom mj\left(1-\frac jm\right)^n\\ ={}&\sum_{j=1}^m(-1)^{j+1}\binom mj\sum_{n=0}^\infty\left(1-\frac jm\right)^n\\ ={}&\sum_{j=1}^m(-1)^{j+1}\binom mj\frac mj\\ ={}&m\sum_{j=1}^m(-1)^{j+1}\binom mj\frac1j\;. \end{align}

We can generalise this to obtain an expression for $H_m-H_k$. The standard treatment shows that the expected value of the number $N_k$ of coupons required until only $k$ coupon types are missing is $m(H_m-H_k)$. Applying inclusion-exclusion in the form of corollary $(7)$ in this answer, this expected value can also be calculated as

\begin{align} E[N_k] &= \sum_{n=0}^\infty P(N_k\gt n) \\ &= \sum_{n=0}^\infty\sum_{j=1}^m(-1)^{j+k+1}\binom{j-1}{k}\binom mj\left(1-\frac jm\right)^n \\ &= \sum_{j=1}^m(-1)^{j+k+1}\binom{j-1}{k}\binom mj\sum_{n=0}^\infty\left(1-\frac jm\right)^n \\ &= \sum_{j=1}^m(-1)^{j+k+1}\binom{j-1}{k}\binom mj\frac mj \\ &= m\sum_{j=1}^m(-1)^{j+k+1}\binom{j-1}{k}\binom mj\frac1j\;, \end{align}

yielding

$$ H_m-H_k=\sum_{j=1}^m(-1)^{j+k+1}\binom{j-1}{k}\binom mj\frac1j\;. $$

Related Question