I’ve clicked XKCD’s “random” button k times and I’ve already seen all of them. What’s the expected number of XKCD’s I’ve seen

balls-in-binsbayesiancoupon-collectorexpected valueprobability

This seems like a modification of the coupon collector's problem which can be stated as follows:

There are $n$ coupons total to collect. Given that the past $k$ coupons seen I've already collected (coupons are collected with replacement), what's the expected number of coupon's I've collected so far?

I'm also unsure if this problem depends on an underlying prior probability distribution on the number of coupons I've collected; if it does, can this be solved with an arbitrary distribution?

One additional part to this question: let's say after the $k$th one I collect a new coupon; what would the expected number of coupon's I've collected be then?

Best Answer

If you've seen $n$ comics out of $N$, the probability that $k$ consecutive comics are all ones you've seen is $(\frac nN)^k$, which is proportional to $n^k$ (the $N^k$ denominator is constant). So if the prior odds are uniform, the posterior odds that you've seen $1, 2, \dots, N$ comics are $$ 1^k : 2^k : 3^k : \cdots : N^k. $$ So the conditional probability that you've seen $n$ comics out of $N$ is $\frac{n^k}{1^k + 2^k + \cdots + N^k}$, and we get an expected value of $$ \frac{1 \cdot 1^k + 2 \cdot 2^k + \dots + N \cdot N^k}{1^k + 2^k + \cdots + N^k}. $$ This doesn't simplify particularly well, but when $N$ is large compared to $k$, we may approximate the sum by an integral; the numerator is approximately $ \int_0^N x^{k+1}\,dx = \frac{N^{k+2}}{k+2}$ and the denominator is approximately $\int_0^N x^k \,dx = \frac{N^{k+1}}{k+1}$.

So the expected total number of comics you've read, given that you've sampled $k$ comics and you've read all of them, is approximately $\frac{k+1}{k+2} N$.