There are four types of coupons (call them types $1,2,3,4$) with $p_1 = p_2 = 1/8$ and $p_3 = p_4 = 3/8$. You collect coupons until you have a full set.
Q1: What is the probability that after disregarding duplicates, the order of coupons that you collect is $3241$? For example, let's say it takes you $10$ tries to get a full set and you obtain $3333324421$ which means the order of coupons collected was $3241$.
Q2: Given that the order you observe will be $3241$ (with possible duplicates), what is the expected number of coupons needed to collect a full set?
Q3: Let $X$ be the number of coupons needed to collect a full set. Is it true that $E[X] = E[X \mid 3241]P(3241) + E[X \mid 1234]P(1234) +…$ where you condition $X$ on all $4!$ possible ways that you will observe the order of new coupons that appear?
My attempt
Q1: $P(3241) = P(3)P(2 \mid 3)P(4 \mid 32)P(1 \mid 324) = \frac{3}{8}\frac{1}{5}\frac{3}{4}\frac{1}{1} = 0.05625$
Q2: $E[X \mid 3241] = 1+\frac{3+1}{1}+\frac{3+1+3}{3}+\frac{1+3+3+1}{1} = 15.33$ which is leveraging the EV of a geometric RV.
Q3: I believe yes… but when I put it in python I don't get the right answer. It could be because of my crappy scripting skills. Above in my attempt I solved for the case $3241$ and my output below matches it.
(1, 2, 3, 4) Condl EV: 7.33 | Prob: 0.0089 | EV*Prob: 0.06 | Cum Sum: 0.06
(1, 2, 4, 3) Condl EV: 7.33 | Prob: 0.0089 | EV*Prob: 0.06 | Cum Sum: 0.13
(1, 3, 2, 4) Condl EV: 10.0 | Prob: 0.0133 | EV*Prob: 0.13 | Cum Sum: 0.26
(1, 3, 4, 2) Condl EV: 12.6 | Prob: 0.0401 | EV*Prob: 0.50 | Cum Sum: 0.77
(1, 4, 2, 3) Condl EV: 10.0 | Prob: 0.0133 | EV*Prob: 0.13 | Cum Sum: 0.90
(1, 4, 3, 2) Condl EV: 12.6 | Prob: 0.0401 | EV*Prob: 0.50 | Cum Sum: 1.41
(2, 1, 3, 4) Condl EV: 7.33 | Prob: 0.0089 | EV*Prob: 0.06 | Cum Sum: 1.48
(2, 1, 4, 3) Condl EV: 7.33 | Prob: 0.0089 | EV*Prob: 0.06 | Cum Sum: 1.54
(2, 3, 1, 4) Condl EV: 10.0 | Prob: 0.0133 | EV*Prob: 0.13 | Cum Sum: 1.68
(2, 3, 4, 1) Condl EV: 12.6 | Prob: 0.0401 | EV*Prob: 0.50 | Cum Sum: 2.19
(2, 4, 1, 3) Condl EV: 10.0 | Prob: 0.0133 | EV*Prob: 0.13 | Cum Sum: 2.32
(2, 4, 3, 1) Condl EV: 12.6 | Prob: 0.0401 | EV*Prob: 0.50 | Cum Sum: 2.83
(3, 1, 2, 4) Condl EV: 12.6 | Prob: 0.0187 | EV*Prob: 0.23 | Cum Sum: 3.07
(3, 1, 4, 2) Condl EV: 15.3 | Prob: 0.0562 | EV*Prob: 0.86 | Cum Sum: 3.93
(3, 2, 1, 4) Condl EV: 12.6 | Prob: 0.0187 | EV*Prob: 0.23 | Cum Sum: 4.17
(3, 2, 4, 1) Condl EV: 15.3 | Prob: 0.0562 | EV*Prob: 0.86 | Cum Sum: 5.03
(3, 4, 1, 2) Condl EV: 18.0 | Prob: 0.1124 | EV*Prob: 2.02 | Cum Sum: 7.05
(3, 4, 2, 1) Condl EV: 18.0 | Prob: 0.1124 | EV*Prob: 2.02 | Cum Sum: 9.08
(4, 1, 2, 3) Condl EV: 12.6 | Prob: 0.0187 | EV*Prob: 0.23 | Cum Sum: 9.32
(4, 1, 3, 2) Condl EV: 15.3 | Prob: 0.0562 | EV*Prob: 0.86 | Cum Sum: 10.1
(4, 2, 1, 3) Condl EV: 12.6 | Prob: 0.0187 | EV*Prob: 0.23 | Cum Sum: 10.4
(4, 2, 3, 1) Condl EV: 15.3 | Prob: 0.0562 | EV*Prob: 0.86 | Cum Sum: 11.2
(4, 3, 1, 2) Condl EV: 18.0 | Prob: 0.1124 | EV*Prob: 2.02 | Cum Sum: 13.3
(4, 3, 2, 1) Condl EV: 18.0 | Prob: 0.1124 | EV*Prob: 2.02 | Cum Sum: 15.3
I get $15.3$ but the true answer is $437/35 \approx 12.48$.
Best Answer
Your calculation of the conditional expectations is wrong – though I must admit it seemed plausible to me at first and I had to write a simulation to pinpoint the mistake.
To see why it can't be right, consider three types of coupons with probabilities $p_1\gg p_2\gg p_3$ arriving in the order $231$. The way you're calculating the conditional expectation for the $3$ to appear, you're ignoring coupons of type $1$ and assuming that as long as you know that $3$ is next to appear, you can treat the process as if only types $2$ and $3$ existed, leading to the expectation $\frac{p_2+p_3}{p_3}$. But that's a large number, whereas if $3$ comes before $1$ it's most likely to come right away, so the expectation should be close to $1$.
Once $2$ has appeared, the expected time to see the next new type is $\frac1{1-p_2}=\frac1{p_1+p_3}$, and this is independent of whether that next new type happens to be $1$ or $3$: If you calculate the conditional expectation the pedestrian way as
$$ \frac{\sum_nnp_2^np_3}{\sum_np_2^np_3}\;, $$
$p_3$ drops out. So
$$ \mathsf E[X\mid3241]=1+\frac85+2+8=12.6\;, $$
and if you do them all like that, the answer comes out right.