Poisson Process – Joint Distribution and estimation of $p_i$

poisson-process, probability, statistics, stochastic-processes

Two copy editors read a $300$-page manuscript. The first found $100$ typos, the second found $120$, and their lists contain $80$ errors in common. Suppose that the author's typos follow a Poisson process with some unknown rate $\lambda$ per page, while the two copy editors catch errors with unknown probabilities of success $p_1$ and $p_2$. Let $X_0$ be the number of typos that neither found. Let $X_1$ and $X_2$ be the number of typos found only by $1$ or only by $2$, and let $X_3$ be the number of typos found by both.

($a$) Find the joint distribution of $(X_0, X_1, X_2, X_3)$, expressed in terms of $\lambda$, $p_1$, $p_2$.

($b$) Use the answer to ($a$) to find estimates of $p_1, p_2$ and then of the number of undiscovered typos. (Hint: Let $N(s)$ be the Poisson process, with $T_i$ being the "time" when the $i$-th typo occurs, where the "time" variable is the page number.)

Here's the thing…

Well, for ($a$) I've got the following. Each error is found by neither editor, by editor $1$ only, by editor $2$ only, or by both editors with probabilities $(1-p_1)(1-p_2)$, $p_1(1-p_2)$, $(1-p_1)p_2$ and $p_1p_2$, respectively. Writing $x=x_0+x_1+x_2+x_3$,

$\Pr[(X_0,X_1,X_2,X_3)=(x_0,x_1,x_2,x_3)]=\Pr[N=x]f(x_0,x_1,x_2,x_3)$

where $N$ is the total number of typos in the manuscript, with $\Pr[N=x]=(300\lambda)^x\exp(-300\lambda)/x!$

and

$f(x_0,x_1,x_2,x_3) ={x\choose x_0,x_1,x_2,x_3}p_1^{x_1+x_3}p_2^{x_2+x_3}(1-p_1)^{x_0+x_2}(1-p_2)^{x_0+x_1}$

is the mass function of the multinomial distribution.
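
As a sanity check on this formula, here is a minimal Python sketch (the function names and the test values are my own, chosen only for illustration). It evaluates the Poisson-times-multinomial expression above and compares it with a product of four independent Poisson pmfs with means $300\lambda$ times the four cell probabilities; the two are algebraically identical because the cell probabilities sum to $1$.

```python
import math

def poisson_pmf(k, mean):
    """P(K = k) for K ~ Poisson(mean)."""
    return math.exp(-mean) * mean**k / math.factorial(k)

def joint_pmf_multinomial(xs, lam, p1, p2, pages=300):
    """Joint pmf written as Poisson total times multinomial split."""
    x0, x1, x2, x3 = xs
    x = x0 + x1 + x2 + x3
    # Multinomial coefficient x! / (x0! x1! x2! x3!), exact in integers.
    coef = math.factorial(x) // (
        math.factorial(x0) * math.factorial(x1)
        * math.factorial(x2) * math.factorial(x3)
    )
    split = (
        p1 ** (x1 + x3) * p2 ** (x2 + x3)
        * (1 - p1) ** (x0 + x2) * (1 - p2) ** (x0 + x1)
    )
    return poisson_pmf(x, pages * lam) * coef * split

def joint_pmf_independent(xs, lam, p1, p2, pages=300):
    """The same pmf written as a product of four independent Poisson pmfs."""
    mean = pages * lam
    cell_probs = [(1 - p1) * (1 - p2), p1 * (1 - p2), (1 - p1) * p2, p1 * p2]
    return math.prod(poisson_pmf(x, mean * q) for x, q in zip(xs, cell_probs))

# Illustrative values only (not the data from the problem).
xs = (1, 2, 3, 4)
a = joint_pmf_multinomial(xs, lam=0.03, p1=0.6, p2=0.7)
b = joint_pmf_independent(xs, lam=0.03, p1=0.6, p2=0.7)
print(a, b)  # the two values should agree up to floating-point rounding
```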

But I'm not sure how to obtain the values for ($b$). Can someone help me?
While searching, I found the following answer:

For (b), we are given that $X_1=20,X_2=40,X_3=80$. It is reasonable to guess that $p_1p_2=2(1-p_1)p_2=4p_1(1-p_2)$, which gives $p_2=4/5$, $p_1=2/3$. Then, $(1-p_1)(1-p_2)=1/15$, so a reasonable estimate for $X_0$ is $80/8=10$.
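
To make the arithmetic in that quoted answer concrete, here is a small sketch using exact fractions. It assumes, as the quoted answer does, that the observed counts are proportional to the cell probabilities; the variable names are mine.

```python
from fractions import Fraction

x1, x2, x3 = 20, 40, 80  # found only by editor 1, only by editor 2, by both

# p1*p2 : (1-p1)*p2 = x3 : x2  gives  p1/(1-p1) = x3/x2, so p1 = x3/(x2+x3).
p1 = Fraction(x3, x2 + x3)
# p1*p2 : p1*(1-p2) = x3 : x1  gives  p2/(1-p2) = x3/x1, so p2 = x3/(x1+x3).
p2 = Fraction(x3, x1 + x3)

missed_prob = (1 - p1) * (1 - p2)  # probability a typo escapes both editors
both_prob = p1 * p2                # probability a typo is caught by both

# Scale the "both" count by the ratio of the two cell probabilities.
x0_estimate = x3 * missed_prob / both_prob

print(p1, p2, missed_prob, x0_estimate)  # 2/3 4/5 1/15 10
```

Note that the last step simplifies to $x_1x_2/x_3=20\cdot 40/80=10$, since $(1-p_1)(1-p_2)\cdot p_1p_2 = p_1(1-p_2)\cdot(1-p_1)p_2$.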

But why?

Best Answer

Another way to obtain your joint distribution is to recognize that we're dealing with a split Poisson process, so that $X_0,X_1,X_2,X_3$ are independent with distributions

$$X_0 \sim \text{Poisson}\Big(300\lambda(1-p_1)(1-p_2)\Big)$$

$$X_1 \sim \text{Poisson}\Big(300\lambda p_1(1-p_2)\Big)$$

$$X_2 \sim \text{Poisson}\Big(300\lambda(1-p_1)p_2\Big)$$

$$X_3 \sim \text{Poisson}\Big(300\lambda p_1 p_2\Big)$$

So the joint pmf of $(X_0,X_1,X_2,X_3)$, which we'll denote by $p_{X_0X_1X_2X_3}$, can be established using independence and without needing to condition on $X_0+X_1+X_2+X_3$:

$$p_{X_0X_1X_2X_3}(x_0,x_1,x_2,x_3)=P(X_0=x_0)P(X_1=x_1)P(X_2=x_2)P(X_3=x_3)$$

Next, since $X_0$ is not observed, note that the joint pmf of $(X_1,X_2,X_3)$ simply equals

$$p_{X_1X_2X_3}(x_1,x_2,x_3)=P(X_1=x_1)P(X_2=x_2)P(X_3=x_3)$$

and so

$$p_{X_1X_2X_3}(20,40,80)\propto \lambda^{140}p_1^{100}p_2^{120}(1-p_1)^{40}(1-p_2)^{20}e^{-300\lambda(p_1-p_1p_2+p_2)}$$

A natural way to estimate $p_1$, $p_2$, and $\lambda$ is to find the parameters that maximize $p_{X_1X_2X_3}(20,40,80)$ on the domain

$$(\lambda,p_1,p_2)\in [0,\infty) \times [0,1]\times [0,1]$$

We find that $p_{X_1X_2X_3}(20,40,80)$ is maximized when

$$(\lambda,p_1,p_2)=\Big(\frac{1}{2},\frac{2}{3},\frac{4}{5}\Big)$$

I think this is how the problem is intended to be solved, as it makes explicit use of the joint distribution.

The technique you and @BruceET use to solve this problem is the method of moments, which requires that you solve the system of equations

$$E(X_1)=20$$

$$E(X_2)=40$$

$$E(X_3)=80$$

This becomes

$$300\lambda p_1 (1-p_2)=20$$

$$300 \lambda (1-p_1)p_2=40$$

$$300 \lambda p_1 p_2=80$$

The solution to this system is $(\lambda,p_1,p_2)=\Big(\frac{1}{2},\frac{2}{3},\frac{4}{5}\Big)$, which is the same estimate we obtained by using the joint distribution explicitly. Plugging these values into $E(X_0)=300\lambda(1-p_1)(1-p_2)$ gives $150\cdot\frac{1}{15}=10$ as the estimated number of undiscovered typos.
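
If you'd like to check the maximization numerically, here is a rough sketch (my own helper names, standard library only). It computes the estimates in closed form from the three moment equations, then does a crude local check that nudging any parameter by $\pm 0.01$ lowers the log-likelihood of the observed $(X_1,X_2,X_3)$. This is a spot check around the candidate point, not a proof of global optimality.

```python
import math

PAGES = 300
x1, x2, x3 = 20, 40, 80  # only editor 1, only editor 2, both

def log_likelihood(lam, p1, p2):
    """Log of the joint pmf of (X1, X2, X3) at the observed counts."""
    means = [
        PAGES * lam * p1 * (1 - p2),
        PAGES * lam * (1 - p1) * p2,
        PAGES * lam * p1 * p2,
    ]
    # Sum of independent Poisson log-pmfs: x*log(m) - m - log(x!).
    return sum(
        x * math.log(m) - m - math.lgamma(x + 1)
        for x, m in zip((x1, x2, x3), means)
    )

# Closed-form solution of the three moment equations.
p1_hat = x3 / (x2 + x3)
p2_hat = x3 / (x1 + x3)
lam_hat = (x1 + x3) * (x2 + x3) / (x3 * PAGES)

# Plug-in estimate of the number of typos missed by both editors.
x0_hat = PAGES * lam_hat * (1 - p1_hat) * (1 - p2_hat)

# Crude local check: every nearby grid point should have a lower value.
best = log_likelihood(lam_hat, p1_hat, p2_hat)
steps = (-0.01, 0.0, 0.01)
neighbours = [
    log_likelihood(lam_hat + a, p1_hat + b, p2_hat + c)
    for a in steps for b in steps for c in steps
    if (a, b, c) != (0.0, 0.0, 0.0)
]

print(lam_hat, p1_hat, p2_hat, x0_hat)
print(all(v < best for v in neighbours))
```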