The distribution of $X\vert Y$ where $Y$ is Bernoulli R.V

conditional-probability, probability, probability-distributions, probability-theory

On good days customers arrive at an infinite server queue
according to a Poisson process with rate $12$ per hour, whereas on other days
they arrive according to a Poisson process with rate $4$ per hour. The service times,
on all days, are exponentially distributed with rate $1$ per hour. Every day at time $10$
hours the system is shut down and all those presently in service are forced to leave
without completing service. Suppose that each day is, independently, a good day
with probability $0.5$ and that we want to use simulation to estimate $\theta$, the mean
number of customers per day that do not have their services completed.

My question concerns only finding $\mathbb{E}[X|Y]$, where $X$ denotes the number of customers that do not have their services completed on a given day and $Y$ denotes whether the day is good or ordinary. Therefore $Y$ is Bernoulli. (The hint given is that $X|Y$ follows a Poisson distribution.)

I was not able to find an explicit form for the distribution of $X$ given $Y$, but my reasoning for finding an approximate value goes as follows:

The average number of people that arrive per hour on a good day is $12$, and each stays in the system approximately $1$ hour, so on average $120$ people visit the server over the $10$ hours. Roughly speaking, a customer fails to complete service only if they arrive during the final hour before the shutdown, and the average number of arrivals in one hour is $12$. Therefore
$$\mathbb{E}\left[X|Y=0\right]\approx12$$
and by the same reasoning
$$\mathbb{E}\left[X|Y=1\right]\approx4$$
where $Y=0$ denotes that the day is good and $Y=1$ that it is ordinary, each with probability $1/2$.
My question is: how would I find the exact values of these expectations? Is it possible without any prior knowledge of stochastic processes?
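Since the problem itself suggests estimating $\theta$ by simulation, a quick Monte Carlo sketch (function and variable names are my own) can check these approximations. It simulates one day of the infinite-server queue directly, generating arrivals from exponential interarrival gaps:

```python
import random

def simulate_day(lam, T=10.0, rng=random):
    """One day of an infinite-server queue with arrival rate lam and
    Exp(1) service times; returns the number of customers still in
    service at the shutdown time T."""
    t, incomplete = 0.0, 0
    while True:
        t += rng.expovariate(lam)         # next arrival time
        if t >= T:
            return incomplete
        if t + rng.expovariate(1.0) > T:  # service would end after shutdown
            incomplete += 1

random.seed(0)
n = 200_000
print(sum(simulate_day(12) for _ in range(n)) / n)  # close to 12
print(sum(simulate_day(4)  for _ in range(n)) / n)  # close to 4
```

Averaging the good-day and ordinary-day counts (each day occurring with probability $0.5$) then estimates $\theta$.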


UPDATE:

After some research I found that
$$\mathbb{E}\left[X\vert Y=0\right]=12\left(1-e^{-10}\right)$$
and

$$\mathbb{E}\left[X\vert Y=1\right]=4\left(1-e^{-10}\right)$$
which indeed agrees with my approximations. I am wondering how these answers were derived, as the source I found them in does not give any motivation.
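Numerically these exact values are indeed almost indistinguishable from the approximations, and combining them over the two equally likely day types gives $\theta$ itself by the law of total expectation (a sketch; variable names are mine):

```python
import math

e_good = 12 * (1 - math.exp(-10))    # E[X | Y = 0], good day
e_ord  = 4  * (1 - math.exp(-10))    # E[X | Y = 1], ordinary day
theta  = 0.5 * e_good + 0.5 * e_ord  # E[X] by the law of total expectation

print(e_good)  # 11.999455...
print(e_ord)   # 3.999818...
print(theta)   # 7.999636...
```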

Best Answer

This is a classical textbook exercise on the Poisson process.

Let $N(t)$ denote the total number of customers that have arrived by time $t$. From the given information, conditional on $Y = y$, $N(t)$ is an ordinary Poisson process, i.e. $N(t)|Y = y \sim \text{Poisson}(\lambda_y t)$ for $y = 0, 1$, where $\lambda_0 = 12$ and $\lambda_1 = 4$.

Let $W_n$ be the arrival time and $V_n$ the corresponding service time of the $n$th customer. It is well known that $W_n|Y = y$ has a Gamma / Erlang distribution, and $V_n$ is given to have an exponential distribution.

Note that the $n$th customer is in service at time $t$ if and only if $W_n < t$ (arrives before time $t$) and $W_n + V_n > t$ (leaves after time $t$). $X$ can be viewed as the total number of customers in service at time $t$, and thus can be decomposed as a sum of indicators as follows:

$$ X = \sum_{k=1}^{+\infty} \mathbf{1}\{W_k < t, W_k + V_k > t\} \tag{*}$$

Consider $$ \begin{align*} &E[X|Y = y] \\ &= E\left[\sum_{k=1}^{+\infty} \mathbf{1}\{W_k < t, W_k + V_k > t\} \Bigg|Y = y\right] \tag{1} \\ &= \sum_{n=0}^{+\infty} E\left[\sum_{k=1}^{+\infty} \mathbf{1}\{W_k < t, W_k + V_k > t\} \Bigg|N(t) = n, Y = y\right]\Pr\{N(t) = n|Y = y\} \tag{2}\\ &= \sum_{n=1}^{+\infty} E\left[\sum_{k=1}^{n} \mathbf{1}\{W_k + V_k > t\} \Bigg|N(t) = n, Y = y\right] e^{-\lambda_y t} \frac {(\lambda_y t)^n} {n!} \tag{3} \\ &= \sum_{n=1}^{+\infty} E\left[\sum_{k=1}^{n} \mathbf{1}\{U_{(k)} + V_k > t\} \right] e^{-\lambda_y t} \frac {(\lambda_y t)^n} {n!} \tag{4} \\ &= \sum_{n=1}^{+\infty} e^{-\lambda_y t} \frac {(\lambda_y t)^n} {n!} \sum_{k=1}^{n} E\left[\mathbf{1}\{U_{k} + V_k > t\} \right] \tag{5} \\ &= \sum_{n=1}^{+\infty} e^{-\lambda_y t} \frac {(\lambda_y t)^n} {n!} n\Pr\{U_1 + V_1 > t\} \tag{6} \\ &= \sum_{n=1}^{+\infty} e^{-\lambda_y t} \frac {(\lambda_y t)^n} {(n-1)!} \int_0^t \frac {1} {t} \Pr\{V_1 > t - u\}du \tag{7} \\ &= \sum_{n=1}^{+\infty} e^{-\lambda_y t} \frac {\lambda_y^n t^{n-1}} {(n-1)!} \int_0^t e^{-(t-u)}du \tag{8} \\ &= \lambda_y \sum_{n=0}^{+\infty} e^{-\lambda_y t} \frac {\lambda_y^n t^n} {n!} (1 - e^{-t}) \tag{9} \\ &= \lambda_y(1 - e^{-t}) \tag{10} \end{align*} $$

where

$(1)$ is from the decomposition $(*)$

$(2)$ is the law of total expectation, conditioning on $N(t)$

$(3)$ we can drop the $N(t) = 0$ term, in which $X = 0$; and $N(t) = n$ if and only if $W_1, W_2, \ldots, W_n \leq t$ and $W_{n+1}, W_{n+2}, \ldots > t$, so the sum inside the expectation reduces to a finite one. The last factor is just the Poisson pmf

$(4)$ is a crucial step: conditional on $N(t) = n$, the arrival times $W_1, \ldots, W_n$ are distributed as the order statistics, denoted $U_{(k)}$, of $n$ i.i.d. $\text{Uniform}(0, t)$ random variables, and this conditional distribution does not depend on the rate $\lambda_y$

$(5)$ is another crucial step: since the summand is a symmetric functional of the order statistics, it has the same distribution under the unordered, original sample, i.e. $U_k \sim \text{Uniform}(0, t)$

$(6)$ is due to the $U_k + V_k$ being identically distributed, and the expectation of an indicator is just a probability

$(7)$ is a continuous version of law of total probability

$(8)$ is from the exponential CDF

$(9)$ is shifting index and computing the integral

$(10)$ is just summing the Poisson pmf to get $1$
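The chain of steps $(4)$–$(6)$ can also be checked empirically: drawing $N(t) \sim \text{Poisson}(\lambda_y t)$ and then treating the arrival times as i.i.d. $\text{Uniform}(0, t)$ draws reproduces $E[X|Y = y] = \lambda_y(1 - e^{-t})$. A minimal sketch (function names are mine):

```python
import random, math

def sample_X(lam, t=10.0, rng=random):
    """Sample X via the order-statistics representation: draw
    N(t) ~ Poisson(lam * t), then replace the arrival times by i.i.d.
    Uniform(0, t) draws (their order is irrelevant, as in step (5))."""
    # Count Exp(lam) interarrival gaps falling before t to get N(t).
    n, s = 0, rng.expovariate(lam)
    while s < t:
        n += 1
        s += rng.expovariate(lam)
    # Sum the indicators {U_k + V_k > t}, U_k ~ Uniform(0, t), V_k ~ Exp(1).
    return sum(rng.uniform(0, t) + rng.expovariate(1.0) > t for _ in range(n))

random.seed(1)
reps = 100_000
est = sum(sample_X(12) for _ in range(reps)) / reps
print(est, 12 * (1 - math.exp(-10)))  # both close to 12
```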