The variance of the difference of two indicator random variables when the events may intersect

indicator function, probability, set-theory, variance

Let $X$ be an event with probability $P(X) = p$ and let $Y$ be an event with probability $P(Y) = q$. The probability of the intersection of $X$ and $Y$ is $P(X \cap Y) = r$.

$I_X$ is the indicator of the event $X$, i.e., when event $X$ happens, $I_X$ is equal to 1, otherwise, it's 0. Similarly, $I_Y$ is the indicator of $Y$ and it behaves in the same way $I_{X}$ does.

What is the variance of $(I_{X}- I_{Y})$?

My attempt so far is to take $(I_X - I_Y)$ as $R$ and proceed with $\operatorname{var}(R) = E(R^2) - (E(R))^2$, where $E(\cdot)$ is the expected value operator.

First, I'd calculate $E(R)$ as

$E(R) = E(I_X - I_Y) = E(I_X) - E(I_Y) = p - q$.

Thus, $(E(R))^2$ is $(p - q)^2$.

Should I somehow account for a possible intersection here? I know the probability of the union of events $M$ and $N$ is calculated as:

$P(M \cup N) = P(M) + P(N) - P(M \cap N)$

But I don't know how to account for it in the difference of the indicator variables.

Second,

$E(R^2) = E(I_X^2 - 2 I_X I_Y + I_Y^2) = E(I_X^2) - 2E(I_X I_Y) + E(I_Y^2)$.

$I_X$ takes values in $\{0,1\}$, and squaring maps each of these values to itself ($0^2 = 0$, $1^2 = 1$), so the probabilities of the outcomes are preserved. Thus, $E(I_X^2) = E(I_X) = p$. Is this reasoning correct? If it is, then likewise $E(I_Y^2) = E(I_Y) = q$.

I could then plug these results into $\operatorname{var}(R) = E(R^2) - (E(R))^2$, but how would I deal with the possible intersection?

I guess that's the part I feel unsure of: how does one account for a possible intersection?

In the way the problem is stated, I'd say they're independent events, ergo,

$E(I_X I_Y) = E(I_X) E(I_Y) = r$.

But I don't know if such an assumption would be correct.

Best Answer

You don't need fancy notation at all. $R = I_X - I_Y$ is a random variable that takes the value $1$ if event $X$ occurs but event $Y$ does not (probability $p-r$), the value $-1$ if event $Y$ occurs but $X$ does not (probability $q-r$), and the value $0$ otherwise (when either both events occur or neither does), with a probability that I could calculate but won't, since it is irrelevant. So, $$E[I_X-I_Y] = E[R] = 1\cdot (p-r) + (-1)\cdot (q-r) + 0\cdot (\text{irrelevant})$$ and $$E[(I_X-I_Y)^2] = E[R^2] = (1)^2\cdot (p-r) + (-1)^2\cdot (q-r) + 0\cdot (\text{irrelevant}).$$ Can you take it from here?
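
For readers who want to check where this leads: the two sums simplify to $E[R] = p - q$ and $E[R^2] = p + q - 2r$, so $$\operatorname{var}(R) = E[R^2] - (E[R])^2 = p + q - 2r - (p - q)^2.$$ No independence assumption is needed; if the events do happen to be independent, then $r = pq$ and this reduces to $p(1-p) + q(1-q)$. Below is a minimal Python sketch that checks the closed form by enumerating the four joint outcomes exactly. The function names and the sample values of $p$, $q$, $r$ are illustrative assumptions, not part of the original question or answer.

```python
from fractions import Fraction


def variance_of_indicator_difference(p, q, r):
    """Exact var(I_X - I_Y), computed by enumerating the four joint outcomes.

    p = P(X), q = P(Y), r = P(X and Y). Names are illustrative only.
    """
    # Each entry is (value of I_X - I_Y, probability of that joint outcome).
    outcomes = [
        (0, r),               # both X and Y occur
        (1, p - r),           # X occurs, Y does not
        (-1, q - r),          # Y occurs, X does not
        (0, 1 - p - q + r),   # neither occurs
    ]
    if any(prob < 0 for _, prob in outcomes):
        raise ValueError("p, q, r do not define a valid joint distribution")
    mean = sum(value * prob for value, prob in outcomes)
    second_moment = sum(value * value * prob for value, prob in outcomes)
    return second_moment - mean ** 2


def closed_form(p, q, r):
    """The simplified expression p + q - 2r - (p - q)^2."""
    return p + q - 2 * r - (p - q) ** 2


# Sample values chosen for illustration; Fractions keep the arithmetic exact.
p, q, r = Fraction(1, 2), Fraction(1, 3), Fraction(1, 4)
print(variance_of_indicator_difference(p, q, r))  # 11/36
print(closed_form(p, q, r))                       # 11/36
```

Using `Fraction` instead of floats avoids rounding noise, so the enumeration and the closed form can be compared with `==`.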
