Variance using indicator variables

combinatoricsexpected valuegraph theoryprobabilityvariance

A graph $G=(V, E)$ has $2n$ vertices of which $n$ are coloured in blue and $n$ are coloured in red. The probability that there is an edge between any two vertices is $1/2$. Let $Y$ denote the expected number of edges whose endpoints have the same colour. Compute the variance of $Y$.

I should also compute the variance of the number of edges whose endpoints have the same colour, whereby I will denote this random variable as $Y$ now (in the above it was $X$). Here is my approach:

\begin{align*}
Y_{u, v}:= \begin{cases}
1, \text{ the edge } \{u, v\} \text{ exists} \land \left(
u, v \text{ are blue} \lor u, v\text{ are red}
\right)
\\[5pt]
0, \ \text{else}
\end{cases}
.\end{align*}

So far I know that $\mathbb{E}[Y_{u, v}]=$
We will use the identity
$$
\text{Var}\left[X \right] =\mathbb{E}\left[X^2 \right]-
\mathbb{E}\left[X \right]^2.
$$

\begin{align*}
\mathbb{E}\left[Y^2 \right]
= \mathbb{E}\left[
\sum_{\{u, v\} \subseteq V}^{} \sum_{\{x, y\} \subseteq V}^{}
Y_{\{u, v\} }\cdot Y_{\{x, y\} }
\right]
=\sum_{\{u, v\} \subseteq V}^{} \sum_{\{x, y\} \subseteq V}^{}
\mathbb{E}\left[Y_{\{u, v\} }\cdot Y_{\{x, y\} }\right]
.\end{align*}

Now we do a case distinction:
\begin{align*}
&1.\quad \{u, v\} =\{x, y\}\colon
\mathbb{E}\left[Y_{\{u, v\} }\cdot Y_{\{x, y\} }\right]
=
\mathbb{E}\left[Y_{\{u, v\} }^2\right]=
\mathbb{E}\left[Y_{\{u, v\} }\right]=?.
\\[10pt]
&2. \quad \{u, v\} \cap \{x, y\} =1\colon ?
\\[10pt]
&3. \quad \{u, v\} \cap \{x, y\} = \varnothing \colon
\mathbb{E}\left[Y_{\{u, v\} }\cdot Y_{\{x, y\} }\right]
=\mathbb{E}\left[ Y_{\{u, v\} } \right]\cdot \mathbb{E}\left[Y_{\{x, y\} }
\right]=?.
\end{align*}

I was told as a hint to define a suitable indicator variable and then use the above method to split it up, however, I'm stuck at the calculation. First, is my choice of indicator variable correct? If yes, how do I compute $\mathbb{E}\left[Y_{\{u, v\} }\right]$ and deal with the second case?

Edit: I'm not allowed to make use of the binomial distribution.

Best Answer

Since there are $n(n-1) = 2 \binom n2$ potential edges between two vertices of the same color, and each one exists independently with probability $\frac12$, $Y \sim \text{Binomial}(n(n-1), \frac12)$, and you can just use the formula for the variance of a binomial.

As for your choice of indicator variables: it will lead you to the correct answer if you do everything correctly, but it makes everything too difficult. You have $n^2$ indicator variables that are identically $0$: the $Y_{\{u,v\}}$ where $u$ is red and $v$ is blue. With the indicator variable approach, it is better to define $V = R \cup B$ as the partition of the vertex set into colors and take $$ Y = \sum_{\{u,v\} \subseteq R} Y_{\{u,v\}} + \sum_{\{u,v\} \subseteq B} Y_{\{u,v\}}. $$ All these indicator variables are independent and have expected value $\frac12$, so if you are not allowed to use the binomial distribution, you will end up re-deriving its variance.