Covariance – Random Variable and Sum of Random Variables

conditional probability, covariance, poisson distribution, probability

Consider a vector $\{v_1,…,v_n\}$ with $v_k\in B$, where $B$ is partitioned into disjoint subsets $B_1,…,B_m$, and let
\begin{align*}
L(v_k) = \begin{cases} 1, & \text{if a specific event occurs for } v_k, \\
0, & \text{otherwise.} \end{cases}
\end{align*}

For $v_k\in B_i$, the probability that the event occurs is $p_i\in(0,1)$. The probabilities $p_i$, $i\in\{1,…,m\}$, are subject to random fluctuations $\vartheta_i$, $i\in\{1,…,m\}$. These fluctuations are realizations of random variables $\theta_i$, $i\in\{1,…,m\}$, with the properties
\begin{align}
\mathbb{E}[\theta_i]=1 \quad \text{and} \quad Cov(\theta_i,\theta_j)=\begin{cases} \sigma^2, & \text{if } i=j, \\ \rho\sigma^2, & \text{if } i\neq j. \end{cases}
\end{align}

Further, let
\begin{align*}
S_i:=\sum_{v_k\in B_i} L(v_k) \quad \text{and} \quad S:=\sum_{l=1}^{m} S_l
\end{align*}

Let us make the following two assumptions:

\begin{align}
S_i \mid (\theta_1,…,\theta_m) &\sim Poi(n_i p_i \theta_i), \quad n_i \in \mathbb{N}, \\
Cov(S_i,S_j \mid (\theta_1,…,\theta_m)) &= 0 \quad \text{for} \quad i \neq j.
\end{align}

Question 1: How can I calculate $Cov(S_i,S)$?

Question 2: Following the calculation of $Cov(S_i,S)$ provided below by K. A. Buhr, I also tried to calculate $Var(S)$ using the law of total variance, writing $\theta^*:=(\theta_1,…,\theta_m)$:
\begin{align}
Var(S)=\mathbb{E}[Var(S|\theta^*)]+Var(\mathbb{E}[S|\theta^*])
\end{align}

The first term is then just $\mathbb{E}[\sum_{l=1}^m n_l p_l \theta_l]=\sum_{l=1}^m n_l p_l$, since $\mathbb{E}[\theta_l]=1$.
The second term becomes
\begin{align}
Var\left[\sum_{l=1}^m n_l p_l \theta_l\right]
&=\sum_{l=1}^m Var(n_l p_l \theta_l)+\sum_{l\neq j} Cov[n_l p_l \theta_l,n_j p_j \theta_j] \\
&=\sigma^2\left(\sum_{l=1}^m n_l^2 p_l^2 + \rho \sum_{l\neq j} n_l p_l n_j p_j\right)
\end{align}

Therefore,
\begin{align}
Var(S)=\sum_{l=1}^m n_l p_l+\sigma^2(\sum_{l=1}^m n_l^2 p_l^2 + \rho \sum_{l\neq j} n_l p_l n_j p_j)
\end{align}

Are my calculations for $Var(S)$ correct?

Best Answer

Note that you have Poisson rates $n_i p_i \theta_i \geq 0$ but $E(\theta_i)=0$. I'll assume you actually mean $E(\theta_i)=1$ (or equivalently, rates $n_i p_i (1+\theta_i)$ with $E(\theta_i)=0$).

Anyway, you should be able to apply the "law of total covariance": $$\mathrm{Cov}(X,Y)=E[\mathrm{Cov}(X,Y\mid Z)] + \mathrm{Cov}(E[X\mid Z], E[Y\mid Z])$$

Writing $\theta_\star:=(\theta_1,\dots,\theta_m)$, by the conditional uncorrelatedness of the $S_i$ and their conditional Poisson distributions, you have: $$\mathrm{Cov}(S_i,S\mid\theta_\star) = \mathrm{Var}(S_i\mid\theta_\star) = n_i p_i \theta_i$$ and so $E[\mathrm{Cov}(S_i,S\mid\theta_\star)] = n_i p_i$, since $E[\theta_i]=1$.
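
Spelled out, the first equality uses $S=\sum_j S_j$ and bilinearity of the conditional covariance. The cross terms vanish by the second assumption, and the variance of a Poisson variable equals its rate: $$\mathrm{Cov}(S_i,S\mid\theta_\star) = \sum_{j=1}^m \mathrm{Cov}(S_i,S_j\mid\theta_\star) = \mathrm{Var}(S_i\mid\theta_\star) + \sum_{j\neq i}\underbrace{\mathrm{Cov}(S_i,S_j\mid\theta_\star)}_{=0} = n_i p_i \theta_i$$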

You also have $E[S_i\mid\theta_\star] = n_i p_i \theta_i$ and $E[S\mid\theta_\star] = \sum_j n_j p_j \theta_j$ giving: $$\mathrm{Cov}(E[S_i\mid\theta_\star],E[S\mid\theta_\star]) = \sum_j n_i n_j p_i p_j \mathrm{Cov}(\theta_i,\theta_j) = \sigma^2 n_i p_i \left( n_i p_i + \rho\sum_{j\not=i} n_j p_j \right)$$

The law of total covariance then gives: $$\mathrm{Cov}(S_i,S) = n_i p_i + \sigma^2 n_i p_i \left(n_i p_i + \rho \sum_{j\neq i} n_j p_j\right)$$
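
If you want a numerical sanity check, here is a minimal simulation sketch in Python with NumPy. Several things in it are my own assumptions rather than part of the question: the example values of $n_i$, $p_i$, $\sigma$ and $\rho$, the function names, conditionally independent Poisson counts (stronger than the zero conditional covariance assumed in the question), and the way the $\theta_i$ are built. The question only fixes their mean and covariance, so I use a shared uniform factor plus an individual one, which gives $E[\theta_i]=1$, variance $\sigma^2$, covariance $\rho\sigma^2$, and keeps every $\theta_i$ positive for these parameter values.

```python
import numpy as np


def cov_si_s_formula(i, lam, sigma2, rho):
    """Closed form for Cov(S_i, S) derived above, with lam[l] = n_l * p_l."""
    others = lam.sum() - lam[i]
    return lam[i] + sigma2 * lam[i] * (lam[i] + rho * others)


def var_s_formula(lam, sigma2, rho):
    """Var(S) as computed in Question 2.

    The double sum over ordered pairs l != j equals total^2 - sum of squares.
    """
    total, squares = lam.sum(), (lam ** 2).sum()
    return total + sigma2 * (squares + rho * (total ** 2 - squares))


def simulate(lam, sigma, rho, draws, seed=0):
    """Draw theta, then conditionally independent Poisson counts S_1..S_m."""
    rng = np.random.default_rng(seed)
    a = np.sqrt(3.0)  # Uniform(-a, a) has mean 0 and variance 1
    shared = rng.uniform(-a, a, size=(draws, 1))       # common factor
    own = rng.uniform(-a, a, size=(draws, lam.size))   # individual factors
    # Assumed construction: Var(theta_i) = sigma^2, Cov(theta_i, theta_j) = rho * sigma^2
    theta = 1.0 + sigma * (np.sqrt(rho) * shared + np.sqrt(1.0 - rho) * own)
    assert theta.min() > 0, "Poisson rates must stay positive"
    parts = rng.poisson(lam * theta)                   # shape (draws, m)
    return parts, parts.sum(axis=1)


# Hypothetical example values: n = (50, 80, 120), p = (0.10, 0.05, 0.02)
lam = np.array([50, 80, 120]) * np.array([0.10, 0.05, 0.02])
sigma, rho = 0.2, 0.5

parts, total = simulate(lam, sigma, rho, draws=200_000)
for i in range(lam.size):
    empirical = np.cov(parts[:, i], total)[0, 1]
    print(i, round(empirical, 3), round(cov_si_s_formula(i, lam, sigma**2, rho), 3))
print("Var(S)", round(total.var(ddof=1), 3), round(var_s_formula(lam, sigma**2, rho), 3))
```

The empirical and closed-form values should agree up to Monte Carlo noise. Also, since $\mathrm{Var}(S)=\sum_i \mathrm{Cov}(S_i,S)$, summing the formula above over $i$ reproduces the expression for $Var(S)$ in Question 2, which suggests that calculation is correct.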
