How to understand a mixture distribution

probability-distributions, statistics

Let $f_1(x)$ and $f_2(x)$ be density functions, and let $a$ be a constant such that $0 \leq a \leq 1$.
Consider the function $f(x) = af_1(x) + (1 - a)f_2(x)$.

Suppose $X_i \sim f_i$, $E(X_i) = \mu_i$, $Var(X_i) = \sigma^2_i$, and $X$ is a mixture of $X_1$ and $X_2$. Find $Var(X)$.

$E(X^2) = \int x^2 f(x)\,dx$. Why is it okay to express it as $\int x^2(af_1(x)+(1-a)f_2(x))\,dx = aE(X_1^2)+(1-a)E(X_2^2)$?

Shouldn't it consist of both $x_1$ and $x_2$?

My understanding is that $E(X^2)$ should be $E((aX_1+(1-a)X_2)^2)=a^2E(X_1^2)+(1-a)^2E(X_2^2)+2a(1-a)E(X_1X_2)$

Best Answer

I think that you are confusing two separate things. When we say that $X$ is a mixture of $X_1$ and $X_2$, with respective probability density functions $f_1$ and $f_2$, we mean that there exist constants $\lambda_1,\lambda_2\ge0$ such that $\lambda_1 + \lambda_2 = 1$ and $$f(x) = \lambda_1f_1(x) + \lambda_2f_2(x), $$ where $f$ is the pdf of $X$.

In your question, you have $\lambda_1 := a$ and $\lambda_2 := 1-a$, thus the pdf of $X$ is $f = af_1 + (1 - a)f_2$. To compute $\mathbb E[X^2]$, you can then proceed as usual with the law of the unconscious statistician: $$\mathbb E[X^2] = \int x^2f(x)\, dx = \int x^2(af_1(x)+(1-a)f_2(x))\,dx $$
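
To carry this through to the variance asked for in the question, here is a sketch of the remaining steps. It uses only linearity of the integral and the identity $\mathbb E[X_i^2] = \sigma_i^2 + \mu_i^2$; the notation $\operatorname{Var}$ is my own shorthand for the variance.

$$\begin{aligned}
\mathbb E[X] &= a\int x f_1(x)\,dx + (1-a)\int x f_2(x)\,dx = a\mu_1 + (1-a)\mu_2, \\
\mathbb E[X^2] &= a\int x^2 f_1(x)\,dx + (1-a)\int x^2 f_2(x)\,dx = a(\sigma_1^2+\mu_1^2) + (1-a)(\sigma_2^2+\mu_2^2), \\
\operatorname{Var}(X) &= \mathbb E[X^2] - \mathbb E[X]^2 = a\sigma_1^2 + (1-a)\sigma_2^2 + a(1-a)(\mu_1-\mu_2)^2 .
\end{aligned}$$

No cross term $\mathbb E[X_1X_2]$ appears, because each integral involves only one of the two densities. The last term shows that the mixture variance exceeds the weighted average of the component variances whenever the means differ and $0 < a < 1$.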

If you considered the random variable $\tilde X := aX_1 + (1-a)X_2$ instead, you would not have in general that the pdf of $\tilde X$ is $af_1 + (1-a)f_2$. Consider for instance $X_1$ and $X_2$ to be independent standard normal r.v.s, and $a = 1/2$.
Then $f = af_1 + (1-a)f_2 = f_1 = f_2$, so the mixture has the same distribution as $X_1$ and $X_2$, but $\tilde X = aX_1 + (1-a)X_2 \sim \mathcal N(0,1/2)$ (that is, variance $a^2 + (1-a)^2 = 1/2$) is not a standard normal r.v.
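
A quick simulation can make the distinction concrete. The sketch below assumes that sampling from the mixture means "pick component 1 with probability $a$, otherwise component 2", which is the usual way to draw from a density of the form $af_1 + (1-a)f_2$. The helper names (`sample_mixture`, `sample_linear_combination`, `variance`) are made up for this illustration.

```python
import random


def sample_mixture(a, draw1, draw2, rng):
    """Draw from the mixture: component 1 with probability a, else component 2."""
    return draw1(rng) if rng.random() < a else draw2(rng)


def sample_linear_combination(a, draw1, draw2, rng):
    """Draw a*X1 + (1-a)*X2 with X1 and X2 sampled independently."""
    return a * draw1(rng) + (1 - a) * draw2(rng)


def variance(xs):
    """Population variance of a list of numbers."""
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs) / len(xs)


def std_normal(rng):
    """One draw from a standard normal distribution."""
    return rng.gauss(0.0, 1.0)


rng = random.Random(0)  # fixed seed so the run is reproducible
a = 0.5
n = 100_000

mixture = [sample_mixture(a, std_normal, std_normal, rng) for _ in range(n)]
combo = [sample_linear_combination(a, std_normal, std_normal, rng) for _ in range(n)]

var_mixture = variance(mixture)
var_combo = variance(combo)

print(f"mixture variance:            {var_mixture:.3f}")
print(f"linear combination variance: {var_combo:.3f}")
```

With this setup the mixture variance should come out close to 1 and the linear combination variance close to 1/2, matching the example above: only one of $X_1$, $X_2$ contributes to each mixture draw, while both contribute to every draw of $\tilde X$.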
