If we have two separate probability distributions P(x) and Q(x) over the same random variable x, we can measure how different these two distributions are using the Kullback-Leibler (KL) divergence…
The above statement is from Deep Learning by Ian Goodfellow and Yoshua Bengio and Aaron Courville and I have the following question:
As far as I understand, a random variable is defined with a specific probability distribution in mind: it takes the value of a random outcome drawn from that distribution. Perhaps my understanding is wrong. My question is:
How can you have two separate probability distributions on the same random variable?
Kindly help me resolve this confusion. Thanks!
Best Answer
The uniqueness of the distribution of a random variable $\mathbf{x}$ implicitly assumes that a given measure $\mathbb{P}$ has been fixed.
Consider a probability space $(\Omega,\mathcal{F}, \mathbb{P})$ and a measurable space $(X,\mathcal{X})$. A random variable $\mathbf{x}$ is defined on $(\Omega,\mathcal{F}, \mathbb{P})$ as a measurable function $\mathbf{x}:~\Omega \to X$. Then $\mathbb{P}_{\mathbf{x}}=\mathbb{P}\circ \mathbf{x}^{-1}$ is a measure, and it is classically defined as the distribution of $\mathbf{x}$.

Now consider the probability space $(\Omega,\mathcal{F}, \mathbb{Q})$ and the same function $\mathbf{x}:~\Omega \to X$. Then $\mathbb{Q}_{\mathbf{x}}=\mathbb{Q}\circ \mathbf{x}^{-1}$ is also a measure. Therefore, if you define the random variable as a function on the measurable space $(\Omega,\mathcal{F})$ alone, without fixing a specific measure, two different measures will give two different distributions. The KL divergence compares two such measures for a single measurable function $\mathbf{x}$.
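To make this concrete, here is a minimal sketch in Python on a finite sample space. The numbers and the choice of $\mathbf{x}$ (an even/odd indicator on $\Omega = \{1,2,3,4\}$) are hypothetical, purely for illustration: one fixed measurable function, two measures $\mathbb{P}$ and $\mathbb{Q}$ on $\Omega$, and the two induced pushforward distributions $\mathbb{P}_{\mathbf{x}}$ and $\mathbb{Q}_{\mathbf{x}}$ compared by KL divergence.

```python
import math

# Finite sample space Ω and a single measurable function x: Ω -> {0, 1}
# (here x indicates whether the outcome is odd; choice is arbitrary).
omega = [1, 2, 3, 4]
x = lambda w: w % 2  # the random variable as a function, no measure attached

# Two different probability measures on (Ω, F) (hypothetical numbers)
P = {1: 0.1, 2: 0.4, 3: 0.2, 4: 0.3}
Q = {1: 0.25, 2: 0.25, 3: 0.25, 4: 0.25}

def pushforward(measure, func):
    """Induced distribution measure ∘ func⁻¹ on the range of func."""
    dist = {}
    for w, p in measure.items():
        dist[func(w)] = dist.get(func(w), 0.0) + p
    return dist

P_x = pushforward(P, x)  # distribution of x under P: {1: 0.3, 0: 0.7}
Q_x = pushforward(Q, x)  # distribution of x under Q: {1: 0.5, 0: 0.5}

# KL divergence D(P_x || Q_x) = Σ p(v) log(p(v) / q(v))
kl = sum(p * math.log(p / Q_x[v]) for v, p in P_x.items())
print(P_x, Q_x, kl)
```

The same function `x` thus carries two distinct distributions, one per measure, which is exactly what the KL divergence in the quoted passage compares.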
However, classically, random variables are defined for a given probability measure $\mathbb{P}$ and therefore have a single given distribution $\mathbb{P}_{\mathbf{x}}$.