Getting $p_y(y) = p_x(g^{-1}(y)) \left| \frac{\partial{x}}{\partial{y}} \right|$ by solving $| p_y(g(x)) \ dy | = | p_x (x) \ dx |$

change-of-variable, measure-theory, probability-theory, random-variables, real-analysis

My textbook has a very brief section that introduces some concepts from measure theory:

Another technical detail of continuous variables relates to handling continuous random variables that are deterministic functions of one another. Suppose we have two random variables, $\mathbf{x}$ and $\mathbf{y}$, such that $\mathbf{y} = g(\mathbf{x})$, where $g$ is an invertible, continuous, differentiable transformation. One might expect that $p_y(\mathbf{y}) = p_x(g^{−1} (\mathbf{y}))$. This is actually not the case.

As a simple example, suppose we have scalar random variables $x$ and $y$. Suppose $y = \dfrac{x}{2}$ and $x \sim U(0,1)$. If we use the rule $p_y(y) = p_x(2y)$, then $p_y$ will be $0$ everywhere except the interval $\left[ 0, \dfrac{1}{2} \right]$, and it will be $1$ on this interval. This means

$$\int p_y(y) \ dy = \dfrac{1}{2},$$

which violates the definition of a probability distribution. This is a common mistake. The problem with this approach is that it fails to account for the distortion of space introduced by the function $g$. Recall that the probability of $\mathbf{x}$ lying in an infinitesimally small region with volume $\delta \mathbf{x}$ is given by $p(\mathbf{x}) \delta \mathbf{x}$. Since $g$ can expand or contract space, the infinitesimal volume surrounding $\mathbf{x}$ in $\mathbf{x}$ space may have a different volume in $\mathbf{y}$ space.
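The textbook's counterexample can be checked numerically. The sketch below (an illustration, not part of the textbook) applies the naive rule $p_y(y) = p_x(2y)$ for $x \sim U(0,1)$ and integrates the resulting "density" on a grid:

```python
import numpy as np

# Naive (incorrect) rule: with x ~ U(0,1) and y = x/2,
# p_y(y) = p_x(2y) is 1 on [0, 1/2] and 0 elsewhere.
y = np.linspace(0, 1, 10001)
p_y_naive = ((2 * y >= 0) & (2 * y <= 1)).astype(float)  # p_x(2y), with p_x = 1 on [0, 1]

# Riemann-sum approximation of the integral of p_y_naive over [0, 1]
total = p_y_naive.sum() * (y[1] - y[0])
print(total)  # approximately 0.5, not 1
```

The integral comes out to about $\frac{1}{2}$, confirming that the naive rule does not produce a valid density.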

To see how to correct the problem, we return to the scalar case. We need to preserve the property

$$| p_y(g(x)) \ dy | = | p_x (x) \ dx |$$

Solving this equation, we obtain

$$p_y(y) = p_x(g^{-1}(y)) \left| \dfrac{\partial{x}}{\partial{y}} \right|$$

or equivalently

$$p_x(x) = p_y(g(x)) \left| \dfrac{\partial{g(x)}}{\partial{x}} \right|$$
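For the running example $y = g(x) = x/2$ with $x \sim U(0,1)$, the corrected rule gives $p_y(y) = p_x(2y)\,|dx/dy| = p_x(2y) \cdot 2$, which is $2$ on $[0, \frac{1}{2}]$. A small numerical sketch (not from the textbook) confirming that this density integrates to $1$ and matches sampled data:

```python
import numpy as np

rng = np.random.default_rng(0)

# Corrected rule for y = g(x) = x/2 with x ~ U(0,1):
# p_y(y) = p_x(g^{-1}(y)) |dx/dy| = p_x(2y) * 2, i.e. 2 on [0, 1/2].
y = np.linspace(0, 1, 10001)
p_y = ((2 * y >= 0) & (2 * y <= 1)).astype(float) * 2.0

# The Jacobian factor |dx/dy| = 2 restores total probability 1.
total = p_y.sum() * (y[1] - y[0])
print(total)  # approximately 1

# Monte Carlo sanity check: samples of y = x/2 should have mean 0.25,
# matching the mean of the density p_y (uniform on [0, 1/2]).
samples = rng.uniform(0, 1, 100_000) / 2
print(samples.mean())  # approximately 0.25
```

The factor of $2$ is exactly the $\left|\frac{\partial x}{\partial y}\right|$ term that the naive rule was missing.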

How do they get $p_y(y) = p_x(g^{-1}(y)) \left| \dfrac{\partial{x}}{\partial{y}} \right|$ or equivalently $p_x(x) = p_y(g(x)) \left| \dfrac{\partial{g(x)}}{\partial{x}} \right|$ by solving $| p_y(g(x)) \ dy | = | p_x (x) \ dx |$?

Can someone please demonstrate this and explain the steps?

Best Answer

$p_X(x)\,dx$ represents the probability measure $\mathbb{P}_X$, which is the probability distribution of the random variable $X$. It is defined by its action on measurable positive functions: $$\mathbb{E}(f(X))=\int_{\Omega}f(X)\,d\mathbb{P}=\int_{\mathbb{R}}f(x)\,d\mathbb{P}_X(x)=\int_{\mathbb{R}}f(x)p_X(x)\,dx.$$

Now we consider a new random variable $Y=g(X)$ (with some conditions on $g$), and we seek $p_Y$, the probability density of $Y$. So we calculate the expectation $\mathbb{E}(f(Y))$, for an arbitrary measurable positive function $f$, in two ways.

First, $$\mathbb{E}(f(Y))=\int_{\mathbb{R}}f(y)\color{red}{p_Y(y)\,dy}.\tag1$$

Second, $$\eqalignno{\mathbb{E}(f(Y))&=\mathbb{E}(f(g(X)))\cr &=\int_{\mathbb{R}}f(g(x))p_X(x)\,dx\qquad\text{now a change of variables}\cr &=\int_{\mathbb{R}}f(y)\color{red}{p_X(g^{-1}(y))\left|\frac{dx}{dy}\right|dy}.&(2)}$$

Now, because $f$ is arbitrary, comparing (1) and (2) we get $$p_Y(y)=p_X(x)\left|\frac{dx}{dy}\right|,\quad\text{where $y=g(x)$.}$$ Or, better, $$p_Y(y)=p_X(g^{-1}(y))\left|\frac{1}{g'(g^{-1}(y))}\right|\iff p_Y(g(x))|g'(x)|=p_X(x).$$
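The final identity $p_Y(g(x))\,|g'(x)| = p_X(x)$ can be verified empirically for a nonlinear $g$ as well. The sketch below (my own illustration, with $g(x) = x^2$ chosen as an assumption) uses $X \sim U(0,1)$, so $g^{-1}(y) = \sqrt{y}$ and $p_Y(y) = p_X(\sqrt{y})\left|\frac{d\sqrt{y}}{dy}\right| = \frac{1}{2\sqrt{y}}$ on $(0,1)$, with CDF $F_Y(y) = \sqrt{y}$:

```python
import numpy as np

rng = np.random.default_rng(1)

# X ~ U(0,1) and Y = g(X) = X**2. The change-of-variables formula predicts
# p_Y(y) = p_X(sqrt(y)) * |d sqrt(y)/dy| = 1 / (2 * sqrt(y)) on (0, 1),
# whose CDF is F_Y(y) = sqrt(y).
x = rng.uniform(0, 1, 200_000)
y_samples = x**2

# Compare the empirical CDF of Y at a few points with the analytic sqrt(y).
for y0 in (0.1, 0.25, 0.5, 0.9):
    empirical = (y_samples <= y0).mean()
    analytic = np.sqrt(y0)
    print(y0, empirical, analytic)  # should agree to about 2 decimal places
```

Agreement of the empirical and analytic CDFs is exactly what the comparison of (1) and (2) guarantees, since matching expectations of arbitrary $f$ (in particular, indicator functions) pins down the distribution.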
