Random Variable – Intuitive Explanation for Density of Transformed Variable: A Comprehensive Guide

Suppose $X$ is a random variable with pdf $f_X(x)$. Then the random variable $Y=X^2$ has the pdf

$$f_Y(y)=\begin{cases}\frac{1}{2\sqrt{y}}\left(f_X(\sqrt{y})+f_X(-\sqrt{y})\right) & y \ge 0 \\ 0 & y \lt 0\end{cases}$$

I understand the calculus behind this. But I'm trying to think of a way to explain it to someone who doesn't know calculus. In particular, I'm trying to explain why the factor $\frac{1}{\sqrt{y}}$ appears out front. I'll take a stab at it:

Suppose $X$ has a standard Gaussian distribution. Almost all the weight of its pdf is between the values, say, $-3$ and $3$. But that maps to the range $0$ to $9$ for $Y$. So, the heavy weight in the pdf for $X$ has been extended across a wider range of values in the transformation to $Y$. Thus, for $f_Y(y)$ to be a true pdf the extra heavy weight must be downweighted by the multiplicative factor $\frac{1}{\sqrt{y}}$.

How does that sound?

If anyone can provide a better explanation of their own or link to one in a document or textbook I'd greatly appreciate it. I find this variable transformation example in several intro mathematical probability/stats books. But I never find an intuitive explanation with it 🙁

Best Answer

PDFs are heights but they are used to represent probability by means of area. It therefore helps to express a PDF in a way that reminds us that area equals height times base.

Initially the height at any value $x$ is given by the PDF $f_X(x)$. The base is the infinitesimal segment $dx$, whence the distribution (that is, the probability measure as opposed to the distribution function) is really the differential form, or "probability element,"

$$\operatorname{PE}_X(x) = f_X(x) \, dx.$$

This, rather than the PDF, is the object you want to work with both conceptually and practically, because it explicitly includes all the elements needed to express a probability.

When we re-express $x$ in terms of $y = x^2$, the base segments $dx$ get stretched (or squeezed): by squaring both ends of the interval from $x$ to $x + dx$ we see that the base of the $y$ area must be an interval of length

$$dy = (x + dx)^2 - x^2 = 2 x \, dx + (dx)^2.$$

Because the product of two infinitesimals is negligible compared to the infinitesimals themselves, we conclude

$$dy = 2 x \, dx, \text{ whence }dx = \frac{dy}{2x} = \frac{dy}{2\sqrt{y}}.$$
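The stretching can also be checked with plain arithmetic. The following is only a sketch: the point $x = 1.5$ and the shrinking widths are arbitrary choices for illustration, and `stretch` is a hypothetical helper name, not something from the answer.

```python
def stretch(x, dx):
    """Ratio dy/dx when the interval [x, x + dx] is squared."""
    dy = (x + dx) ** 2 - x ** 2  # exact length of the squared interval
    return dy / dx


x = 1.5  # arbitrary positive point, chosen only for illustration
for dx in (0.1, 0.01, 0.001):
    # The ratio equals 2x + dx, so it approaches 2x as dx shrinks.
    print(f"dx={dx:<6} dy/dx={stretch(x, dx):.6f}  2x={2 * x}")
```

The leftover term in the ratio is exactly $dx$, which is the "product of two infinitesimals" that gets dropped.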

Having established this, the calculation is trivial because we just plug in the new height and the new width:

$$\operatorname{PE}_X(x) = f_X(x) \, dx = f_X(\sqrt{y}) \frac{dy}{2\sqrt{y}} = \operatorname{PE}_Y(y).$$

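Because the base, in terms of $y$, is $dy$, whatever multiplies it must be the height, which we can read directly off the middle term as

$$\frac{1}{2\sqrt{y}}f_X(\sqrt{y}).$$

This is the contribution of positive $x$ only, and it is all of $f_Y(y)$ when $X$ takes no negative values. In general the negative root $x = -\sqrt{y}$ lands on the same $y$, and its base has the same length $\frac{dy}{2\sqrt{y}}$, so it contributes a second piece of height $\frac{1}{2\sqrt{y}}f_X(-\sqrt{y})$. Adding the two heights gives

$$f_Y(y) = \frac{1}{2\sqrt{y}}\left(f_X(\sqrt{y}) + f_X(-\sqrt{y})\right).$$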

This equation $\operatorname{PE}_X(x) = \operatorname{PE}_Y(y)$ is effectively a conservation of area (=probability) law.
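For readers who would rather check than derive, here is a minimal sketch that compares the formula stated in the question (with both roots $\pm\sqrt{y}$) against the slope of the distribution function of $Y$. It assumes a standard normal $X$, which is an illustrative choice only; `f_Y`, `cdf_Y` and `slope_of_cdf` are hypothetical helper names.

```python
import math
from statistics import NormalDist

X = NormalDist(0.0, 1.0)  # assumption: X is standard normal


def f_Y(y):
    """Density of Y = X^2 from the two-root formula in the question."""
    if y <= 0:
        return 0.0  # y = 0 is excluded here to avoid dividing by zero
    r = math.sqrt(y)
    return (X.pdf(r) + X.pdf(-r)) / (2.0 * r)


def cdf_Y(y):
    """P(Y <= y) = P(-sqrt(y) <= X <= sqrt(y))."""
    if y <= 0:
        return 0.0
    r = math.sqrt(y)
    return X.cdf(r) - X.cdf(-r)


def slope_of_cdf(y, h=1e-5):
    """Central-difference estimate of the density as probability per unit of y."""
    return (cdf_Y(y + h) - cdf_Y(y - h)) / (2.0 * h)


for y in (0.25, 1.0, 4.0):
    print(f"y={y:<5} formula={f_Y(y):.6f}  slope of CDF={slope_of_cdf(y):.6f}")
```

The two columns should agree to several decimal places, since the density is just probability per unit length of base.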

[Figure: two PDFs]

This graphic accurately shows narrow (almost infinitesimal) pieces of two PDFs related by $y=x^2$. Probabilities are represented by the shaded areas. Due to the squeezing of the interval $[0.32, 0.45]$ via squaring, the height of the red region ($y$, at the left) has to be proportionally expanded to match the area of the blue region ($x$, at the right).
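The same bookkeeping can be reproduced numerically for the interval $[0.32, 0.45]$. This is a sketch under an assumption: the text does not say which distribution the graphic uses, so an Exponential(1) variable stands in for $X$ here (it has no negative values, so only the root $x = \sqrt{y}$ matters). The helper `area` is a hypothetical midpoint-rule integrator.

```python
import math


def f_X(x):
    # Assumption: X ~ Exponential(1); any density on positive values would do.
    return math.exp(-x) if x >= 0 else 0.0


def f_Y(y):
    """Height over y: the height at x = sqrt(y) divided by the stretch factor 2*sqrt(y)."""
    if y <= 0:
        return 0.0
    r = math.sqrt(y)
    return f_X(r) / (2.0 * r)


def area(f, a, b, n=10_000):
    """Midpoint-rule approximation of the area under f between a and b."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


a, b = 0.32, 0.45
area_x = area(f_X, a, b)          # area over [0.32, 0.45]
area_y = area(f_Y, a * a, b * b)  # area over the squared interval
width_x = b - a
width_y = b * b - a * a           # the base is squeezed by squaring

print(f"areas:           x={area_x:.6f}  y={area_y:.6f}")
print(f"base widths:     x={width_x:.4f}  y={width_y:.4f}")
print(f"average heights: x={area_x / width_x:.4f}  y={area_y / width_y:.4f}")
```

The two areas agree while the $y$ base is narrower, so the average height over $y$ has to be larger by the ratio of the two widths.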
