Form of the Jacobian factor in nonlinear change of variable for probability densities

change-of-variable, density-function, jacobian, multivariable-calculus, real-analysis

I was reading this section about transformations in probability:

Under a nonlinear change of variable, a probability density transforms differently from a simple function, due to the Jacobian factor. For instance, if we consider a change of variables $x = g(y)$, then a function $f(x)$ becomes $\tilde{f}(y) = f(g(y))$. Now consider a probability density $p_x(x)$ that corresponds to a density $p_y(y)$ with respect to the new variable $y$, where the suffices denote the fact that $p_x(x)$ and $p_y(y)$ are different densities. Observations falling in the range $(x, x + \delta x)$ will, for small values of $\delta x$, be transformed into the range $(y, y + \delta y)$ where $p_x(x) \delta x \simeq p_y(y) \delta y$, and hence

$$\begin{align} p_y(y) &= p_x(x) \left\vert \dfrac{dx}{dy} \right\vert \\ &= p_x(g(y)) |g'(y)|. \end{align}$$

It seems to me that we get $p_y(y) = p_x(x) \left\vert \dfrac{dx}{dy} \right\vert$ by the following calculation:

$$\begin{align} p_x(x) \delta x = p_y(y) \delta y \Rightarrow p_y(y) &= p_x(x) \dfrac{\delta x}{\delta y} \\ &= p_x(x) \dfrac{dx}{dy} \end{align}$$

But why is it implied that we must have the absolute value $\left\vert \dfrac{dx}{dy} \right\vert$, rather than just $\dfrac{dx}{dy}$? After all, the values $x$ and $y$ can also be negative, right? Or is it just a matter of convention that they're positive, hence the result?

Thank you.

I just saw this in some notes:

Let $X$ be a random variable.

For continuous $X$,

$$\begin{align} F_Y(y) &= P(Y \le y) = P(h^{-1}(Y) \le h^{-1}(y)) \\ &= P(X \le h^{-1}(y)) = F_X(h^{-1}(y)) \end{align}$$

Taking derivatives of both sides (assuming that $h^{-1}$ is differentiable) yields the pdf of $Y$:

$$f_Y(y) = \dfrac{d}{dy}F_Y(y) = \dfrac{d}{dy}F_X(h^{-1}(y)) = f_X(h^{-1}(y)) \times \dfrac{d}{dy} h^{-1}(y)$$

Allowing $h(x)$ to be also a decreasing function,

$$f_Y(y) = f_X(h^{-1}(y)) \times \left\vert \dfrac{d}{dy} h^{-1}(y) \right\vert$$

This seems to give us a clue (the absolute value has something to do with accounting for the fact that $h(x)$ can also be a decreasing function), but it doesn't explain why the absolute value is necessary. This is what I'm interested in understanding. Does anyone understand what's going on here?

Best Answer

Consider what happens when you integrate your probability density $p_x$ over an interval $[a,b]$ where $g$ is monotonic:

$$\int\limits_a^b p_x(x)\,dx=\int\limits_{g^{-1}(a)}^{g^{-1}(b)}p_x(g(y))g'(y)\,dy.$$

If $g$ is decreasing, then $g^{-1}(a) > g^{-1}(b)$, and the integrand $p_x(g(y))g'(y)$ is non-positive, since $g' \le 0$ and $p_x$, being a probability density, is non-negative. But we want to interpret the integrand in the new variable $y$ as a probability density over $y$. To do this, we switch the limits of integration, so that the lower limit is smaller than the upper. This flips the sign of the integral, and the integrand becomes non-negative. Then we get:

$$\int\limits_a^b p_x(x)\,dx=\int\limits_{g^{-1}(b)}^{g^{-1}(a)}-p_x(g(y))g'(y)\,dy=\int\limits_{g^{-1}(b)}^{g^{-1}(a)}p_x(g(y))|g'(y)|\,dy.$$
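The identity above can be checked numerically. The following is only a sketch under assumptions that are not in the original post: $X$ is taken to be uniform on $(0, 1)$, the decreasing map is $x = g(y) = e^{-y}$, the interval is $[a, b] = [0.2, 0.7]$, and `midpoint_integral` is a hypothetical helper written for this illustration.

```python
import math

# Assumed example (not from the post): X ~ Uniform(0, 1), x = g(y) = exp(-y).

def p_x(x):
    """Density of X ~ Uniform(0, 1)."""
    return 1.0 if 0.0 < x < 1.0 else 0.0

def g(y):
    """Decreasing change of variable x = g(y)."""
    return math.exp(-y)

def g_prime(y):
    """Derivative of g; negative everywhere because g is decreasing."""
    return -math.exp(-y)

def g_inv(x):
    """Inverse of g, i.e. y = -log(x)."""
    return -math.log(x)

def midpoint_integral(f, lo, hi, n=100_000):
    """Midpoint-rule approximation of the signed integral of f from lo to hi.

    If hi < lo the step is negative, so the result changes sign,
    just like the exact integral does.
    """
    step = (hi - lo) / n
    return step * sum(f(lo + (i + 0.5) * step) for i in range(n))

a, b = 0.2, 0.7

# Left-hand side: P(a <= X <= b), integrated over x.
lhs = midpoint_integral(p_x, a, b)

# Plain substitution: limits g_inv(a) > g_inv(b), integrand uses g'(y).
substituted = midpoint_integral(
    lambda y: p_x(g(y)) * g_prime(y), g_inv(a), g_inv(b)
)

# Limits swapped so that lower < upper, integrand uses |g'(y)|.
with_abs = midpoint_integral(
    lambda y: p_x(g(y)) * abs(g_prime(y)), g_inv(b), g_inv(a)
)

# Limits in increasing order but no absolute value: a negative "probability".
without_abs = midpoint_integral(
    lambda y: p_x(g(y)) * g_prime(y), g_inv(b), g_inv(a)
)

print(lhs, substituted, with_abs, without_abs)
```

In this example the first three values agree (about $0.5$), while the last one has the opposite sign. That is the sense in which $p_x(g(y))\,g'(y)$ on its own cannot serve as a density in $y$ when $g$ is decreasing.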

Now, if $g$ is increasing, no switching of limits is needed, and since $g' \ge 0$ the absolute value changes nothing. For your text this distinction doesn't matter, since it makes no attempt at integration: it just defines the new probability density. Writing the absolute value covers both cases. The resulting density $p_y$ is meant to be integrated in the "right" direction, from low to high values of $y$.
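The same sign change shows up in the CDF argument from the notes quoted in the question. As a sketch, assume $Y = h(X)$ with $h$ strictly decreasing and $h^{-1}$ differentiable (these assumptions are mine; the quoted notes do not spell them out). Applying $h^{-1}$ to both sides of $h(X) \le y$ now reverses the inequality:

$$\begin{align} F_Y(y) &= P(h(X) \le y) = P(X \ge h^{-1}(y)) = 1 - F_X(h^{-1}(y)), \\ f_Y(y) &= \dfrac{d}{dy}F_Y(y) = -f_X(h^{-1}(y)) \, \dfrac{d}{dy} h^{-1}(y) = f_X(h^{-1}(y)) \left\vert \dfrac{d}{dy} h^{-1}(y) \right\vert. \end{align}$$

The last step uses the fact that $h^{-1}$ is decreasing, so its derivative is non-positive and the leading minus sign turns it into its absolute value. For increasing $h$ the derivative is already non-negative, so the single formula with the absolute value covers both cases.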
