Finding density function $f$ of cumulative distribution function $F$

probability, probability distributions

For an unknown continuous probability density function $f$ on $\mathbb{R}^2$, let $$ F(x,y) := \int_{-\infty}^{x}\int_{-\infty}^{y}f(s,t)\,dt\,ds $$ be its cumulative distribution function, with $x,y\in \mathbb{R}.$ How can we recover the density $f$ from $F$?

The only thing that comes to my mind is to differentiate first with respect to $x$ and then with respect to $y$, but that seems too easy.

Best Answer

Even if $f$ is continuous, the function $F$ can be continuous in each variable without being totally differentiable. So differentiating directly is not justified in general.

But you can do this: think of $F$ as defining a "measure" on regions of the plane, and write $F(A)$ for the "measure" of a region $A$. Then $F(x,y)$ is the "measure" of $(-\infty,x] \times (-\infty,y]$ with respect to $F$.

Now, what does the density $f$ at a point $(x_0,y_0)$ mean? It means that if I take a very small region $V$ containing $(x_0,y_0)$, the "measure" of $V$ with respect to $F$ should be approximately $f(x_0,y_0)$ times the area of $V$ as a geometric region of the plane.

In particular, if I take small rectangles around $(x_0,y_0)$, then $f(x_0,y_0)$ times the area of such a rectangle should approximate the "measure" of that rectangle with respect to $F$.

Suppose I have a rectangle $[x_0-\epsilon,x_0 + \epsilon] \times [y_0- \epsilon,y_0 + \epsilon]$ around $(x_0,y_0)$, with $\epsilon > 0$. Its area is $4 \epsilon^2$ (the usual formula: product of the sides, each of length $2\epsilon$).

What is its "measure" under $F$? For this, draw a diagram and convince yourself that the "measure" of $[x_0-\epsilon,x_0 + \epsilon] \times [y_0- \epsilon,y_0 + \epsilon]$ is equal to $$ F(x_0+\epsilon, y_0+\epsilon) - F(x_0+\epsilon,y_0-\epsilon) - F(x_0-\epsilon,y_0+\epsilon) + F(x_0-\epsilon,y_0-\epsilon) $$
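The four-corner formula can be sanity-checked numerically. The sketch below assumes, purely for illustration, that $F$ is the CDF of two independent standard normal variables (this choice, and the names `phi`, `F` and `rect_measure`, are not from the question). For independent coordinates the probability of the square factorises, which gives an independent value to compare against.

```python
import math

def phi(z):
    """Standard normal CDF, written via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def F(x, y):
    """Assumed example CDF: two independent standard normals."""
    return phi(x) * phi(y)

def rect_measure(F, x0, y0, eps):
    """F-measure of the square of half-width eps centred at (x0, y0),
    computed from the four corner values by inclusion-exclusion."""
    return (F(x0 + eps, y0 + eps) - F(x0 + eps, y0 - eps)
            - F(x0 - eps, y0 + eps) + F(x0 - eps, y0 - eps))

x0, y0, eps = 0.3, -0.5, 0.25
via_corners = rect_measure(F, x0, y0, eps)

# With independent coordinates, P(square) = P(x-interval) * P(y-interval).
direct = (phi(x0 + eps) - phi(x0 - eps)) * (phi(y0 + eps) - phi(y0 - eps))

print(via_corners, direct)  # the two values should agree up to rounding
```

The check only works this neatly because the example CDF is a product; for a general $F$ the corner formula is still valid, but there is no such shortcut to compare with.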

To see this, interpret each term in the sum above as the $F$-measure of a quadrant-shaped region. Add or subtract the overlapping regions according to their signs, and you will see that only the rectangle around $(x_0,y_0)$ remains.

Therefore, the result is, or at least should be: $$ f(x_0,y_0) = \lim_{\epsilon \to 0^+}\frac{F(x_0+\epsilon, y_0+\epsilon) - F(x_0+\epsilon,y_0-\epsilon) - F(x_0-\epsilon,y_0+\epsilon) + F(x_0-\epsilon,y_0-\epsilon)}{4 \epsilon^2} $$
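The limit above can be watched converging in a small numerical experiment. This is only a sketch under an assumed example: $F$ is taken to be the CDF of two independent standard normals, whose density is known in closed form, and the helper names (`phi`, `F`, `f_exact`, `density_estimate`) are made up for the illustration.

```python
import math

def phi(z):
    """Standard normal CDF, written via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def F(x, y):
    """Assumed example CDF: two independent standard normals."""
    return phi(x) * phi(y)

def f_exact(x, y):
    """Known density of the assumed example, used only for comparison."""
    return math.exp(-(x * x + y * y) / 2.0) / (2.0 * math.pi)

def density_estimate(F, x0, y0, eps):
    """Four-corner difference of F divided by the area 4*eps^2."""
    num = (F(x0 + eps, y0 + eps) - F(x0 + eps, y0 - eps)
           - F(x0 - eps, y0 + eps) + F(x0 - eps, y0 - eps))
    return num / (4.0 * eps * eps)

x0, y0 = 0.3, -0.5
for eps in (0.5, 0.1, 0.01, 0.001):
    est = density_estimate(F, x0, y0, eps)
    # Print eps, the estimate, and its absolute error against the true density.
    print(eps, est, abs(est - f_exact(x0, y0)))
```

In floating point, $\epsilon$ should not be made arbitrarily small: the numerator is a difference of nearly equal numbers, so rounding error eventually dominates.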

Use what you know about derivatives and integrals, such as the fundamental theorem of calculus (FTC), to see this. Note that we don't require total differentiability of $F$, only something weaker which is implied by its form as an iterated integral (a statement about partial derivatives).
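One way to carry out that verification, assuming as in the question that $f$ is continuous at $(x_0,y_0)$: write $Q_\epsilon$ (notation introduced here) for the square $[x_0-\epsilon,x_0+\epsilon]\times[y_0-\epsilon,y_0+\epsilon]$. The four-term expression is exactly $\iint_{Q_\epsilon} f(s,t)\,dt\,ds$, and $Q_\epsilon$ has area $4\epsilon^2$, so $$ \left| \frac{1}{4\epsilon^2}\iint_{Q_\epsilon} f(s,t)\,dt\,ds - f(x_0,y_0) \right| = \left| \frac{1}{4\epsilon^2}\iint_{Q_\epsilon} \bigl(f(s,t) - f(x_0,y_0)\bigr)\,dt\,ds \right| \le \sup_{(s,t)\in Q_\epsilon} \left| f(s,t) - f(x_0,y_0) \right| $$ and the right-hand side tends to $0$ as $\epsilon \to 0^+$ by continuity of $f$ at $(x_0,y_0)$.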

However, if the mixed partial derivative $\frac{\partial^2 F}{\partial x \partial y}$ exists and is continuous near $(x_0,y_0)$, then you can show that $$ f(x_0,y_0) = \frac{\partial^2 F}{\partial x \partial y} (x_0,y_0) $$
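As a worked instance of the mixed-partial formula, take an illustrative CDF that is not from the question: $F(x,y) = (1-e^{-x})(1-e^{-y})$ for $x,y > 0$ and $F(x,y) = 0$ otherwise (two independent exponential variables with rate $1$). For $x_0, y_0 > 0$, $$ \frac{\partial F}{\partial y}(x,y) = (1-e^{-x})\,e^{-y}, \qquad \frac{\partial^2 F}{\partial x \partial y}(x_0,y_0) = e^{-x_0}e^{-y_0} = e^{-(x_0+y_0)}, $$ which is the familiar joint density of that pair on the positive quadrant.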

In any case, the limit formula above depends only on $F$, so you can find $f$ at every point where that limit exists (and it does exist almost everywhere, by the Lebesgue differentiation theorem).
