[Math] Converting piecewise CDF to PDF

probability, probability-distributions

I have the following CDF and want to find its mean by first finding the PDF and then integrating over the interval.

$$F(x) = \begin{cases}
0 & x < -2 \\
\frac{x+2}{4} & -2\leq x < 0 \\
\frac{x^2 + 1}{2} & 0\leq x < \frac{1}{2} \\
\frac78 & \frac{1}{2} \leq x < 3 \\
1 & x \geq 3
\end{cases}$$

I understand that I need to differentiate $F(x)$ on each piece. For the third piece, however, do I need to subtract the area of the first two pieces first? Or do I differentiate normally, yielding the PDF below?

$$f(x) = \begin{cases}
0, & x < -2 \\
\frac{1}{4}, & -2\leq x < 0 \\
x, & 0\leq x < \frac{1}{2} \\
0, & \text{otherwise}
\end{cases}$$

Best Answer

You're actually going to have a problem with this approach: the original CDF isn't continuous, which means that your random variable isn't continuous and won't have a genuine density function. You can see what the issue will be by considering, for instance, the behavior at $x = 3$; note that the behavior of the CDF implies that $\mathbb P(X = 3) = 1/8$, which isn't something that continuous random variables do.

If your random variable were continuous, your approach would have been perfect for finding the density; but if you integrate the density you obtained, note that it doesn't give a total area of 1. You'll need to change your approach somewhat to proceed.


So, the original idea (which was good and will work in many important situations!) didn't work here. What to do?

This random variable is a hybrid of a discrete and a continuous random variable. It has discrete jumps at $x = 1/2$ and $x = 3$, but its CDF is continuous everywhere else.

The "density" you found is still useful: its total area is $5/8$ rather than $1$. The probabilities that the random variable equals $1/2$ and $3$ are $1/4$ and $1/8$ respectively, so in total we've accounted for all the probability ($5/8 + 1/4 + 1/8 = 1$).
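
As a quick check of those numbers (a worked sketch using the $F$ and $f$ from the question, where $F(a^-)$ is my notation for the left-hand limit of $F$ at $a$):

$$\int_{-\infty}^{\infty} f(x) \, \textrm d x = \int_{-2}^{0} \frac 1 4 \, \textrm d x + \int_{0}^{1/2} x \, \textrm d x = \frac 1 2 + \frac 1 8 = \frac 5 8$$

$$\mathbb P(X = 1/2) = F(1/2) - F(1/2^-) = \frac 7 8 - \frac 5 8 = \frac 1 4, \qquad \mathbb P(X = 3) = F(3) - F(3^-) = 1 - \frac 7 8 = \frac 1 8$$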

Remember what expected value is supposed to be: for a discrete variable, it's $$\sum_{\text{all possible values}} \text{[value]} \cdot \mathbb P(X \text{ assumes that value})$$ and for a continuous variable it's $$\int_{-\infty}^{\infty} x \cdot f(x) \, \textrm d x$$ where $f(x)$ is the density function. You might guess (correctly) that a mixed approach will work here:

$$\mathbb E[X] = \frac 1 2 \cdot \mathbb P(X = 1/ 2) + 3 \cdot \mathbb P(X = 3) + \int_{-\infty}^{\infty} x \cdot f(x) \, \textrm d x$$ where $f(x)$ is the not-quite-a-density-function you correctly found in your original post.
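
Plugging in the numbers for this particular CDF gives the following (a worked sketch; the jump probabilities $1/4$ and $1/8$ and the two nonzero pieces of $f$ are the ones from the question):

$$\begin{aligned} \mathbb E[X] &= \frac 1 2 \cdot \frac 1 4 + 3 \cdot \frac 1 8 + \int_{-2}^{0} \frac x 4 \, \textrm d x + \int_{0}^{1/2} x^2 \, \textrm d x \\ &= \frac 1 8 + \frac 3 8 - \frac 1 2 + \frac 1 {24} = \frac 1 {24} \end{aligned}$$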

I think this approach is challenging to prove or justify at a non-measure-theoretic level, but hopefully it at least makes some intuitive sense: you're capturing the full range of what $X$ can do, and weighting each possible value of $X$ by its probability mass (at the jumps) or its density (everywhere else).
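
If you'd like to sanity-check the mixed formula by machine, here is a minimal Python sketch that does the arithmetic exactly with rational numbers. The helper `integrate_monomial` and the `pieces` list are names I made up for illustration, and the sketch assumes the density pieces are monomials $c \cdot x^n$, which happens to be true for this problem.

```python
from fractions import Fraction as Fr

def integrate_monomial(coeff, power, a, b):
    """Exact integral of coeff * x**power over [a, b] (hypothetical helper)."""
    n = power + 1
    return coeff * (Fr(b) ** n - Fr(a) ** n) / n

# Point masses: each is the size of a jump in the CDF, F(a) - F(a^-).
left_limit_at_half = (Fr(1, 2) ** 2 + 1) / 2    # limit of (x^2 + 1)/2 as x -> 1/2 from the left
p_half = Fr(7, 8) - left_limit_at_half          # P(X = 1/2)
p_three = Fr(1) - Fr(7, 8)                      # P(X = 3)

# Continuous part, as (coeff, power, a, b):
# f(x) = 1/4 on [-2, 0) and f(x) = x on [0, 1/2).
pieces = [
    (Fr(1, 4), 0, Fr(-2), Fr(0)),
    (Fr(1), 1, Fr(0), Fr(1, 2)),
]

# Area under the not-quite-a-density, and the integral of x * f(x).
continuous_mass = sum(integrate_monomial(c, p, a, b) for c, p, a, b in pieces)
continuous_mean = sum(integrate_monomial(c, p + 1, a, b) for c, p, a, b in pieces)

total_mass = continuous_mass + p_half + p_three
mean = Fr(1, 2) * p_half + 3 * p_three + continuous_mean

print(continuous_mass)  # 5/8
print(total_mass)       # 1
print(mean)             # 1/24
```

The fact that `total_mass` comes out to exactly 1 is a useful check that no probability has been missed before you trust the value of `mean`.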
