Solved – Calculate expectation from empirical cdf

distributions, expected value

Mirror thread on Math.SE

I have an empirical cumulative distribution function for a random variable. The random variable is "time to failure", and I have the full curve, i.e. up to the point where the probability reaches 1. I want to know the Mean Time To Failure, i.e. the expectation of that random variable. Is there a standard method for finding the mean from an empirical distribution?

I am getting the empirical CDF (as discrete values) as output from a "model checking tool", which uses iterative numerical computation techniques to compute those probabilities. For example, let F(t)=P(X<=t) be the CDF of the random variable X, where X stands for the time between failures. To plot the curve of F(t) vs t, I vary t with some step size, calculate F(t) for each t using the "model checking tool", and join the points to get the curve. I can use a smaller step size to get a more accurate curve. So I only have access to the CDF values at different t. From these values I want to get a good estimate of the mean of X.

Best Answer

I know this is an older question, but others may find the answer helpful.

Given the empirical CDF, $F_n(x)$, call the percent points of the CDF $\alpha$ (which range from $0$ to $1$) and their corresponding values VaR$_\alpha$ (Value at Risk). VaR$_\alpha$ is simply $F_n^{-1}(\alpha)$. You can use the fact that: $$ E(X) = \int_0^1 VaR_\alpha\;d\alpha $$

This is actually the dual of the relationship @Macro stated. Instead of adding up vertical slices along the x-axis from $F(x)$ to $1$, you are adding up horizontal slices from the y-axis to $F(x)$, up to $y=1$. It is the same area.
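For a non-negative variable such as time to failure, the two ways of slicing the area can be written side by side. The other answer is not reproduced here, so taking the vertical-slice relationship to be the usual tail formula is an assumption: $$ E(X) = \int_0^\infty \left(1 - F(x)\right)\,dx = \int_0^1 F^{-1}(\alpha)\;d\alpha $$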

To actually do the integration, I'd recommend the trapezoid rule. Given $n$ entries in the CDF, indexed $k = 1, \dots, n$, and writing VaR$_k$ as shorthand for VaR$_{\alpha_k}$, we have: $$ E(X) \approx \sum_{k=1}^{n-1} \frac{VaR_{k+1} + VaR_{k}}{2}\cdot\left(\alpha_{k+1}-\alpha_{k}\right) $$
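As a rough illustration, the trapezoid sum could be coded along these lines in Python with NumPy. This is a minimal sketch, not a definitive implementation: the function name `mean_from_cdf` and the exponential test curve are made up for the example, and it assumes the tabulated CDF starts at (or very near) 0 and ends at (or very near) 1, so that no probability mass sits outside the grid.

```python
import numpy as np

def mean_from_cdf(t, F):
    """Trapezoid-rule estimate of E[X] = integral of F^{-1}(alpha) d(alpha).

    t : grid points at which the CDF was evaluated (increasing)
    F : CDF values at those points (non-decreasing, roughly 0 up to 1)
    """
    t = np.asarray(t, dtype=float)
    F = np.asarray(F, dtype=float)
    # Widths of the horizontal slices: alpha_{k+1} - alpha_k
    d_alpha = np.diff(F)
    # Mid-heights of the slices: (VaR_{k+1} + VaR_k) / 2
    mid_t = 0.5 * (t[1:] + t[:-1])
    return float(np.sum(mid_t * d_alpha))

# Hypothetical check: exponential time to failure with rate 0.5 (true mean 2.0),
# tabulated with step size 0.01 far enough out that F is essentially 1.
t = np.linspace(0.0, 60.0, 6001)
F = 1.0 - np.exp(-0.5 * t)
mttf = mean_from_cdf(t, F)
```

If the first tabulated value of F is noticeably above 0, or the last is noticeably below 1, the missing mass is not counted and the estimate will be biased low; extending the grid or reducing the step size are the obvious remedies.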
