How to calculate a partial expected value of a beta distribution (mean of a truncated beta)

beta distribution, mean, median, python, truncation

[Figure: Beta distribution with a=2, b=3, percentile x=0.4]

Given a beta distribution with a=2, b=3, the expected value (mean) over the full interval [0, 1] is a/(a+b) = 2/5 = 0.4, and the median is approximately (a - 1/3)/(a+b - 2/3) ≈ 0.385. The two are close.

I am looking for a solution in Python. I can use scipy.stats.beta to calculate the median for the interval [0, 0.4] with the percent point function (the inverse of the CDF, which gives percentiles):

beta.ppf(beta.cdf(0.4, a, b)/2, a, b) = 0.2504

Since the overall mean and median of this beta distribution are close (0.4 and about 0.385 respectively), I use the median for the interval [0, 0.4] to estimate the expected value (mean) for that interval.

Is there any way to calculate the expected value (mean) for the interval [0, 0.4] directly?

Best Answer

Note that the formula you have near the top there for the beta median ($\frac{\alpha-\frac13}{\alpha+\beta-\frac23}$) is approximate. You should be able to compute an effectively "exact" numerical median with the inverse cdf (quantile function) of the beta distribution in Python (for a $\text{beta}(2,3)$ I get a median of around $0.3857$ while that approximate formula gives $0.3846$).
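As a quick check of those two numbers, here is a minimal sketch, assuming `scipy` is installed. The variable names are only illustrative.

```python
from scipy.stats import beta

a, b = 2, 3

# Closed-form approximation quoted in the question
approx_median = (a - 1/3) / (a + b - 2/3)

# Numerical median: the 0.5 quantile from the inverse CDF (ppf)
exact_median = beta.ppf(0.5, a, b)

print(f"approximate: {approx_median:.4f}")  # approximate: 0.3846
print(f"numerical:   {exact_median:.4f}")   # numerical:   0.3857
```

`beta.median(a, b)` should give the same numerical value as `beta.ppf(0.5, a, b)`.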

The mean of a truncated distribution is pretty straightforward with a beta. For a positive random variable $X$ with density $f$ we have

$E(X|X<k) = \int_0^k x\,f(x)\, dx / \int_0^k f(x)\, dx$

where in this case $f$ is the density of a beta with parameters $\alpha$ and $\beta$ (which I'll now write as $f(x;\alpha,\beta)$):

$f(x;\alpha,\beta)=\frac{1}{B(\alpha,\beta)} x^{\alpha-1} (1-x)^{\beta-1}\,,\:0<x<1,\alpha,\beta>0$

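Hence $x\,f(x;\alpha,\beta) = \frac{B(\alpha+1,\beta)}{B(\alpha,\beta)} f(x;\alpha+1,\beta)=\frac{\alpha}{\alpha+\beta} f(x;\alpha+1,\beta)$, where the last step uses $B(\alpha,\beta)=\frac{\Gamma(\alpha)\Gamma(\beta)}{\Gamma(\alpha+\beta)}$ and the recurrence $\Gamma(z+1)=z\,\Gamma(z)$.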

So $E(X|X<k) = \frac{\alpha}{\alpha+\beta} \int_0^k f(x;\alpha+1,\beta)\, dx / \int_0^k f(x;\alpha,\beta)\,dx$

Now the two integrals are just beta CDFs, $F(k;\alpha+1,\beta)$ and $F(k;\alpha,\beta)$, which you have available in Python already.
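The ratio of CDFs can be sketched in a few lines with `scipy.stats.beta`. The function name `truncated_beta_mean` is my own choice for illustration, not something scipy provides, and the sketch assumes $0<k\le 1$.

```python
from scipy.stats import beta

def truncated_beta_mean(a, b, k):
    """E(X | X < k) for X ~ Beta(a, b), assuming 0 < k <= 1.

    Hypothetical helper: alpha/(alpha+beta) times a ratio of two beta CDFs.
    """
    return a / (a + b) * beta.cdf(k, a + 1, b) / beta.cdf(k, a, b)

print(f"{truncated_beta_mean(2, 3, 0.4):.5f}")  # 0.24195
```

With `k = 1` both CDFs equal 1, so the function falls back to the ordinary mean `a / (a + b)`.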

With $\alpha=2,\beta=3,k=0.4$ we get $E(X|X<0.4)\approx 0.24195$. This is consistent with simulation ($10^6$ simulations gave $\approx 0.24194$).
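A simulation check along these lines might look like the sketch below. It assumes NumPy is available; the seed is arbitrary, so the exact digits will differ from the figure quoted above.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, only for reproducibility

# Draw from the untruncated Beta(2, 3) and keep only the draws below k = 0.4
samples = rng.beta(2, 3, size=1_000_000)
kept = samples[samples < 0.4]

sim_mean = kept.mean()
print(f"kept {kept.size} draws, simulated mean = {sim_mean:.5f}")
```

Roughly 52% of the draws survive the cutoff, since $F(0.4;2,3)\approx 0.5248$, so the estimate is based on about half a million values.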

For the median, I get $F^{-1}(\frac12 F(0.4;2,3);2,3)\approx 0.25040$, which is again consistent with simulation ($10^6$ simulations gave $\approx 0.25038$).
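The same median can be computed in Python as follows. This is a sketch assuming `scipy`; `truncated_beta_median` is a hypothetical helper name, not a scipy function.

```python
from scipy.stats import beta

def truncated_beta_median(a, b, k):
    """Median of X ~ Beta(a, b) conditional on X < k, assuming 0 < k <= 1.

    Half of the probability mass below k lies below the returned value.
    """
    return beta.ppf(0.5 * beta.cdf(k, a, b), a, b)

print(f"{truncated_beta_median(2, 3, 0.4):.5f}")  # 0.25040
```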

The two are pretty close in this case but that's not a general result; they may sometimes be more substantially different.