Given a beta distribution with a = 2, b = 3, the expected value (mean) over the full interval [0, 1] is a/(a+b) = 2/5 = 0.4 and the median is approximately (a - 1/3)/(a+b - 2/3) ≈ 0.385, so the two are close.
I am looking for a solution in Python. I can use scipy.stats.beta to calculate the median for the interval [0, 0.4] with the percent point function (the inverse of the CDF, i.e. percentiles):
beta.ppf(beta.cdf(0.4, a, b)/2, a, b) = 0.2504
Since the overall mean and median of this beta distribution are close (0.4 and 0.39 respectively), I use the median for the interval [0, 0.4] to estimate the expected value (mean) for that interval.
Is there a way to calculate the expected value (mean) for the interval [0, 0.4] directly?
Best Answer
Note that the formula you have near the top there for the beta median ($\frac{\alpha-\frac13}{\alpha+\beta-\frac23}$) is approximate. You should be able to compute an effectively "exact" numerical median with the inverse cdf (quantile function) of the beta distribution in Python (for a $\text{beta}(2,3)$ I get a median of around $0.3857$ while that approximate formula gives $0.3846$).
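As a quick check of those two numbers, here is a minimal sketch using `scipy.stats.beta` (the variable names are my own, chosen for illustration):

```python
from scipy.stats import beta

a, b = 2, 3

# Numerically "exact" median from the quantile function (inverse CDF)
exact_median = beta.ppf(0.5, a, b)

# Closed-form approximation quoted in the question
approx_median = (a - 1/3) / (a + b - 2/3)

print(f"exact  median: {exact_median:.4f}")
print(f"approx median: {approx_median:.4f}")
```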
The mean of a truncated distribution is pretty straightforward to compute for a beta. For a positive random variable we have
$E(X|X<k) = \int_0^k x\,f(x)\, dx / \int_0^k f(x)\, dx$
where in this case $f$ is the density of a beta with parameters $\alpha$ and $\beta$ (which I'll now write as $f(x;\alpha,\beta)$):
$f(x;\alpha,\beta)=\frac{1}{B(\alpha,\beta)} x^{\alpha-1} (1-x)^{\beta-1}\,,\:0<x<1,\alpha,\beta>0$
Hence $x\,f(x;\alpha,\beta) = \frac{1}{B(\alpha,\beta)} x^{\alpha} (1-x)^{\beta-1} = \frac{B(\alpha+1,\beta)}{B(\alpha,\beta)} f(x;\alpha+1,\beta)=\frac{\alpha}{\alpha+\beta} f(x;\alpha+1,\beta)$
So $E(X|X<k) = \frac{\alpha}{\alpha+\beta} \int_0^k f(x;\alpha+1,\beta)\, dx / \int_0^k f(x;\alpha,\beta)\,dx$
Now the two integrals are just beta CDFs, $F(k;\alpha+1,\beta)$ and $F(k;\alpha,\beta)$, which you already have available in Python (for example via `scipy.stats.beta.cdf`).
With $\alpha=2,\beta=3,k=0.4$ we get $E(X|X<0.4)\approx 0.24195$. This is consistent with simulation ($10^6$ simulations gave $\approx 0.24194$).
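The ratio of CDFs above translates almost directly into code. This is only a sketch, assuming `scipy` is available; `truncated_beta_mean` is a helper name I made up for illustration, not a SciPy function:

```python
from scipy.stats import beta

def truncated_beta_mean(a, b, k):
    """E(X | X < k) for X ~ beta(a, b), with 0 < k <= 1.

    Uses the identity x*f(x; a, b) = a/(a+b) * f(x; a+1, b), so the
    numerator and denominator integrals both reduce to beta CDFs.
    """
    return a / (a + b) * beta.cdf(k, a + 1, b) / beta.cdf(k, a, b)

mean_below = truncated_beta_mean(2, 3, 0.4)
print(f"E(X | X < 0.4) = {mean_below:.5f}")
```

Setting `k = 1` makes both CDFs equal to 1, so the function falls back to the ordinary mean $\alpha/(\alpha+\beta)$, which is a handy sanity check.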
For the median, I get $F^{-1}(\frac12 F(0.4;2,3);2,3)\approx 0.25040$, which is again consistent with simulation ($10^6$ simulations gave $\approx 0.25038$).
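The truncated median and a simulation check along the same lines could be sketched as follows. The seed and variable names are arbitrary choices of mine, and the simulated figures will vary slightly with the seed:

```python
import numpy as np
from scipy.stats import beta

a, b, k = 2, 3, 0.4

# Median of X given X < k: invert the CDF at half the probability mass below k
median_below = beta.ppf(0.5 * beta.cdf(k, a, b), a, b)

# Monte Carlo check: draw from beta(a, b) and keep only the draws below k
rng = np.random.default_rng(0)  # arbitrary seed, for reproducibility only
draws = rng.beta(a, b, size=1_000_000)
kept = draws[draws < k]

print(f"quantile-function median: {median_below:.5f}")
print(f"simulated median:         {np.median(kept):.5f}")
print(f"simulated mean:           {kept.mean():.5f}")
```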
The two are pretty close in this case but that's not a general result; they may sometimes be more substantially different.