According to the documentation, Matlab estimates the parameters of the Beta distribution by maximum likelihood. In this case, I am afraid the sampling distributions of the estimators are not available in closed form. Let $\{X_i\}_{i=1}^n$ be a set of i.i.d. observations from a standard Beta distribution $B(\alpha,\beta)$. Obtaining the MLEs $\hat{\alpha}$ and $\hat{\beta}$ requires solving the simultaneous equations \begin{align}\psi(\hat{\alpha})-\psi(\hat{\alpha}+\hat{\beta})&=\frac{1}{n}\sum_{i=1}^n\ln (X_i),\\\psi(\hat{\beta})-\psi(\hat{\alpha}+\hat{\beta})&=\frac{1}{n}\sum_{i=1}^n\ln (1-X_i),\end{align} where $\psi$ denotes the digamma function (see, e.g., Gnanadesikan, Pinkham, and Hughes, "Maximum Likelihood Estimation of the Parameters of the Beta Distribution from Smallest Order Statistics", for more details). As far as I know, no general analytical solution is available, let alone the exact distributions of $\hat{\alpha}$ and $\hat{\beta}$.

It is possible to solve them in special cases, however. For example, when $\beta$ is known to be exactly $1$, the recurrence $\psi(\hat{\alpha}+1)=\psi(\hat{\alpha})+1/\hat{\alpha}$ reduces the first equation to $$\hat{\alpha}=-\frac{n}{\sum_{i=1}^n\ln (X_i)},$$ and one can show that $-\sum_{i=1}^n\ln (X_i)\sim\varGamma(n,1/\alpha)$, which implies that $\hat{\alpha}$ has a scaled inverse-Gamma distribution.

Other than that, one has to resort to the asymptotic normality of MLEs to make inferences about $\hat{\alpha}$ and $\hat{\beta}$; see, e.g., Lau and Lau, "Effective Procedures for Estimating Beta Distribution's Parameters and Their Confidence Intervals", for a procedure using the Pearson distribution. In addition, one may use this normality to derive confidence intervals for the mean, since $$\hat{\mu}=\frac{\hat{\alpha}}{\hat{\alpha}+\hat{\beta}}$$ has a correlated normal ratio distribution if $\hat{\alpha}$ and $\hat{\beta}$ are jointly normal.
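A quick simulation can confirm the $\beta=1$ special case numerically. This is only a sketch; the sample size, $\alpha$, number of replicates, and seed are arbitrary choices of mine. Since $S=-\sum_i\ln X_i\sim\varGamma(n,1/\alpha)$, the estimator $\hat\alpha=n/S$ should have mean $n\alpha/(n-1)$:

```python
import numpy as np

# Illustrative settings (not from the original post)
rng = np.random.default_rng(0)
n, alpha, reps = 20, 2.0, 20000

# Simulate many samples from Beta(alpha, 1) and compute the
# closed-form MLE alpha_hat = -n / sum(log X_i) for each sample.
x = rng.beta(alpha, 1.0, size=(reps, n))
alpha_hat = -n / np.log(x).sum(axis=1)

# If S ~ Gamma(n, scale 1/alpha), then E[alpha_hat] = n * alpha / (n - 1).
print(alpha_hat.mean())          # should be close to n * alpha / (n - 1)
print(n * alpha / (n - 1))
```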
For your second question, I take it that logs of the estimates are used to cope with the constraint on the parameter space. Note that both $\alpha$ and $\beta$ must lie in $(0,\infty)$, but if, say, $\hat{\alpha}=1$ and you apply the normal approximation directly, you may end up with a confidence interval such as $(-0.5,2.5)$, which includes impossible negative values. You can get around this by constructing the interval on the log scale and then mapping it back, though I am not sure how accurate the result will be.
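A minimal numerical sketch of that log-scale trick, using made-up values for the estimate and its standard error (chosen only to roughly reproduce the interval in the text):

```python
import numpy as np

# Hypothetical estimate and asymptotic standard error (illustrative only)
alpha_hat, se = 1.0, 0.765
z = 1.96

# Naive normal interval on the original scale can go negative:
naive = (alpha_hat - z * se, alpha_hat + z * se)   # about (-0.5, 2.5)

# Delta method on the log scale: se(log alpha_hat) ~= se / alpha_hat.
# Back-transforming keeps the interval inside (0, inf).
log_se = se / alpha_hat
log_ci = (np.exp(np.log(alpha_hat) - z * log_se),
          np.exp(np.log(alpha_hat) + z * log_se))

print(naive)
print(log_ci)
```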
As an interesting side note, the book cited in the documentation (Hahn and Shapiro's Statistical Models in Engineering) actually discusses method-of-moments estimation rather than maximum likelihood, so it is not clear to me why it is cited there.
Note that the formula you have near the top there for the beta median, $\frac{\alpha-\frac13}{\alpha+\beta-\frac23}$, is only approximate. You should be able to compute an effectively "exact" numerical median with the inverse CDF (quantile function) of the beta distribution in Python (for a $\text{beta}(2,3)$ I get a median of around $0.3857$, while the approximate formula gives $0.3846$).
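For instance, assuming scipy is available, the exact and approximate medians compare as follows:

```python
from scipy.stats import beta

a, b = 2, 3
exact = beta.ppf(0.5, a, b)           # "exact" numerical median via inverse CDF
approx = (a - 1/3) / (a + b - 2/3)    # the approximate formula from the question

print(exact, approx)                  # about 0.3857 vs about 0.3846
```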
The mean of a truncated distribution is pretty straightforward for a beta. For a positive random variable we have
$E(X|X<k) = \int_0^k x\,f(x)\, dx / \int_0^k f(x)\, dx$
where in this case $f$ is the density of a beta with parameters $\alpha$ and $\beta$ (which I'll now write as $f(x;\alpha,\beta)$):
$f(x;\alpha,\beta)=\frac{1}{B(\alpha,\beta)} x^{\alpha-1} (1-x)^{\beta-1}\,,\:0<x<1,\alpha,\beta>0$
Hence $x\,f(x) = \frac{B(\alpha+1,\beta)}{B(\alpha,\beta)} f(x;\alpha+1,\beta)=\frac{\alpha}{\alpha+\beta} f(x;\alpha+1,\beta)$
So $E(X|X<k) = \frac{\alpha}{\alpha+\beta} \int_0^k f(x;\alpha+1,\beta)\, dx / \int_0^k f(x;\alpha,\beta)\,dx$
Now the two integrals are just beta CDFs which you have available in Python already.
With $\alpha=2,\beta=3,k=0.4$ we get $E(X|X<0.4)\approx 0.24195$. This is consistent with simulation ($10^6$ simulations gave $\approx 0.24194$).
For the median, I get
$F^{-1}(\frac12 F(0.4;2,3);2,3)\approx 0.25040$, which is again consistent with simulation ($10^6$ simulations gave $\approx 0.25038$).
The two are pretty close in this case but that's not a general result; they may sometimes be more substantially different.
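The computations above fit in a few lines of Python using the beta CDF and quantile function from scipy:

```python
from scipy.stats import beta

a, b, k = 2, 3, 0.4

# Truncated mean: E(X | X < k) = (a/(a+b)) * F(k; a+1, b) / F(k; a, b)
mean_trunc = (a / (a + b)) * beta.cdf(k, a + 1, b) / beta.cdf(k, a, b)

# Truncated median: solve F(x; a, b) = F(k; a, b) / 2 via the quantile function
median_trunc = beta.ppf(0.5 * beta.cdf(k, a, b), a, b)

print(mean_trunc, median_trunc)   # about 0.24195 and 0.25040
```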
Best Answer
While there are specific methods for calculating confidence intervals for the parameters of a beta distribution, I'll describe a few general methods that can be used for (almost) all sorts of distributions, including the beta distribution, and that are easily implemented in R.
Profile likelihood confidence intervals
Let’s begin with maximum likelihood estimation with corresponding profile likelihood confidence intervals. First we need some sample data:
The real/theoretical mean is
Now we have to create a function for calculating the negative log-likelihood for a sample from the beta distribution, with the mean as one of the parameters. We can use the `dbeta()` function, but since this doesn't use a parametrisation involving the mean, we have to express its parameters (α and β) as a function of the mean and some other parameter (like the standard deviation).

To find the maximum likelihood estimate, we can use the `mle()` function in the `stats4` library. Just ignore the warnings for now; they're caused by the optimisation algorithm trying invalid values for the parameters, giving negative values for α and/or β. (To avoid the warnings, you can add a `lower` argument and change the optimisation `method` used.)

Now we have both estimates and confidence intervals for our two parameters:
Note that, as expected, the confidence intervals are not symmetrical:
(The second-outer magenta lines show the 95% confidence interval.)
Also note that even with just 10 observations, we get very good estimates (a narrow confidence interval).
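Since the original R listings are not reproduced above, here is a rough equivalent of the same workflow in Python. The simulated Beta(2, 4) sample, the mean/precision parametrisation, and all names are my own illustrative choices, not the original code:

```python
import numpy as np
from scipy.optimize import minimize, minimize_scalar
from scipy.stats import beta, chi2

# Hypothetical sample data (true mean is 1/3)
rng = np.random.default_rng(1)
x = rng.beta(2, 4, size=10)

def nll(mu, nu):
    # Beta reparametrised by mean mu and precision nu:
    # alpha = mu * nu, beta = (1 - mu) * nu
    return -beta.logpdf(x, mu * nu, (1 - mu) * nu).sum()

# Joint maximum likelihood fit of (mu, nu)
fit = minimize(lambda p: nll(*p), x0=[x.mean(), 10.0],
               method="L-BFGS-B",
               bounds=[(1e-3, 1 - 1e-3), (1e-2, 1e4)])
mu_hat = fit.x[0]

def profile_nll(mu):
    # Minimise the negative log-likelihood over nu for fixed mu
    return minimize_scalar(lambda nu: nll(mu, nu),
                           bounds=(1e-2, 1e4), method="bounded").fun

# Profile likelihood 95% CI: keep mu where the deviance
# 2 * (profile_nll(mu) - min nll) stays below the chi2(1) quantile.
cutoff = chi2.ppf(0.95, df=1) / 2
grid = np.linspace(0.02, 0.98, 193)
inside = [mu for mu in grid if profile_nll(mu) - fit.fun <= cutoff]
ci = (min(inside), max(inside))
print(mu_hat, ci)
```

As in the R version, the resulting interval need not be symmetric around the estimate.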
As an alternative to `mle()`, you can use the `fitdistr()` function from the `MASS` package. This too calculates the maximum likelihood estimate, and has the advantage that you only need to supply the density, not the negative log-likelihood, but it doesn't give you profile likelihood confidence intervals, only asymptotic (symmetrical) ones. A better option is `mle2()` (and related functions) from the `bbmle` package, which is somewhat more flexible and powerful than `mle()`, and gives slightly nicer plots.

Bootstrap confidence intervals
Another option is to use the bootstrap. It’s extremely easy to use in R, and you don’t even have to supply a density function:
The bootstrap has the added advantage that it works even if your data doesn’t come from a beta distribution.
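A percentile-bootstrap sketch of a confidence interval for the mean in Python (the sample, seed, and replicate count are illustrative):

```python
import numpy as np

# Hypothetical sample data
rng = np.random.default_rng(2)
x = rng.beta(2, 4, size=10)

# Resample with replacement, compute the mean of each resample,
# and take the 2.5% and 97.5% percentiles of the bootstrap means.
boot = rng.choice(x, size=(10000, x.size), replace=True).mean(axis=1)
ci = np.percentile(boot, [2.5, 97.5])

print(x.mean(), ci)
```

Note that nothing here assumes a beta distribution; only resampling from the observed data is used.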
Asymptotic confidence intervals
For confidence intervals on the mean, let’s not forget the good old asymptotic confidence intervals based on the central limit theorem (and the t-distribution). As long as we have either a large sample size (so the CLT applies and the distribution of the sample mean is approximately normal) or large values of both α and β (so that the beta distribution itself is approximately normal), it works well. Here we have neither, but the confidence interval still isn’t too bad:
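The t-based interval can be sketched in a few lines of Python (again with an illustrative sample):

```python
import numpy as np
from scipy import stats

# Hypothetical sample data
rng = np.random.default_rng(3)
x = rng.beta(2, 4, size=10)

# Classic t interval: mean +/- t_{0.975, n-1} * s / sqrt(n)
m, se = x.mean(), x.std(ddof=1) / np.sqrt(x.size)
tcrit = stats.t.ppf(0.975, df=x.size - 1)

print(m - tcrit * se, m + tcrit * se)
```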
For just slightly larger values of $n$ (and not too extreme values of the two parameters), the asymptotic confidence interval works exceedingly well.