Solved – Logarithm of incomplete Beta function for large $\alpha,\beta$

beta distributionr

R's function pbeta is supposed to calculate the incomplete regularized Beta function. If the flag log=TRUE is passed as an argument, the logarithm of the result is returned instead.

I found the following numerical error. Evaluating:

pbeta(0.5555555, 1925.74, 33.7179, log=TRUE)

returns -Inf, which is obviously wrong. The correct result is (calculated using Mathematica):

$$\log\mathrm{I}_{0.5555555}(1925.74,33.7179) = -994.767594138466967$$

For all the other values I have tested, pbeta gives the correct answer. I've inspected the source code of pbeta but cannot pin-point the error.

I need an accurate method to compute the logarithm of the incomplete Beta function for large values of $\alpha,\beta$ (on the order of thousands). Can someone suggest a fix on pbeta or a workaround? Or perhaps a different library?

R bug report: https://bugs.r-project.org/bugzilla3/show_bug.cgi?id=16332

Best Answer

The Beta incomplete function can be written with the help of the Gauss hypergeometric function: $$B_x(a,b)=\frac{1}{a}x^a{(1-x)}^b F(a+b,1,a+1,x)$$ and the CDF of the $Beta(a,b)$ distribution evaluated at $x$ is $$B_x(a,b)/B(a,b).$$ A good implementation of the Gauss hypergeometric function is provided in the gsl package - a wrapper for the special functions of the Gnu Scientific Library. So you can write the log-CDF like this:

library(gsl)
logpbeta <- function(x,a,b) log(hyperg_2F1(a+b,1,a+1,x)) + a*log(x)+b*log(1-x)-log(a) - lbeta(a,b)

And it gives the same result as Mathematica for your example:

> logpbeta(0.5555555, 1925.74, 33.7179)
[1] -994.7676

I don't know for which range the parameters the result is correct. And I have not find the answer in the GNU reference manual.

Note that the zero you obtain with pbeta seems to be due to the evaluation of $B_x(a,b)$ followed by the log-transformation:

> x <- 0.5555555 
> a <- 1925.74
> b <- 33.7179
> log(hyperg_2F1(a+b,1,a+1,x)*x^a*(1-x)^b/a)
[1] -Inf
> hyperg_2F1(a+b,1,a+1,x)
[1] 2.298761
> x^a*(1-x)^b/a
[1] 0

Related Solutions

Solved – How to (numerically) approximate values for a beta distribution with large alpha & beta

A Normal approximation works extremely well, especially in the tails. Use a mean of $\alpha/(\alpha+\beta)$ and a variance of $\frac{\alpha\beta}{(\alpha+\beta)^{2} (1+\alpha+\beta)}$. For example, the absolute relative error in the tail probability in a tough situation (where skewness might be of concern) such as $\alpha = 10^6, \beta = 10^8$ peaks around $0.00026$ and is less than $0.00006$ when you're more than 1 SD from the mean. (This is not because beta is so large: with $\alpha = \beta = 10^6$, the absolute relative errors are bounded by $0.0000001$.) Thus, this approximation is excellent for essentially any purpose involving 99% intervals.

In light of the edits to the question, note that one does not compute beta integrals by actually integrating the integrand: of course you'll get underflows (although they don't really matter, because they don't contribute appreciably to the integral). There are many, many ways to compute the integral or approximate it, as documented in Johnson & Kotz (Distributions in Statistics). An online calculator is found at http://www.danielsoper.com/statcalc/calc37.aspx. You actually need the inverse of this integral. Some methods to compute the inverse are documented on the Mathematica site at http://functions.wolfram.com/GammaBetaErf/InverseBetaRegularized/. Code is provided in Numerical Recipes. A really nice online calculator is the Wolfram Alpha site (www.wolframalpha.com ): enter inverse beta regularized (.005, 1000000, 1000001) for the left endpoint and inverse beta regularized (.995, 1000000, 1000001) for the right endpoint ($\alpha=1000000, \beta=1000001$, 99% interval).

Solved – Approximation for Beta distribution when alpha is less than 10

Johnson and Kotz point out that

The $B$ (Beta) and $F$ distributions are closely related: When $k$ has a $B(\nu_1, \nu_2)$ distribution, then $f$ has an $F(2\nu_1, 2\nu_2)$ distribution where

$$f = \frac{k \nu_2}{(1-k)\nu_1}.$$
The Wilson-Hilferty approximation to $\chi^2$ (or, equivalently, a $\Gamma$ distribution)--which has "remarkable accuracy"--can be applied to $F$, which itself is a ratio of $\chi^2$ distributions.

The result is that

$$z = \frac{\sqrt[3]{f} \left(1-\frac{2}{9 \nu_2}\right)+\left(\frac{2}{9 \nu_1}-1 \right)}{\sqrt{\frac{2 f^{2/3}}{9 \nu_2}+\frac{2}{9 \nu_1}}}$$

has approximately a standard Normal distribution. This provides relatively fast calculations of the CDF of either the $B$ or the $F$ distribution and is not difficult to invert.

This approximation works fairly well for $\nu_2 \ge 1000$ and $\nu_1=1$ and progressively better as $\nu_1$ increases. Here is a set of plots of the difference between the approximate CDF and the correct CDF for $\nu_2=1000$ and $\nu_1 = 1,2,3,6,10$, shown with identical scales on their axes:

CDF error plots

The error is extremely low in the right tail, where interest often focuses, but even in the left tail it rarely exceeds $0.002$ once $\nu_1 \ge 2$. As $\nu_2$ increases, the errors remain about the same.

Given the level of accuracy of this approximation, it makes little sense to compute the standard Normal CDF (or its inverse) to double-precision accuracy. This will allow for even faster computation.

One approach is to create a table of values and interpolate within it. Because the Normal CDF is so curved near the tails, it would seem a largish table might be needed for high accuracy. However, by considering Mills Ratio one might be led instead to create a (very) small table of pairs $(z, \sqrt{-2\log(\Phi(z))}$ for $z$ in the lower tail, say for $-8 \le z \le 0$, where $\Phi$ is the standard Normal CDF. Cubic spline interpolation does a great job even when these pairs are evaluated only for integral $z$ (nine values total in the entire table). To further simplify the programming (if a cubic spline interpolation routine is not readily available) one could implement a quadratic spline. Even better--and barely more difficult--is to form the weighted average of a linear interpolator (which tends to underestimate the CDF) and a quadratic interpolator (which overestimates the CDF). A weight of $0.238$ applied to the linear interpolator works well:

Comparison with normal CDF

This plot shows the differences between three interplators and the standard Normal CDF: the red one is for the linear interpolator, the gold for the quadratic interpolator, and the blue (in between) for their weighted average.

(The same technique of averaging in a linear interpolator also decreases the error by an order of magnitude when using a cubic spline.)

The weighted average is another quadratic interpolator but it is no longer a spline (it fails to have continuous first derivatives at the negative integers)--but so what? In effect, it computes the CDF with absolute accuracy better than $0.00005$ for $z\le -1$ and, by symmetry, for $z\ge 1$. Quadratic or cubic interpolation between a small number of pairs of $(z, \Phi(z))$ should work fine for $-1 \lt z \lt 1$ if such calculations are needed. (Typically only very low accuracy, say to about $0.01$, is needed in this range anyway.) The upshot is that the CDF (or, using similar methods, its inverse) can be computed to sufficient accuracy using about a dozen simple arithmetic operations and one exponentiation, at a cost of storing around a dozen precomputed values: even on an interpreted platform like Java that should go very quickly, taking a small fraction of a microsecond.

Examples

Mathematica code to produce the first figure follows.

fDist[f_, {ν1_, ν2_}] := CDF[NormalDistribution[]][((1-2/(9 ν2)) f^(1/3) - (1-2/(9 ν1))) /
    Sqrt[2 f^(2/3)/(9 ν2) + 2/(9 ν1)]];
f[k_, {ν1_, ν2_}] := (k ν2)/((1 - k) ν1);
With[{ν1 = 1, ν2 = 1000},
 Plot[fDist[f[k,{ν1,ν2}], {2 ν1,2 ν2}] - CDF[FRatioDistribution[2 ν1,2 ν2]][f[k,{ν1,ν2}]], 
    {k, 0, 10 ν1/(ν1 + ν2)}, PlotRange -> Full]
 ]

Code to perform the interpolation for $|z| \ge 1$ and plot its error:

d = Table[{z, Sqrt[-2 Log[CDF[NormalDistribution[], z]]]}, {z, -8, -0, 1}];
f1 = Interpolation[d, InterpolationOrder -> 1];
f2 = Interpolation[d, InterpolationOrder -> 2];
f[z_] := 0.238 f1[z] + (1 - 0.238) f2[z];
Plot[{Exp[-f[z]^2/2 ] - CDF[NormalDistribution[], z]}, {z, -4, -1}, PlotRange -> Full]

References

Norman L. Johnson and Samuel Kotz, Continuous Univariate Distributions - 2. J. Wiley & Sons, New York, 1970.