Solved – beta-binomial distribution with R

beta distribution, beta-binomial distribution, r, random-generation

I am studying an experiment of the following kind:
Let $n_{ij}$ be the number of fetuses and $X_{ij}$ the number of responses, i.e. the number of fetuses with a malformation, in the $j$th litter of the $i$th dose level, for $j=1,\dots,25$ and $i=1,\dots,5$.
Then $p_{ij}$ is the probability of response in the $j$th litter of the $i$th dose level, and hence we have:
$$
X_{ij} \mid p_{ij} \sim \mathrm{Bin}(n_{ij},p_{ij})
$$
But the probabilities of response $p_{ij}$ follow a beta distribution, hence
$$
f(p_{ij})=B^{-1}(\alpha_i , \beta_i )\,p_{ij}^{\alpha_i-1}(1-p_{ij})^{\beta_i-1}
$$
and hence, in the end, $X_{ij}$ follows a beta-binomial distribution.
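
For reference, here is a sketch of why the marginal is beta-binomial, assuming the standard beta density $B^{-1}(\alpha_i,\beta_i)\,p^{\alpha_i-1}(1-p)^{\beta_i-1}$ for $p_{ij}$. Integrating $p_{ij}$ out of the binomial probability gives

$$
P(X_{ij}=x_{ij}) = \int_0^1 \binom{n_{ij}}{x_{ij}} p^{x_{ij}}(1-p)^{n_{ij}-x_{ij}} \, \frac{p^{\alpha_i-1}(1-p)^{\beta_i-1}}{B(\alpha_i,\beta_i)} \, dp = \binom{n_{ij}}{x_{ij}} \frac{B(\alpha_i+x_{ij},\, \beta_i+n_{ij}-x_{ij})}{B(\alpha_i,\beta_i)}
$$

since the integrand is an unnormalized beta density with parameters $\alpha_i+x_{ij}$ and $\beta_i+n_{ij}-x_{ij}$. The mean of this distribution is $n_{ij}\,\alpha_i/(\alpha_i+\beta_i)$.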

My problem is that I have to generate the number of responses $X_{ij}$, but I'm having some trouble.

The data that I have are all the $n_{ij}$ and $p_1, p_2, p_3, p_4, p_5$, i.e. the response probability for each dose group (these probabilities follow a logistic model); hence I don't have the $p_{ij}$.

What should I do? I think that I should first generate the $p_{ij}$ using the fact that they follow a beta distribution. But how should I do that? How do I estimate the parameters $\alpha_i$ and $\beta_i$?

Maybe someone has some ideas.
Thank you in advance!

Best Answer

In a Bayesian approach, you would not estimate $\alpha_i$ and $\beta_i$, but these would be prior beliefs supplied by you.

I am not so sure what $i$ and $j$ in your problem exactly represent, but here is some baseline code for the Beta-Binomial distribution that hopefully gets you started. It is similar in spirit to a Gibbs sampler:

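# Simulate Beta-Binomial draws by composition:
# p ~ Beta(alpha, beta), then X | p ~ Bin(n, p)
R <- 10000       # draws discarded as burn-in
n_keep <- 20000  # draws kept (named T originally, which masks the TRUE shorthand)
x <- matrix(NA, R + n_keep, 2)   # column 1: X, column 2: p

n <- 10
alpha <- 7
beta <- 19

x[1, ] <- c(1, .5)   # arbitrary starting values

for (i in 2:(R + n_keep)) {
  x[i, 2] <- rbeta(1, alpha, beta)
  x[i, 1] <- rbinom(1, n, x[i, 2])
}
x <- x[(R + 1):(R + n_keep), ]

# Empirical frequencies versus the theoretical Beta-Binomial pmf
plot(table(x[, 1]) / n_keep, col = "sienna4", type = "p", pch = 20)
betabinomialdensity <- function(x, n, alpha, beta) {
  # base::beta is called explicitly because the argument `beta` shadows the name
  choose(n, x) * base::beta(alpha + x, beta + n - x) / base::beta(alpha, beta)
}
points(0:n, betabinomialdensity(0:n, n, alpha, beta), type = "o", pch = 22, lty = 2, col = "red")

AvgDraws <- mean(x[, 1])
TheoreticalExpectedValue <- n * alpha / (alpha + beta)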
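
For the setup in the question (known $n_{ij}$ and group-level $p_i$), one possible way to choose $\alpha_i$ and $\beta_i$ is to match the beta mean to $p_i$ and control the spread through an intra-litter correlation $\rho = 1/(\alpha_i+\beta_i+1)$. The sketch below does this; note that the value of `rho`, the litter sizes, and the logistic coefficients are made-up illustrations and not taken from the question, and `simulate_litters` is a hypothetical helper name.

```r
# Sketch: simulate beta-binomial responses for 5 dose groups x 25 litters.
# ASSUMPTIONS (not from the question): rho, the litter sizes, the logistic coefficients.
set.seed(1)

simulate_litters <- function(n, p, rho) {
  # n:   matrix of litter sizes (rows = dose groups, columns = litters)
  # p:   vector of group-level response probabilities, one per row of n
  # rho: assumed intra-litter correlation, 0 < rho < 1
  stopifnot(nrow(n) == length(p), rho > 0, rho < 1)
  s <- (1 - rho) / rho          # alpha + beta implied by rho = 1 / (alpha + beta + 1)
  alpha <- p * s                # so that alpha / (alpha + beta) = p
  beta_par <- (1 - p) * s
  x <- matrix(NA_integer_, nrow(n), ncol(n))
  for (i in seq_len(nrow(n))) {
    p_ij <- rbeta(ncol(n), alpha[i], beta_par[i])        # litter-level probabilities
    x[i, ] <- rbinom(ncol(n), size = n[i, ], prob = p_ij) # responses per litter
  }
  x
}

n_ij <- matrix(sample(8:14, 5 * 25, replace = TRUE), nrow = 5)  # hypothetical litter sizes
p_i  <- plogis(-2 + 0.5 * (0:4))                                # hypothetical logistic dose-response
x_ij <- simulate_litters(n_ij, p_i, rho = 0.2)

rowSums(x_ij) / rowSums(n_ij)   # observed response proportion per dose group
```

A larger `rho` makes litters within a dose group more heterogeneous; as `rho` approaches 0 the draws approach plain binomial sampling with probability $p_i$.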

Related Solutions

Beta Distribution – Understanding Intuition for Beta Distribution with Alpha and/or Beta Less Than 1

Here is a frivolous example that may have some intuitive value.

In US Major League Baseball each team plays 162 games per season. Suppose a team is equally likely to win or lose each of its games. What proportion of the time will such a team have more wins than losses? (In order to have symmetry, if a team's wins and losses are tied at any point, we say it is ahead if it was ahead just before the tie occurred, otherwise behind.)

Suppose we look at a team's win-loss record as the season progresses. For a team whose wins and losses are determined as if by tosses of a fair coin, you might think the team would most likely be ahead about half the time throughout a season. Actually, half the time is the least likely proportion of time for being ahead.

The "bathtub shaped" histogram below shows the approximate distribution of the proportion of time during a season that such a team is ahead. The curve is the PDF of $\mathsf{Beta}(.5,.5).$ The histogram is based on 20,000 simulated 162-game seasons for a team where wins and losses are like independent tosses of a fair coin, simulated in R as follows:

set.seed(1212);  m <- 20000;  n <- 162;  prop.ahead <- numeric(m)
for (i in 1:m) {
  x <- sample(c(-1, 1), n, replace = TRUE)  # +1 = win, -1 = loss
  cum <- cumsum(x)                          # running wins minus losses
  ahead <- (c(0, cum) + c(cum, 0))[1:n]     # adjustment for ties: at a tie the previous lead decides
  prop.ahead[i] <- mean(ahead >= 0)
}

cut <- seq(0, 1, by = .1); hdr <- "Proportion of 162-Game Season when Team Leads"
hist(prop.ahead, breaks = cut, prob = TRUE, col = "skyblue2", xlab = "Proportion", main = hdr)
curve(dbeta(x, .5, .5), add = TRUE, col = "blue", lwd = 2)

Note: Feller (Vol. 1) discusses such a process. The CDF of $\mathsf{Beta}(.5,.5)$ is a constant multiple of an arcsine function, so Feller calls it an 'Arcsine Law'.
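
The arcsine connection can be checked numerically. The short sketch below compares `pbeta` for $\mathsf{Beta}(.5,.5)$ with the closed form $\frac{2}{\pi}\arcsin(\sqrt{x})$; the grid of `x` values is an arbitrary choice.

```r
# Compare the Beta(0.5, 0.5) CDF with (2 / pi) * asin(sqrt(x)) on a grid.
x <- seq(0, 1, by = 0.05)
arcsine_cdf <- (2 / pi) * asin(sqrt(x))
beta_cdf <- pbeta(x, 0.5, 0.5)
max_abs_diff <- max(abs(beta_cdf - arcsine_cdf))
max_abs_diff   # should be at the level of floating-point rounding
```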

Solved – Bayesian update for Beta distribution

As @whuber already noted in a comment on the answer by @BruceET, this is not really a Bayesian scenario, since you don't seem to mention any data (nor any likelihood).

From what you are saying, you know that $p \sim \mathsf{Beta}(a, b)$ and you also know that $p \ge 1/2$, which translates to knowing that $p$ follows a beta distribution with parameters $a,b$ left-truncated at $1/2$.

The same goes for the Dirichlet distribution: your knowledge that $p_1+p_2\geq p_3+p_4$ is a constraint on the distribution, not an "update" of the prior. Moreover, notice that this constraint leads to a situation that may not be possible under the Dirichlet distribution, so the statements may in fact be contradictory. The statement is really that $p_1, p_2, p_3, p_4$ follow a distribution similar to the Dirichlet, but constrained.

So...

If you are saying that you assume a truncated beta distribution as the prior for $p$ and want to use it together with some likelihood function and data, it is no longer conjugate to the binomial distribution, so you would need to use Markov chain Monte Carlo for estimation. A truncated distribution can be defined in any probabilistic programming framework, e.g. Stan, PyMC3, JAGS, etc.
The same applies to the "Dirichlet"-like distribution, but since this is a custom distribution, it would be much more complicated (I have no easy solution for you).
If you are saying that the facts you mention are the only information that you have and will have, and given this information you want to learn something about the distribution (e.g. expected value, quantiles), then this is a typical case for standard Monte Carlo simulation. For the truncated beta, you could simply use inverse transform sampling, which is a simple and efficient way of sampling. For the "Dirichlet"-like distribution it would, again, be more complicated, but there are many possible approaches, ranging from simple accept-reject sampling to more sophisticated solutions.
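
The two Monte Carlo approaches mentioned above could be sketched in base R roughly as follows. This is only an illustration: the parameter values are made up, the Dirichlet constraint is taken to be $p_1+p_2 \ge p_3+p_4$, and `rtruncbeta` and `rconstrdirichlet` are hypothetical helper names, not functions from any package.

```r
# Sketch: Monte Carlo draws from the two constrained distributions.
# ASSUMPTIONS: a, b and the Dirichlet concentration parameters are illustrative only.
set.seed(42)

# Beta(a, b) left-truncated at `lower`, via inverse transform sampling.
rtruncbeta <- function(m, a, b, lower = 0.5) {
  u <- runif(m, min = pbeta(lower, a, b), max = 1)  # uniform on [F(lower), 1]
  qbeta(u, a, b)                                    # map back through the quantile function
}

# Dirichlet(conc) restricted to p1 + p2 >= p3 + p4, via accept-reject.
rconstrdirichlet <- function(m, conc) {
  stopifnot(length(conc) == 4)
  out <- matrix(numeric(0), ncol = 4)
  while (nrow(out) < m) {
    # Dirichlet draws as normalized independent gamma variables (column k has shape conc[k])
    g <- matrix(rgamma(m * 4, shape = rep(conc, each = m)), ncol = 4)
    p <- g / rowSums(g)
    keep <- p[, 1] + p[, 2] >= p[, 3] + p[, 4]      # accept only draws meeting the constraint
    out <- rbind(out, p[keep, , drop = FALSE])
  }
  out[seq_len(m), , drop = FALSE]
}

p <- rtruncbeta(10000, a = 2, b = 3)
mean(p)                           # Monte Carlo estimate of the expected value
quantile(p, c(0.05, 0.5, 0.95))   # Monte Carlo estimates of quantiles

d <- rconstrdirichlet(1000, conc = c(1, 1, 1, 1))
colMeans(d)                       # component means under the constraint
```

Accept-reject is wasteful when the constraint region has small probability under the unconstrained Dirichlet, which is where the more sophisticated approaches become worthwhile.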
