Solved – Beta regression of proportion data including 1 and 0

Tags: beta distribution, beta-regression, mixed model, regression, zero inflation

I am trying to fit a model whose response variable is a proportion between 0 and 1. The data include quite a few 0s and 1s, but also many values in between. I am thinking about attempting a beta regression. The R package I have found (betareg) only allows values strictly between 0 and 1, not 0 or 1 themselves. I have read elsewhere that, in theory, the beta distribution should be able to handle values of 0 or 1, but I do not know how to handle this in R. I have seen some people add 0.001 to the zeros and subtract 0.001 from the ones, but I am not sure this is a good idea.

Alternatively, I could logit-transform the response variable and use linear regression. In that case I have the same problem with the 0s and 1s, which cannot be logit-transformed.

Best Answer

You could use zero- and/or one-inflated beta regression models, which combine the beta distribution with a degenerate distribution to assign some probability to 0 and 1, respectively. For details, see the following references:

Ospina, R., & Ferrari, S. L. P. (2010). Inflated beta distributions. Statistical Papers, 51(1), 111-126.

Ospina, R., & Ferrari, S. L. P. (2012). A general class of zero-or-one inflated beta regression models. Computational Statistics and Data Analysis, 56(6), 1609-1623.
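
To make the mixture idea concrete, here is a sketch of the zero-and-one-inflated beta density. The symbols $p_0$, $p_1$, $\mu$ and $\phi$ are labels chosen here for illustration and need not match the notation of the references; the beta part is written in the usual mean/precision form.

$$
f(y) =
\begin{cases}
p_0, & y = 0,\\[4pt]
(1 - p_0 - p_1)\,\dfrac{\Gamma(\phi)}{\Gamma(\mu\phi)\,\Gamma\big((1-\mu)\phi\big)}\; y^{\mu\phi - 1}(1 - y)^{(1-\mu)\phi - 1}, & 0 < y < 1,\\[4pt]
p_1, & y = 1,
\end{cases}
$$

with $0 < \mu < 1$, $\phi > 0$, $p_0, p_1 \ge 0$ and $p_0 + p_1 < 1$. Setting $p_1 = 0$ gives the zero-inflated model and $p_0 = 0$ the one-inflated model. In a regression setting, $\mu$ (and optionally $\phi$, $p_0$, $p_1$) would each be linked to covariates.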

These models are easy to implement with the gamlss package for R.
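
As a minimal sketch of what such a fit might look like, the example below simulates data with exact 0s and 1s and fits a zero-and-one-inflated beta model using the `BEINF` family from gamlss. The data, the variable names (`x`, `y`, `dat`) and the true parameter values are all made up for illustration; only the mean of the beta part is given a covariate here, which is an assumption rather than a recommendation.

```r
# Minimal sketch: zero-and-one-inflated beta regression with gamlss.
# All data below are simulated; variable names are illustrative.
library(gamlss)  # attaches gamlss.dist, which supplies BEINF and rBEINF

set.seed(42)
n <- 500
x <- runif(n)

# Simulate a response on [0, 1] that contains exact 0s and 1s.
# In BEINF, mu is the mean of the beta part, sigma is a scale parameter
# in (0, 1), and nu, tau control the point masses:
#   P(y = 0) = nu  / (1 + nu + tau)
#   P(y = 1) = tau / (1 + nu + tau)
mu_true <- plogis(-0.5 + 1.5 * x)
y <- rBEINF(n, mu = mu_true, sigma = 0.4, nu = 0.15, tau = 0.10)
dat <- data.frame(y = y, x = x)

# Fit: covariate on the mean of the beta part, intercepts only elsewhere.
fit <- gamlss(y ~ x,
              sigma.formula = ~ 1,
              nu.formula    = ~ 1,
              tau.formula   = ~ 1,
              family  = BEINF,
              data    = dat,
              control = gamlss.control(trace = FALSE))

summary(fit)

# Back-transform the fitted nu and tau into probabilities of 0 and 1.
nu_hat  <- unname(fitted(fit, "nu")[1])
tau_hat <- unname(fitted(fit, "tau")[1])
p0_hat <- nu_hat  / (1 + nu_hat + tau_hat)
p1_hat <- tau_hat / (1 + nu_hat + tau_hat)
c(p0 = p0_hat, p1 = p1_hat)
```

If the data contain only 0s or only 1s, the `BEZI` (zero-inflated) or `BEOI` (one-inflated) families from the same package should be the better fit. Covariates can also be placed in `nu.formula` and `tau.formula` if the chance of observing an exact 0 or 1 is expected to depend on predictors.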
