Solved – R: Which distribution to use with gbm for gamma distributed data

boosting, gamma-distribution, r

When I use GLMs, I can set family="Gamma" to analyse data consisting of positive real numbers. The gbm package also provides a large number of distributions to choose from, but none of them matches the gamma distribution. Which distribution should I choose?

Best Answer

The gamma distribution is available in both the gbm package (only in the GitHub version, https://github.com/gbm-developers/gbm, not in the CRAN version) and the mboost package.

For gbm, simply pass distribution = 'gamma' to the gbm function.

For mboost, select the gamma distribution by passing family = GammaReg() to the mboost function, as in the toy example below:

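library(mboost)
set.seed(42)   # make the simulated data reproducible
n.obs  <- 1000   # number of observations
n.iter <- 100    # number of boosting iterations
x1     <- rgamma(n.obs, shape = 1, scale = 1)
x2     <- rgamma(n.obs, shape = 2, scale = 1)
y      <- x1 + x2   # strictly positive response, which a gamma model needs
# Boosted trees with the gamma family
model  <- mboost(formula = y ~ x1 + x2, data = data.frame(y, x1, x2),
                 baselearner = "btree", family = GammaReg(),
                 control = boost_control(mstop = n.iter))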
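
As a rough sanity check, a boosted gamma model can be compared with an ordinary Gamma GLM on simulated data. The sketch below is only an illustration under stated assumptions: the data-generating process, the coefficients, and the names dat, fit_boost and fit_glm are made up for the example, and it uses mboost's default base-learner instead of "btree" only to keep the sketch small. It assumes that predict(..., type = "response") returns fitted means on the original (positive) scale for GammaReg().

```r
library(mboost)

set.seed(1)
n   <- 500
x1  <- runif(n)
x2  <- runif(n)
mu  <- exp(0.5 + 1.0 * x1 - 0.5 * x2)       # assumed true mean (log link)
y   <- rgamma(n, shape = 2, rate = 2 / mu)   # gamma response with mean mu
dat <- data.frame(y, x1, x2)

# Boosted model with the gamma family (default base-learner)
fit_boost <- mboost(y ~ x1 + x2, data = dat, family = GammaReg(),
                    control = boost_control(mstop = 100))

# Reference model: ordinary Gamma GLM with a log link
fit_glm <- glm(y ~ x1 + x2, data = dat, family = Gamma(link = "log"))

# Fitted means on the response scale for both models
pred_boost <- as.numeric(predict(fit_boost, type = "response"))
pred_glm   <- as.numeric(predict(fit_glm, type = "response"))

# Correlation between the two sets of fitted means
cor(pred_boost, pred_glm)
```

If the two sets of fitted means are strongly correlated, the boosted model is at least picking up the same structure as the GLM. The number of iterations (mstop) is chosen arbitrarily here and would normally be tuned, for example by cross-validation.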