Bayesian Regression – Specify a Zero-Inflated (Hurdle) Gamma Model in JAGS/BUGS

bayesianbugsjagsrregression

I'm trying to use a zero-inflated gamma model (or a gamma 'hurdle' model). The model is a mixture of logistic regression and generalized linear modeling. I can do this analysis in two steps: 1) do a logistic regression against presence/absence data and then 2) Use a generalized linear model with a gamma distribution on the positive values. I'm trying to set the model up where, if y = 0, then E(y) = p, but if y > 0, then E(y) is gamma distributed.

I'm trying to set this up in BUGS/JAGS, because I've seen these models worked before for poisson-distributions. I tried to adapt the code from that link into a gamma distribution, but I can't quite seem to get it to work. I know the code won't work because my data have zeroes, and the likelihood function can't be evaluated with zeroes. However, even when restricted to positive values, I get errors for invalid parent values. Even so, I'm not even sure the model is specified correctly.

Here is the model:

# For the ones trick
C <- 10000

# for every observation
for(i in 1:N){
    # log-likelihood of the observation from the gamma likelihood
    LogPos[i] <- (shape[i] - 1)*log(y[i]) - y[i]/scale[i] - N*shape[i]*log(scale[i]) - N*loggam(scale[i])
    #likelihood
    Lpos[i] <- exp(LogPos[i])

    # redefine the shape and rate parameters as a function of the mean and sd
    shape[i] <- pow(mu[i], 2) / pow(sd, 2)
    scale[i] <- mu[i] / pow(sd, 2)

    # mu is a function of MTD: use the inverse link
    #mu[i] <- 1/eta[i]
    mu[i] <- beta0 + beta1*MTD[i]


    # zero-inflated part, where w[i] is the probability of being zero
    w[i] <- exp(zeta[i]) / (1 + exp(zeta[i]))
    zeta[i] <- gamma0 + gamma1*MPD[i]

    # ones trick
    p[i] <- Lpos[i] / C
    ones[i] ~ dbern(p[i])

    # Full likelihood
    Lik[i] <- Lpos[i]*(1 - w[i]) + equals(y[i], 0)*w[i]
  } 

# PRIORS
beta0 ~ dnorm(0, 0.001)
beta1 ~ dnorm(0, 0.001)

gamma0 ~ dnorm(0, 0.001)
gamma1 ~ dnorm(0, 0.001)

sd ~ dunif(0, 100)

Has anyone set a model up like this or have any advice on how to set it up correctly?

UPDATE

I've tried a new set of code that's similar, but slight different. I still have not gotten it to work

model{

  # For the ones trick
  C <- 10000

  # for every observation
  for(i in 1:N){

    # make a dummy variable that is 0 if y is < 0.0001 and 1 if y > 0.0001. This is essentially a presence
    # absence dummy variable
    z[i] <- step(y[i] - 0.0001)

    # define the logistic regression model, where w is the probability of occurance.
    # use the logistic transformation exp(z)/(1 + exp(z)), where z is a linear function
    w[i] <- exp(zeta[i]) / (1 + exp(zeta[i]))
    zeta[i] <- gamma0 + gamma1*MPD[i]

    # define the gamma regression model for the mean. use the log link the ensure positive, non-zero mu
    mu[i] <- exp(eta[i])
    eta[i] <- beta0 + beta1*MTD[i]

    # redefine the mu and sd of the continuous part into the shape and scale parameters
    shape[i] <- pow(mu[i], 2) / pow(sd, 2)
    scale[i] <- mu[i] / pow(sd, 2)

    # for readability, define the log-likelihood of the gamma here
    logGamma[i] <- (shape[i] - 1)*log(y[i]) - y[i]/scale[i] - shape[i]*log(scale[i]) - loggam(scale[i])

    # define the total likelihood, where the likelihood is (1 - w) if y < 0.0001 (z = 0) or
    # the likelihood is w * gammalik if y >= 0.0001 (z = 1). So if z = 1, then the first bit must be
    # 0 and the second bit 1. Use 1 - z, which is 0 if y > 0.0001 and 1 if y < 0.0001
    logLik[i] <- (1 - z[i]) * log(1 - w[i]) + z[i] * ( log(w[i]) + logGamma[i] )

    # Use the ones trick
    p[i] <- logLik[i] / C
    ones[i] ~ dbern(p[i])
  } 

  # PRIORS
  beta0 ~ dnorm(0, 0.001)
  beta1 ~ dnorm(0, 0.001)

  gamma0 ~ dnorm(0, 0.001)
  gamma1 ~ dnorm(0, 0.001)

  sd ~ dgamma(1, 2)

}

UPDATE 2

I've gotten it to run by deleting the definition of logGamma[i] and putting it directly into the likelihood function, which now reads:

logLik[i] <- (1 - z[i]) * log(1 - w[i]) + z[i] * ( log(w[i]) + (shape[i] - 1)*log(y[i]) - y[i]/scale[i] - shape[i]*log(scale[i]) - loggam(scale[i]) )

The problem was that JAGS was trying to evaluate the log likelihood at all observations first, resulting in NA's for the 0's. In the new way, it only evaluates it if z = 1 (I think). I'm still having trouble getting the parameter estimates to line up. For example, the gamma's are almost identical to a logistic regression of the same form (hooray!). But the betas are pretty far off from a gamma GLM of the positive values. I don't know if this is normal or not, but I suspect there are still problems with my model specification.

Best Answer

I figured out the answer, with help from Martyn Plummer. My code uses the inverse link for the gamma model (and no inverse of the predictors). Also, this code requires the 'glm' module for JAGS.

model{

  # For the ones trick
  C <- 10000

  # for every observation
  for(i in 1:N){

    # define the logistic regression model, where w is the probability of occurance.
    # use the logistic transformation exp(z)/(1 + exp(z)), where z is a linear function
    logit(w[i]) <- zeta[i]
    zeta[i] <- gamma0 + gamma1*MPD[i] + gamma2*MTD[i] + gamma3*int[i] + gamma4*MPD[i]*int[i] + gamma5*MTD[i]*int[i]

    # define the gamma regression model for the mean. use the log link the ensure positive, non-zero mu
    mu[i] <- pow(eta[i], -1)
    eta[i] <- beta0 + beta1*MPD[i] + beta2*MTD[i] + beta3*int[i] + beta4*MPD[i]*int[i] + beta5*MTD[i]*int[i]

    # redefine the mu and sd of the continuous part into the shape and scale parameters
    shape[i] <- pow(mu[i], 2) / pow(sd, 2)
    rate[i] <- mu[i] / pow(sd, 2)

    # for readability, define the log-likelihood of the gamma here
    logGamma[i] <- log(dgamma(y[i], shape[i], rate[i]))

    # define the total likelihood, where the likelihood is (1 - w) if y < 0.0001 (z = 0) or
    # the likelihood is w * gammalik if y >= 0.0001 (z = 1). So if z = 1, then the first bit must be
    # 0 and the second bit 1. Use 1 - z, which is 0 if y > 0.0001 and 1 if y < 0.0001
    logLik[i] <- (1 - z[i]) * log(1 - w[i]) + z[i] * ( log(w[i]) + logGamma[i] )

    Lik[i] <- exp(logLik[i])

    # Use the ones trick
    p[i] <- Lik[i] / C
    ones[i] ~ dbern(p[i])
  }

  # PRIORS
  beta0 ~ dnorm(0, 0.0001)
  beta1 ~ dnorm(0, 0.0001)
  beta2 ~ dnorm(0, 0.0001)
  beta3 ~ dnorm(0, 0.0001)
  beta4 ~ dnorm(0, 0.0001)
  beta5 ~ dnorm(0, 0.0001)

  gamma0 ~ dnorm(0, 0.0001)
  gamma1 ~ dnorm(0, 0.0001)
  gamma2 ~ dnorm(0, 0.0001)
  gamma3 ~ dnorm(0, 0.0001)
  gamma4 ~ dnorm(0, 0.0001)
  gamma5 ~ dnorm(0, 0.0001)

  sd ~ dgamma(2, 2)

}