JAGS Model Coding – Coding a JAGS Error Model for Dependent Variable with Increasing Variance

bayesianheteroscedasticityjagsregression

I am running a model in JAGS. I have a situation where y is a linear function of x, but the error in y increases as the magnitude of the value of y increases. x and y are strictly positive. In OLS this generates heteroscedasticity. To deal with this within JAGS I specified the likelihood as:

for (i in 1:N){
y[i] ~ dnorm(y.hat[i], tau / y[i])

y.hat[i] <- a + b*x[i]
}

Here, the uncertainty in dnorm is specified as tau / y[i], reflecting the fact that precision is lower when values of y are larger. However, this generates an error when using the run.jags function in the runjags package for R.

  RUNTIME ERROR:
Possible directed cycle involving some or all
of the following nodes:

Then it lists every single value of y. So, this is clearly not the way to specify what I want. Any advice on how to do this better would be much appreciated! NOTE: I am specfically avoiding log-log data transformations.

Best Answer

JAGS does not allow directed cycles (parameters being used to define themselves), so you can't use y to define the parents of y. That means that if y appears on the left hand of a distribution, then it can't also appear on the right (even if it is fully observed as in your case). Technically speaking, it is sufficient to use the following:

y[i] ~ dnorm(y.hat[i], tau / dummy_y[i])

You just then define dummy_y from y (in R) and include it as data for JAGS. This prevents JAGS from seeing the directed cycle ... which arguably does not fix the problem but does at least make the error go away.

However, a better solution is to define the functional relationship for tau based on x (or possibly y.hat) instead:

y[i] ~ dnorm(y.hat[i], effective_tau[i])
effective_tau[i] <- tau / x[i]

I would also think more carefully about this relationship - if x (or y.hat if you use that) is negative, then effective_tau becomes negative and you have a problem. Assuming this is a linear relationship on whatever the scale is x observed is also probably a strong assumption (although I have no idea about your application so can't really comment on this). Writing the effective_tau out like I have done above allows you to more easily define other relationships, e.g.:

log(effective_tau[i]) <- mean_logtau + tau_effect * x[i]

This multiple-GLM-in-1-model approach is one of the big advantages of MCMC (when used wisely).

Related Question