Solved – Correlation structure of nested nonlinear mixed model

Tags: lme4, nlme, nested-data, r

I would appreciate any help. I am trying to fit a nonlinear mixed model to some experiments. I have adopted a Gompertz function, y = Asym * exp(-b2 * b3^x), and the data come from 3 years of experiments (the doses were not always the same, and I removed some outliers before any analysis). Moreover, since the measurements were destructive, this is not a repeated-measures case. I am fitting a nested model in R (nlme package; I tried lme4 as well). The data are here. The final model I have adopted so far is:

library(nlme)

model <- nlme(y ~ SSgompertz(x, Asym, b2, b3),
              data = Data,
              fixed = Asym + b2 + b3 ~ 1,
              random = Asym ~ 1 | Experiment/Treatment,
              start = c(Asym = 5, b2 = 30, b3 = 0.8),
              verbose = TRUE, na.action = na.omit, method = "ML",
              control = nlmeControl(maxIter = 100000, opt = "nlminb", pnlsTol = 0.01))

In relation to the diagnostic plots:

[Diagnostic plots of the fitted model]

  1. There does not seem to be a normality problem.
  2. There isn't a problem with autocorrelation, but I think the model could be improved if I could include an autocorrelation structure between groups.
  3. There is increasing variability. So including a heterogeneous variance for the error term will improve the model and will help to deal with the heteroscedasticity.

The summary of the model shows a high negative correlation between the b2 and b3 parameters. I am not sure how bad this is.

So my question is: how can I correctly specify a correlation structure? I have tried different ways and methods for the correlation function without success. On the other hand, including weights and more random effects did not improve the model. I was planning to include a covariate, but this option has not been implemented for nested models in nlme. So, in addition to specifying a correlation structure, how could I improve the model's results?

Best Answer

Disclaimer: In the following I'll assume that your random structure is correct. I cannot judge that without more information and it would be beyond the scope of one question anyway.

Let's plot your data:

library(ggplot2)
ggplot(Data, aes(x = x, y = y, color = factor(Experiment))) +
  geom_point() +
  facet_wrap(~ Treatmen)

[Plot of y vs. x by experiment, faceted by treatment]

What we see there is increasing scatter with increasing x-values. That's what shows up as heteroscedasticity in your model diagnostics too. You don't need a correlation structure; you need to account for the structure of the variance. You could add a variance structure to your model (see the sketch below), but then fitting becomes difficult and computationally expensive; I couldn't manage a successful and reasonable fit.
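For completeness, this is roughly how such a variance structure could be specified in nlme, using varPower so that the residual standard deviation is allowed to grow as a power of the variance covariate (by default the fitted values). Treat it as a sketch only; on these data it may fail to converge:

# Sketch: original model plus a power-of-the-mean variance function.
# varPower() uses ~ fitted(.) as the default variance covariate.
mod_var <- update(model, weights = varPower())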

The obvious alternative would be fitting a model on the log scale. However, there is a problem with your data:

summary(Data$y)
#    Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
# 0.00000  0.06545  0.99720  2.12000  3.59700 10.39000 

You have exact zero values in your data, but a Gompertz function never reaches exactly zero; zero is only its limit as x goes to minus infinity. Since these values were probably measured, I assume they were below the detection limit, which is not the same as zero. Thus, I'd exclude them from the analysis.

Data2 <- Data[Data$y > 0,]
ggplot(Data2, aes(x = x, y = log(y), color = factor(Experiment))) +
  geom_point() +
  facet_wrap(~ Treatmen)

[Plot of log(y) vs. x for the data with zeros removed]

That looks better.

Now, the Gompertz function is:

$y = A \cdot \exp{(-\beta_2 \cdot \beta_3^x)}$

On the log scale, with $\alpha = \ln{A}$:

$\ln{y} = \alpha - \beta_2 \cdot \beta_3^x$

Let's fit that (you need to adjust the starting values; the estimates from the original model are helpful here):

mod2 <- nlme(log(y) ~ a - b2 * b3^x,
             data = Data2,
             fixed = a + b2 + b3 ~ 1,
             random = a ~ 1 | Experiment/Treatmen,
             start = c(a = log(5), b2 = 15, b3 = 0.995),
             verbose = TRUE, na.action = na.omit, method = "ML",
             control = nlmeControl(maxIter = 100, opt = "nlminb", pnlsTol = 0.01))

plot(mod2)

[Residuals vs. fitted values for mod2]
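If you want more than the default residuals-vs-fitted plot, nlme also provides QQ-plot methods for lme/nlme objects; a minimal sketch for mod2:

# Optional extra diagnostics
qqnorm(mod2, ~ resid(., type = "p"))   # standardized within-group residuals
qqnorm(mod2, ~ ranef(.))               # estimated random effects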

Back-transform coefficient for asymptote:

coefs <- coef(mod2)
coefs$Asym <- exp(coefs$a)
coefs
#            a     b2        b3     Asym
#1/1 1.4106169 15.722 0.9928132 4.098483
#1/2 1.6317404 15.722 0.9928132 5.112765
#1/4 1.7736012 15.722 0.9928132 5.892034
#1/6 1.6848617 15.722 0.9928132 5.391705
#1/8 1.8593160 15.722 0.9928132 6.419344
#2/1 1.1390170 15.722 0.9928132 3.123696
#2/3 1.3801606 15.722 0.9928132 3.975540
#2/5 1.6804342 15.722 0.9928132 5.367886
#2/9 1.7479048 15.722 0.9928132 5.742558
#3/1 0.8431604 15.722 0.9928132 2.323699
#3/3 1.7521368 15.722 0.9928132 5.766912
#3/5 1.5909919 15.722 0.9928132 4.908616
#3/7 1.6985093 15.722 0.9928132 5.465793
#3/9 1.6921572 15.722 0.9928132 5.431184
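The same back-transformation works at the population level if you don't need the group-specific asymptotes; a small sketch using the fixed effects of mod2:

# Population-level (fixed-effects) asymptote on the original scale
fe <- fixef(mod2)
exp(fe[["a"]])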

Data$pred <- exp(predict(mod2, newdata = Data))

ggplot(Data, aes(x = x, y = y, color = factor(Experiment))) +
  geom_point() +
  geom_line(aes(y = pred)) +
  facet_wrap(~ Treatmen)
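These predictions use the estimated random effects for each experiment/treatment combination. If you also want the population-level curve, predict.nlme takes a level argument, and level = 0 uses the fixed effects only; a sketch:

# Fixed-effects-only predictions, back-transformed to the original scale
Data$pred_pop <- exp(predict(mod2, newdata = Data, level = 0))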

[Data with fitted curves from mod2 overlaid]

Regarding the high correlation between parameters: there is not much you can do about that other than using a model with fewer parameters, collecting more data, or determining some parameters independently in additional experiments. Sometimes a re-parameterization of the model can help, too (see the sketch below).
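One possible re-parameterization, for illustration only, is to replace b3 with a rate constant k = -log(b3), so that b3^x becomes exp(-k*x) and the log-scale model reads ln y = a - b2 * exp(-k*x). Whether this actually reduces the b2/b3 correlation on your data would have to be checked; the starting value for k below is taken from -log(0.993), i.e. roughly the b3 estimate of mod2:

# Sketch: re-parameterized log-scale Gompertz with k = -log(b3)
mod3 <- nlme(log(y) ~ a - b2 * exp(-k * x),
             data = Data2,
             fixed = a + b2 + k ~ 1,
             random = a ~ 1 | Experiment/Treatmen,
             start = c(a = log(5), b2 = 15, k = -log(0.993)),
             method = "ML",
             control = nlmeControl(maxIter = 100, opt = "nlminb", pnlsTol = 0.01))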
