I want to analyze fish aquaculture data comparing growth rates by state and controlling for tank effects as a mixed effect. I will use the equation
$$
w = aL^b
$$
but I want b
to be a function of state of origin. I can do this using the nlme
function in the nlme
package. Here is a toy data set similar to mine.
tl <- seq(100, 300, 1/5)
state <- c(rep(x = c("DE", "TX", "FL", "VA", "SC"), times = length(tl)/5), "DE")
ai <- 0.000004
bi <- 3.3
lw <- NULL
set.seed(1234)
for(i in 1:length(tl)) {
lw[i] <- log(ai) + bi * log(tl[i]) + rnorm(n = 1, 0, 0.1)
}
w <- exp(lw)
df <- data.frame(cbind(w, tl))
df$state <- as.factor(state)
df$tank <- as.factor(c(rep(1:15, length(tl)/15), 1:11))
This is the code I would use:
nl <- nlme(w ~ a * tl ^ b, fixed = list(a ~ 1, b ~ state),
random = b ~ 1|tank, data = df, start = c(a = rep(0.0000001, times = 1),
b = rep(3.1, times = 5)))
summary(nl)
However, a plot of the data reveals that the error is likely to be multiplicative on the arithmetic scale (variability in w
(weight) increases with increasing tl
(total length)):
plot(tl, exp(lw))
My current model using nlme is
$$
w = aL^b + \epsilon \quad \epsilon \sim N(0, \sigma^2)
$$
where b is a function of state. However, I want to make the equivalent log-linear model:
$$
w = aL^b\epsilon \quad \epsilon \sim N(0, \sigma^2)
$$
$$
\log(w) = \log(a) + b \cdot \log(L) + \log(\epsilon)
$$
If it weren't for the fact that I want b to be a function of state I could use the lme
function in nlme
:
lm1 <- lme(log(mass) ~ log(tl), data = df, random = ~ 1|tank)
summary(lm1)
How would I make the equivalent log-linear model to my nonlinear model except for the error distribution? Xiao et al. 2010 discuss this issue but it's for non-mixed models and for functions when a or b are a function of a factor.
http://www.esajournals.org/doi/pdf/10.1890/11-0538.1
The final part of the question would be whether I could use AIC to compare the two models or if I should just use the log-linear if the data clearly look to have that error structure?
Best Answer
This is actually pretty easy: in statistical terms, you just add an interaction between total length
tl
andstate
.Regenerating data (with some tiny cosmetic/efficiency tweaks: you don't need the
for
loop):Fitting the nonlinear mixed model:
Now as a linear mixed model: you don't have the intercept varying either by state or by tank in the model above: that might be inadvisable, but I've replicated it in the model below.
As for comparing logged with unlogged data: the naive comparison will be wrong, but you can do this:
The adjustment to the AIC for log-transformation should be $2 \sum_i \log(y_i)$: as explained by Weiss, the adjustment to the likelihood is $d(\log(y))/dy = 1/y$, so the adjustment to the AIC is $-2 \sum \log(L) = -2 \sum(\log(1/y)) = 2 \sum \log(y)$.
Comparing negative log-likelihoods: