Solved – Mixed effects model for power function data

lme4-nlmemixed modelpower lawr

I have data which I suspect follows a power function over time. It is collected from several units which have different intercepts. Therefore I'd like to do a mixed model with the parameters of the power function as fixed effects and intercept as random effect. Can I do this with lme4::nlmer? If not, can I do it in another way?

To illustrate with a reproducible example, say I have this data, D:

x = 1:100
y1 = 7 + 2 * x^0.4 + rnorm(100); plot(y1, main='y1')  # unit 1
y2 = 4 + 2 * x^0.4 + rnorm(100); plot(y2, main='y2')  # unit 2
y3 = 1 + 2 * x^0.4 + rnorm(100); plot(y3, main='y3')  # unit 3
D = data.frame(y=c(y1, y2, y3), id=rep(c(1,2,3), each=100), time=rep(x, 3))
plot(y ~ time, D, main='all')  # combined data

enter image description here

I can model this as a power function with the nls function with a common intercept:

nls(y ~ k + a * time ^ b, D, start=c(a=1, b=1, k=5))

This gives me estimates like a=1.3 (it was 2), b=0.47 (it was 0.4) and k=5.4 (it was 1, 4 and 7). I'd like to do something like this:

nlmer(y ~ k + a * time ^ b + (1|id), D)  # doesn't work of course

I'm going to add more random effects later. This is just a minimal reproducible example.

Best Answer

(Disclaimer: answering my own question after fiddling around - I'm really not an expert on this so please read critically)

Step 1: create a power function, including the intercept, using the deriv function which adds the necessary properties to the object. You have to be somewhat verbose, specifying which parameters to estimate (namevec) and which parameters nlmer can play with (namevec). In this case it's sort of trivial but this is handy if you have big and complicated stuff going on with lots of internal variables that really are of no interest to nlmer when finding the optimal fit. So:

power.f = deriv(~k + a*time^b, namevec=c('k', 'a', 'b'), function.arg=c('time','k', 'a','b'))

Step 2: fit the parameters of the nonlinear model using a dependent ~ non-linear ~ fixed + random syntax, where the non-linear part is objects of the sort we just created above.

fit.nlmer = nlmer(y ~ power.f(time, k, a, b) ~ k|id, start=list(nlpars=c(k=2, a=1, b=1)), data=D)

A few comments on step 2: The latter part is the stuff you may be used to from lmer but with the exception that it won't accept intercept-only stuff (for example 1|id). So here's why you didn't just make the power.f formula ~ a*time^b and put a random intercept in the fixed-random part. Instead you "put in" a random intercept into the nonlinear function - which in this case is equivalent to a linear intercept.

Also: the start values are just you helping nlmer to start in the vicinity of the correct solution since the likelihood landscape can be way more complex and contain more traps (plateaus, local minima etc.) than a linear one. But don't care too much about being spot-on (after all, you're doing inference for a reason). I just looked at the data and put in something that was not totally off.

To see why this mixed-model effort is fruitful compared to a common-intercept model like the nls one, consider the fit (observed y versus model-predicted y):

enter image description here