Piecewise Linear Model – Estimating Break Point in R with Random Effects

change pointlme4-nlmemixed modelpiecewise linearr

Can someone please tell me how to have R estimate the break point in a piecewise linear model (as a fixed or random parameter), when I also need to estimate other random effects?

I've included a toy example below that fits a hockey stick / broken stick regression with random slope variances and a random y-intercept variance for a break point of 4. I want to estimate the break point instead of specifying it. It could be a random effect (preferable) or a fixed effect.

library(lme4)
str(sleepstudy)

#Basis functions
bp = 4
b1 <- function(x, bp) ifelse(x < bp, bp - x, 0)
b2 <- function(x, bp) ifelse(x < bp, 0, x - bp)

#Mixed effects model with break point = 4
(mod <- lmer(Reaction ~ b1(Days, bp) + b2(Days, bp) + (b1(Days, bp) + b2(Days, bp) | Subject), data = sleepstudy))

#Plot with break point = 4
xyplot(
        Reaction ~ Days | Subject, sleepstudy, aspect = "xy",
        layout = c(6,3), type = c("g", "p", "r"),
        xlab = "Days of sleep deprivation",
        ylab = "Average reaction time (ms)",
        panel = function(x,y) {
        panel.points(x,y)
        panel.lmline(x,y)
        pred <- predict(lm(y ~ b1(x, bp) + b2(x, bp)), newdata = data.frame(x = 0:9))
            panel.lines(0:9, pred, lwd=1, lty=2, col="red")
        }
    )

Output:

Linear mixed model fit by REML 
Formula: Reaction ~ b1(Days, bp) + b2(Days, bp) + (b1(Days, bp) + b2(Days, bp) | Subject) 
   Data: sleepstudy 
  AIC  BIC logLik deviance REMLdev
 1751 1783 -865.6     1744    1731
Random effects:
 Groups   Name         Variance Std.Dev. Corr          
 Subject  (Intercept)  1709.489 41.3460                
          b1(Days, bp)   90.238  9.4994  -0.797        
          b2(Days, bp)   59.348  7.7038   0.118 -0.008 
 Residual               563.030 23.7283                
Number of obs: 180, groups: Subject, 18

Fixed effects:
             Estimate Std. Error t value
(Intercept)   289.725     10.350  27.994
b1(Days, bp)   -8.781      2.721  -3.227
b2(Days, bp)   11.710      2.184   5.362

Correlation of Fixed Effects:
            (Intr) b1(D,b
b1(Days,bp) -0.761       
b2(Days,bp) -0.054  0.181

Broken stick regression fit to each individual

Best Answer

Another approach would be to wrap the call to lmer in a function that is passed the breakpoint as a parameter, then minimize the deviance of the fitted model conditional upon the breakpoint using optimize. This maximizes the profile log likelihood for the breakpoint, and, in general (i.e., not just for this problem) if the function interior to the wrapper (lmer in this case) finds maximum likelihood estimates conditional upon the parameter passed to it, the whole procedure finds the joint maximum likelihood estimates for all the parameters.

library(lme4)
str(sleepstudy)

#Basis functions
bp = 4
b1 <- function(x, bp) ifelse(x < bp, bp - x, 0)
b2 <- function(x, bp) ifelse(x < bp, 0, x - bp)

#Wrapper for Mixed effects model with variable break point
foo <- function(bp)
{
  mod <- lmer(Reaction ~ b1(Days, bp) + b2(Days, bp) + (b1(Days, bp) + b2(Days, bp) | Subject), data = sleepstudy)
  deviance(mod)
}

search.range <- c(min(sleepstudy$Days)+0.5,max(sleepstudy$Days)-0.5)
foo.opt <- optimize(foo, interval = search.range)
bp <- foo.opt$minimum
bp
[1] 6.071932
mod <- lmer(Reaction ~ b1(Days, bp) + b2(Days, bp) + (b1(Days, bp) + b2(Days, bp) | Subject), data = sleepstudy)

To get a confidence interval for the breakpoint, you could use the profile likelihood. Add, e.g., qchisq(0.95,1) to the minimum deviance (for a 95% confidence interval) then search for points where foo(x) is equal to the calculated value:

foo.root <- function(bp, tgt)
{
  foo(bp) - tgt
}
tgt <- foo.opt$objective + qchisq(0.95,1)
lb95 <- uniroot(foo.root, lower=search.range[1], upper=bp, tgt=tgt)
ub95 <- uniroot(foo.root, lower=bp, upper=search.range[2], tgt=tgt)
lb95$root
[1] 5.754051
ub95$root
[1] 6.923529

Somewhat asymmetric, but not bad precision for this toy problem. An alternative would be to bootstrap the estimation procedure, if you have enough data to make the bootstrap reliable.