Solved – Mixed models (lme4) + lsmeans to estimate trend in population means

lsmeansmixed model

I aim to estimate one populations mean blood pressure during different years. Here is the setting:
(1) 7400 onservations; repeated measurements. Unbalanced.
(2) measurements are undertaken anually from 2001 to 2012. Eachindividual maybe measured several times.
(3) one can expect a time-bias, i.e as time progress, there will be made advances in therapy, which will lead to more efficient blood pressure lowering. These factors will not be taken into account. Deliberately.

Aim: estimate population mean blood pressure each year. Including confidence interval.

Methods: use lme4 package (linear outcome, blood pressure) to account for repeated measurements and lsmeans package to estimate the population mean each year.

Code:

fit <- lmer(bloodpressure ~ year + age + sex + (1 | patient_id), data=data)

'year' is the factor variable for which I'd like to obtsin the means.

lsmeans(fit, ~ year) # not tried

Questions
– is this method OK?
– will the covariance be respected by using lsmeans, or should I use lmerTest package which has a built in function for estimating lsmeans?
– each individual has a random slope in the call above, should I adjust it to include random effects for year also?

Thanks for any advice on this

Best Answer

The lsmeans package does produce the correct results with model objects from a number of packages, including lme4. If you have a fairly recent update of lsmeans installed, you can do ? models and see information on what model objects are supported, and details of any special provisions.

Related Solutions

Solved – Obtaining adjusted (predicted) proportions with lme4 – using the glmer-function

predict() in lme4 does not work well unless the grouping factor specification is "realistic". If we use samples from the observed data, we get reasonable predictions. I think this is a bug in predict.merMod()

This is lme4 1.1-7

my.fit <- glmer(smoker ~ biomarker + year + sex + age + (1|id), data = df, family = binomial(link='logit'), nAGQ = 0)
## predict with the observed data
predictions <- predict(my.fit, type = "response")
mean(predictions, na.rm = TRUE)
[1] 0.1144976
## predict for year 1996 to year 2014
new.df <- df[sample(nrow(df), replace = TRUE), ]
sapply(1996:2014, function(x) {
  new.df$year <- x
  predictions <- predict(my.fit, newdata = new.df, type = "response", allow.new.levels=TRUE)
  mean(predictions, na.rm = TRUE)
})
[1] 0.15785350 0.15374013 0.14974742 0.14586612 0.14208791 0.13840528 0.13481145 0.13130033
[9] 0.12786645 0.12450488 0.12121121 0.11798144 0.11481198 0.11169959 0.10864132 0.10563448
[17] 0.10267662 0.09976548 0.09689897

Compare that with what you get if you set id to a new value, not present in df when my.fit was evaluated.

new.df$id <- 1
    sapply(1996:2014, function(x) {
      new.df$year <- x
  predictions <- predict(my.fit, newdata = new.df, type = "response", allow.new.levels=TRUE)
  mean(predictions, na.rm = TRUE)
})
[1] 0.07292960 0.06660550 0.06079003 0.05544891 0.05054904 0.04605868 0.04194759 0.03818706
[9] 0.03475000 0.03161093 0.02874596 0.02613281 0.02375069 0.02158030 0.01960378 0.01780458
[17] 0.01616745 0.01467832 0.01332425

If you want confidence intervals around the predicted means, I can recommend the boot package

my.bootstrap.predictions.f <- function(data, indices){
  return(mean(predict(my.fit, newdata = data[indices, ], type = "response", allow.new.levels=TRUE), na.rm=TRUE))
}
## predict for year 1996 to year 2014
new.df <- df[sample(nrow(df), replace = TRUE), ]
time.period <- 1996:2014
my.results <- matrix(nrow=length(time.period), ncol = 4)
for(x in 1:length(time.period)){
  my.results[x, 1] <- time.period[x]
  new.df$year <- time.period[x]
  ## bootstrap using a realistic number of samples per year, say 20000
  my.boot.obj <- boot(data = new.df[sample(nrow(new.df), 20000, replace = TRUE), ], statistic = my.bootstrap.predictions.f, R = 100)
  my.results[x, 2] <- my.boot.obj[[1]]
  my.results[x, 3:4] <- quantile(my.boot.obj[[2]], c(0.025, 0.975))
}
colnames(my.results) <- c("Year", "mean proportion", "lower.ci", "upper.ci")

Using R = 2 in the call to boot, I got the following results:

my.results
      Year mean proportion   lower.ci   upper.ci
 [1,] 1996      0.15546203 0.15074711 0.15913858
 [2,] 1997      0.15259172 0.15187967 0.15367255
 [3,] 1998      0.14850053 0.14672908 0.14758575
 [4,] 1999      0.14304792 0.14076788 0.14285565
 [5,] 2000      0.14340505 0.14308830 0.14354602
 [6,] 2001      0.13575446 0.13580401 0.13906530
 [7,] 2002      0.13378345 0.13092585 0.13734706
 [8,] 2003      0.13135301 0.13367143 0.13379709
 [9,] 2004      0.12884709 0.12817521 0.12889604
[10,] 2005      0.12407854 0.12164363 0.12338582
[11,] 2006      0.11703560 0.11692974 0.12129822
[12,] 2007      0.11701868 0.11660557 0.12143424
[13,] 2008      0.11427061 0.11247954 0.11578390
[14,] 2009      0.10743174 0.10681350 0.11119430
[15,] 2010      0.10769067 0.10547794 0.10833410
[16,] 2011      0.10677037 0.10734999 0.10867513
[17,] 2012      0.10064624 0.09991072 0.10205153
[18,] 2013      0.09735750 0.09704462 0.10003968
[19,] 2014      0.09448228 0.09331708 0.09346973

Solved – Post-hoc testing in multcomp::glht for mixed-effects models (lme4) with interactions

It is a lot easier to do using the lsmeans package

library(lsmeans)
lsmeans(mod, pairwise ~ tension | wool)
lsmeans(mod, pairwise ~ wool | tension)

Best Answer

Related Solutions

Solved – Obtaining adjusted (predicted) proportions with lme4 – using the glmer-function

Solved – Post-hoc testing in multcomp::glht for mixed-effects models (lme4) with interactions

Related Question