Solved – Calculating Estimated Marginal Means from univariate data

lsmeansr

I am trying to calculate the estimated marginal means (aka least squared means) in R in order to do statistical analysis for a univariate dataset and am struggling as all the examples are from multivariate datasets.

My design is count data (wallaby scats from fixed quadrats) with repeated measures (samples taken once a year over three years). I am interested in the mean changes of scat counts over the three years. A reduced sample of my data looke like:

enter image description here

My mixed effects model looks like:

scatcount ~ year + (1|plot)

the random effect of plot is included to account for the repeated measures.

I was advised to calculate the estimated marginal means and am using the "emmeans" package in R. However I am struggling to get the code and process right for this. All the examples I can find are comparing two or more fixed and/or random variables so I'm struggling to apply it to my data. I am also unsure how to include the random effect of plot in the EMM calculation, or if I even need to? Any suggestions on how to do this in R would be much appreciate!

Best Answer

You need to do something like this:

library(lme4)
model <- lmer(scatcount ~ year + (1|plot), data = scatdata)

library(emmeans)
emmeans(model, "year")  # display the EMMs
pairs(.Last.value)      # pairwise comparisons

I'll consider adding a simple one-factor example somewhere in the documentation.

Note: The model is what accounts for the random effect. emmeans just summarizes results from a model; so if the model accounts properly for the random effect(s), you don't need to do anything extra in emmeans.

PS -- For that particular response variable, I'm guessing you will have pretty heterogeneous errors (evidenced, e.g., by a horn-shaped scatterplot of residuals versus fitted values). You probably should consider either a transformed response (e.g., sqrt(scatcount) ~ ...) or a generalized linear mixed model for a Poisson response (using, e.g., lme4::glmer()). Such models are also supported by the emmeans package but you may want to do additional things to obtain results back on the scatcount scale; see the vignette on transformations and link functions.