Solved – reporting regression results with categorical variables and interactions

Tags: mixed model, multilevel-analysis, regression, reporting

Several sources recommend reporting regression coefficients in a table for every mixed-effects model. For continuous predictors that's fine because I only get one coefficient for that predictor. But what if I have categorical predictors with more than two levels?
For example:

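library(lmerTest)  # lmer() with the df column shown below; also loads lme4 and its cake data
mod <- lmer(angle ~ temp * recipe + (1 | replicate), data = cake)
summary(mod)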

Fixed effects:
               Estimate Std. Error         df t value   
(Intercept)    2.379365   6.199942 262.454162   0.384  
temp           0.153714   0.029819 250.000012   5.155
recipeB       -3.649206   8.464773 250.000006  -0.431  
recipeC       -1.941270   8.464773 250.000006  -0.229  
temp:recipeB   0.010857   0.042170 250.000006   0.257   
temp:recipeC   0.002095   0.042170 250.000006   0.050

How should I report this information? Should I report the coefficients for temp, recipeB and recipeC and their corresponding standard errors? Recipe has 3 levels (A, B, C) and mod used A as the reference. Should I also change the reference category in recipe (e.g. to B) and re-run the model?
What about the interactions? Does it make sense to include all of the different combinations in the table?

EDIT: Model with two categorical variables, three levels each

cake2 <- cake[which(cake$temp < 200),]
mod <- lmer(angle~temperature*recipe + (1|replicate), data=cake2)
summary(mod)

Fixed effects:
                       Estimate Std. Error        df t value    
(Intercept)            29.33333    1.67210  17.27956  17.543 
temperature.L           3.44125    1.12498 112.00000   3.059 
temperature.Q          -0.08165    1.12498 112.00000  -0.073  
recipeA                 1.15556    0.91854 112.00000   1.258     
recipeC                 0.20000    0.91854 112.00000   0.218  
temperature.L:recipeA  -2.26274    1.59096 112.00000  -1.422   
temperature.Q:recipeA  -1.19753    1.59096 112.00000  -0.753     
temperature.L:recipeC  -0.75425    1.59096 112.00000  -0.474    
temperature.Q:recipeC   0.81650    1.59096 112.00000   0.513

Here I'm not sure what is irrelevant and can be left out of the table. For example, the way the model is set up, the reference is always temperature=175 and recipe=B. I think I should also report the interaction effects using other references, right? Or will readers still be able to calculate the values of the other effects using only the values from the table above?

Best Answer

Regardless of where, why, & to whom you're reporting results, some general considerations are likely to apply.

In general, tabulating some coefficient estimates but not others may well cause confusion about which model you've in fact fitted; & in particular, reporting coefficient estimates for "main effects" but not for the interactions in which they participate is not very informative. In this case, if you excluded the interactions from the table, we'd learn about the effect of temperature only for Recipe A, the reference level (from the temp coefficient). The interaction coefficients (temp:recipeB, temp:recipeC) show how much the effect of temperature differs for each of the other two recipes relative to Recipe A, so there's no point in not showing them.
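To make that concrete, here is a minimal sketch of how a reader could recover the temperature slope within each recipe from the tabulated coefficients. It assumes the cake data that ships with lme4 and the same model formula as in the question; the names b and slopes are only illustrative.

```r
library(lme4)  # provides lmer() and the cake data used in the question

# Same model as in the question; Recipe A is the reference level
mod <- lmer(angle ~ temp * recipe + (1 | replicate), data = cake)
b <- fixef(mod)

# Temperature slope within each recipe:
# the reference slope plus that recipe's interaction coefficient
slopes <- c(
  A = unname(b["temp"]),
  B = unname(b["temp"] + b["temp:recipeB"]),
  C = unname(b["temp"] + b["temp:recipeC"])
)
slopes
```

With the estimates tabulated in the question, this amounts to 0.153714 for Recipe A, 0.153714 + 0.010857 = 0.164571 for Recipe B, and 0.153714 + 0.002095 = 0.155809 for Recipe C.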

However, reporting on various models that differ only in how they're parametrized is likely overkill. (Just two three-level categorical predictors, & you've got nine different ways to specify the reference levels.) Readers can easily enough calculate estimates & standard errors for any contrasts they might be interested in that aren't explicitly tabulated. For example, the recipeB coefficient represents the effect of changing from Recipe A to Recipe B at 0°F, while recipeC represents the effect of changing from Recipe A to Recipe C at 0°F. If anyone's wondering about the effect of changing from Recipe B to Recipe C at 0°F, it's the difference between the recipeC & recipeB coefficients; there's no need to re-fit the model using Recipe B as the reference level.
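As a sketch of that calculation (again assuming the cake data from lme4 and the question's model), the B-to-C contrast can be computed from the fitted object without re-fitting. The estimate is a plain subtraction of tabulated values; the standard error also needs the covariance between the two estimates, which the summary table does not show but vcov() supplies. The names w, est and se are illustrative.

```r
library(lme4)

mod <- lmer(angle ~ temp * recipe + (1 | replicate), data = cake)
b <- fixef(mod)
V <- as.matrix(vcov(mod))  # covariance matrix of the fixed-effect estimates

# Contrast weights: +1 on recipeC, -1 on recipeB, 0 on everything else
w <- setNames(numeric(length(b)), names(b))
w[c("recipeC", "recipeB")] <- c(1, -1)

est <- sum(w * b)                    # Recipe C minus Recipe B at temp = 0
se  <- sqrt(drop(t(w) %*% V %*% w))  # uses the covariance, not just the two SEs
c(estimate = est, std.error = se)
```

From the table in the question the estimate is -1.941270 - (-3.649206) = 1.707936. If you expect readers to want standard errors for untabulated contrasts, it may be worth supplying the coefficient covariance matrix alongside the table.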

Related Solutions

Simple Linear Regression Reporting – What Information to Include

For a simple linear regression, I would always produce a plot of the x variable against the y variable, with the regression line superimposed on the plot (always plot your data whenever it's feasible!). This will show you very easily how well your model fits, and is easy to read for a one-variable regression. Adding that to what you've already got would probably be sufficient, although you may want to include some diagnostic plots (leverage, Cook's distance, residuals, etc.). It depends on how good that x-y plot is, on your intended audience, and on any protocols that your audience expects.
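A minimal sketch of such a plot in base R follows. The built-in cars data set (speed against stopping distance) is used purely as a stand-in for your own x and y, and fit is an illustrative name.

```r
# cars ships with base R; it stands in for your own data here
fit <- lm(dist ~ speed, data = cars)

# Scatter plot of the data with the fitted regression line superimposed
plot(dist ~ speed, data = cars,
     xlab = "Speed (mph)", ylab = "Stopping distance (ft)")
abline(fit, lwd = 2)

# Default diagnostic plots for an lm fit, including residuals vs leverage
# with Cook's distance contours, arranged in a 2 x 2 grid
op <- par(mfrow = c(2, 2))
plot(fit)
par(op)  # restore the previous graphics settings
```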

$R^2$ vs RMSE

$R^2$ is a relative measure, whereas the RMSE is more of an absolute measure, as you would expect most observations to be within $\pm$RMSE from the fitted line, and nearly all to be within $\pm 2$RMSE. If you want to convey "explanatory power" $R^2$ is probably better, and if you want to convey "predictive power", the RMSE is probably better.
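Both quantities are easy to compute side by side. The sketch below again uses the built-in cars data as a stand-in and also checks what share of observations fall within one and two RMSE of the fitted line. Note that the RMSE here divides by n, whereas the residual standard error printed by summary() divides by the residual degrees of freedom, so the two differ slightly.

```r
fit <- lm(dist ~ speed, data = cars)
res <- residuals(fit)

r2   <- summary(fit)$r.squared  # relative: share of variance explained
rmse <- sqrt(mean(res^2))       # absolute: in the units of the response

# Share of observations within 1 and 2 RMSE of the fitted line
within1 <- mean(abs(res) <= rmse)
within2 <- mean(abs(res) <= 2 * rmse)

c(R2 = r2, RMSE = rmse, within1 = within1, within2 = within2)
```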

Solved – Reporting results of linear mixed-effects model

This may not help answer your question, but I noticed that you have a repeated measure (Day) in your experiment and did not indicate this in your model. I would have expected the random term in your model to look like this:

library(nlme)  # provides lme() and corAR1()
mymodel <- lme(dv ~ Treatment * Day, random = ~ 1 | Subject/Day,
               data = mydf, na.action = na.omit,
               correlation = corAR1(form = ~ 1 | Subject/Day), method = "REML")

As for reporting the results, did you intend to report on the day on which you start seeing significant differences between the treatments? If so, then I think you'll need to look at/report on the contrasts on the interaction term as well. I'm a stats novice myself and basically have the same question as you do :-)

Andy Field's "Discovering Statistics Using R" explains how to report results from a linear mixed effects model in Ch14. I don't have the book at hand but can edit this post once I get my hands on it again.
