I am running a multiple linear regression with interactions among three independent variables (P, R and L). Y is a continuous variable, and P, R and L are discrete variables representing three different types of payment: P is the payment for doing P, with levels (1, 2, 3, 4, 6, 10); R is the payment for doing R, with levels (1, 3, 5, 7); and L is the penalty for doing L, with levels (1, 2, 4, 6).
Because P, R and L form a payoff matrix with opposing incentives, I expected to find an interaction. However, I am having trouble interpreting these results.
- The value of the intercept (which is the mean of the response when all explanatory variables take the value 0) is negative. I understand why this occurs (the intercept is nothing more than the value at which the regression line crosses the y-axis). This value is meaningless in real life since the dependent variable takes values greater than 0 in the real experiment. I think this is fine, since it is just a consequence of the extension of the line. But should I worry about this when reporting the results?
- I am struggling to find advice on how to interpret the effect sizes (slopes) of the variables and the interactions of this three-way interaction. Can anyone help me interpret these results?
- Finally, would it be advisable to standardize the values of discrete variables? I have not found reliable information on whether it is good practice to standardise numerical discrete variables of the type I have described.
Here is the summary of the model.

    Call:
    lm(formula = Y ~ P * R * L, data = df_z, REML = FALSE)

    Residuals:
         Min       1Q   Median       3Q      Max
    -0.55253 -0.07997 -0.02553  0.04271  0.82344

    Coefficients:
                 Estimate Std. Error t value Pr(>|t|)
    (Intercept) -0.642147   0.125082  -5.134 3.21e-07 ***
    P            0.575138   0.037698  15.257  < 2e-16 ***
    R            0.177976   0.015065  11.813  < 2e-16 ***
    L            1.881339   0.124049  15.166  < 2e-16 ***
    P:R         -0.059157   0.004531 -13.057  < 2e-16 ***
    P:L         -0.582082   0.037636 -15.466  < 2e-16 ***
    R:L         -0.178618   0.012367 -14.443  < 2e-16 ***
    P:R:L        0.057509   0.003758  15.305  < 2e-16 ***
    ---
    Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

    Residual standard error: 0.1768 on 1492 degrees of freedom
    Multiple R-squared:  0.6261,	Adjusted R-squared:  0.6243
    F-statistic: 356.9 on 7 and 1492 DF,  p-value: < 2.2e-16

Best Answer
This is the standard problem with interpreting coefficients for predictors involved in interactions under the usual predictor coding. You just generalize your correct understanding of the intercept. With predictors like yours (reference values of 0), the intercept is the predicted outcome when all predictors have values of 0.
For a single predictor involved in an interaction, the coefficient is the change in predicted outcome per unit increase in that predictor when all of the predictors it interacts with have values of 0. That deals with the individual coefficients for $P$, $R$ and $L$.
The same principle applies at the next level, the two-way interactions. Each two-way interaction coefficient represents the extra association with outcome beyond what you would predict from the 2 individual predictors involved, when the remaining predictor is at 0.
The 3-way interaction coefficient represents the further association with outcome, beyond what all of the lower-order terms predict, when none of the predictors has a value of 0.
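These readings can be checked numerically. The sketch below copies the estimates from the summary in the question and treats each interaction term as the product of the predictors involved; the names `b` and `pred` are made up for this illustration, and nothing here refits the model.

```r
# Estimates copied from the summary output in the question.
b <- c(b0 = -0.642147, P = 0.575138, R = 0.177976, L = 1.881339,
       PR = -0.059157, PL = -0.582082, RL = -0.178618, PRL = 0.057509)

# Predicted outcome, with each interaction written as a product of predictors.
pred <- function(P, R, L) {
  unname(b["b0"] + b["P"] * P + b["R"] * R + b["L"] * L +
           b["PR"] * P * R + b["PL"] * P * L + b["RL"] * R * L +
           b["PRL"] * P * R * L)
}

# Coefficient for P: change per unit of P when R = 0 and L = 0.
pred(1, 0, 0) - pred(0, 0, 0)

# P:R coefficient: what remains after the individual contributions
# of P and R have been removed, with L held at 0.
pred(1, 1, 0) - pred(1, 0, 0) - pred(0, 1, 0) + pred(0, 0, 0)

# P:R:L coefficient: what remains at P = R = L = 1 after removing
# everything the lower-order terms predict (alternating sum over 8 corners).
pred(1, 1, 1) - pred(1, 1, 0) - pred(1, 0, 1) - pred(0, 1, 1) +
  pred(1, 0, 0) + pred(0, 1, 0) + pred(0, 0, 1) - pred(0, 0, 0)
```

The three differences reproduce the `P`, `P:R` and `P:R:L` estimates, which is all the coefficients mean: contrasts between predictions at 0 and 1, most of which lie outside the observed range of the data.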
If you recognize that the interactions are simply products of predictors, you can see this by just writing out all the coefficients and interaction terms explicitly:
$$Y = \beta_0 + \beta_P P + \beta_R R + \beta_L L + \beta_{PR} PR + \beta_{PL} PL +\beta_{RL} RL + \beta_{PRL} PRL.$$
Note which terms are left when different combinations of $P$, $R$ and $L$ have values of 0.
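To make that concrete, here is the same equation with some predictors set to 0, followed by the slope of $P$ at arbitrary values of $R$ and $L$. The numerical value uses the estimates reported in the question and is only an illustration; $R = L = 1$ was picked because it is the lowest observed combination.

$$\begin{aligned}
R = L = 0:&\quad Y = \beta_0 + \beta_P P \\
L = 0:&\quad Y = \beta_0 + \beta_P P + \beta_R R + \beta_{PR} PR \\
\text{any } R, L:&\quad \frac{\partial Y}{\partial P} = \beta_P + \beta_{PR} R + \beta_{PL} L + \beta_{PRL} RL
\end{aligned}$$

$$\left.\frac{\partial Y}{\partial P}\right|_{R=1,\,L=1} = 0.575138 - 0.059157 - 0.582082 + 0.057509 = -0.008592$$

So the large positive coefficient for $P$ (0.575) describes a slope at $R = L = 0$, which is outside the observed range; at the lowest observed levels of $R$ and $L$ the estimated slope of $P$ is close to 0.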
Your difficulty with interpreting the coefficients arises in part from the fact that none of your predictors ever takes on a value of 0. You might find the coefficients easier to interpret if you just subtracted 1 from all of your predictor values. That way the intercept would represent a realistic possibility and the interactions might be easier to think about. That said, any predictions made from the model will be the same regardless.
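A minimal sketch of that recentering, on simulated data: the design levels are the ones listed in the question, but the response, the 5 replicates per cell, the generating coefficients and all object names (`d`, `fit_raw`, `fit_shift`) are assumptions made up for the example.

```r
# Simulated stand-in for the real data (only the design levels come from the question).
set.seed(1)
d <- expand.grid(P = c(1, 2, 3, 4, 6, 10), R = c(1, 3, 5, 7), L = c(1, 2, 4, 6))
d <- d[rep(seq_len(nrow(d)), each = 5), ]  # 5 replicates per cell, an arbitrary choice
d$Y <- with(d, 0.5 + 0.3 * P + 0.2 * R - 0.1 * L - 0.02 * P * R * L) +
  rnorm(nrow(d), sd = 0.2)

# Original coding: predictors start at 1.
fit_raw <- lm(Y ~ P * R * L, data = d)

# Shifted coding: subtract 1 so that 0 is an observed level of every predictor.
d_shift <- transform(d, P = P - 1, R = R - 1, L = L - 1)
fit_shift <- lm(Y ~ P * R * L, data = d_shift)

# The lower-order coefficients change with the coding ...
cbind(raw = coef(fit_raw), shifted = coef(fit_shift))

# ... but the fitted values do not.
all.equal(unname(fitted(fit_raw)), unname(fitted(fit_shift)))

# The shifted intercept is the raw model's prediction at P = R = L = 1.
pred_111 <- predict(fit_raw, newdata = data.frame(P = 1, R = 1, L = 1))
all.equal(unname(coef(fit_shift)["(Intercept)"]), unname(pred_111))
```

Only the highest-order coefficient (`P:R:L`) is unaffected by the shift; every lower-order coefficient is re-expressed relative to the new zero point.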
Finally, note that you have implicitly assumed that each of your "discrete" predictors is behaving as if it is continuous, with each unit increase having the same effect on outcome regardless of where it is along its scale. That's something to consider beyond how to interpret the interaction terms. As you are effectively assuming the "discrete" predictors to be continuous you certainly could standardize them as if they were continuous, but I don't see that would improve interpretability at all and, again, predictions from the model would end up the same.
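One possible way to look at both points, again on simulated data (the response, the replicate count and the object names are assumptions for illustration, not the asker's data): compare the numeric-predictor model against a model that treats every level as its own category, and confirm that standardizing leaves the fitted values alone.

```r
# Simulated stand-in for the real data; here the true effect of P is deliberately non-linear.
set.seed(2)
d <- expand.grid(P = c(1, 2, 3, 4, 6, 10), R = c(1, 3, 5, 7), L = c(1, 2, 4, 6))
d <- d[rep(seq_len(nrow(d)), each = 5), ]  # 5 replicates per cell, an arbitrary choice
d$Y <- with(d, 0.5 + 0.3 * log(P) + 0.2 * R - 0.1 * L) + rnorm(nrow(d), sd = 0.2)

# Predictors as numbers: each unit step is assumed to have the same effect.
fit_num <- lm(Y ~ P * R * L, data = d)

# Predictors as categories: one mean per (P, R, L) cell, no linearity assumed.
fit_fac <- lm(Y ~ factor(P) * factor(R) * factor(L), data = d)

# The numeric model is nested in the cell-means model, so an F-test
# compares them; a small p-value suggests the linear coding is too restrictive.
anova(fit_num, fit_fac)

# Standardizing the numeric predictors changes coefficients, not fitted values.
d_z <- transform(d,
                 P = as.numeric(scale(P)),
                 R = as.numeric(scale(R)),
                 L = as.numeric(scale(L)))
fit_z <- lm(Y ~ P * R * L, data = d_z)
all.equal(unname(fitted(fit_num)), unname(fitted(fit_z)))
```

With 96 cells the categorical model has many more parameters, so this comparison is a check on the linearity assumption rather than a recommendation to report the categorical fit.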