Solved – How should I model interactions between explanatory variables when one of them may have quadratic and cubic terms

hypothesis-testing, interaction, mixed-model, nonlinear, regression-strategies

I'm sincerely hoping that I have phrased this question in such a way that it can be definitively answered; if not, please let me know and I will try again! I should also note that I will be using R for these analyses.

I have several measures of plant performance (Ys) that I suspect were influenced by four treatments I imposed–flower thinning (X1), fertilization (X2), leaf clipping (X3), and biased flower thinning (X4). For all possible Ys, N is at least 242, so my sample sizes were large. All plots were either subjected to thinning or not, but each plot was also subjected to one (and only one) of the other three treatments (or not–there were control plots also). The idea of this design was to test whether the other three treatments were capable of either "masking" or "enhancing" the effects of thinning. Thus, by design, the latter three treatments (X2-X4) couldn't interact with one another because they weren't crossed, but they can each interact with flower thinning–and they probably do.

My explicit hypotheses are that 1) flower thinning will be significant, and that 2) the interaction terms X1*X2, X1*X3, and X1*X4 between flower thinning and the other three treatments will also be significant. That is, flower thinning should matter, but the ways in which it matters should be altered significantly by what the other three treatments did.

I'd like to include all of this information in a mixed-model:

Y ~ X0 + X1 + X2 + X3 + X4 + X1*X2 + X1*X3 + X1*X4 + (Up to three random effects)
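
For concreteness, a minimal sketch of what this linear-only model might look like in lme4 (the data frame d, the grouping factor block, and the single random intercept are illustrative assumptions, not part of the original design):

library(lme4)

# X1*X2 expands to X1 + X2 + X1:X2 in R formula notation, so the main
# effects don't need to be written out separately.
fit <- lmer(Y ~ X1*X2 + X1*X3 + X1*X4 + (1 | block), data = d)
summary(fit)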

But there is one hang-up: I have good reason to believe the effects of thinning on Y are non-linear. They are probably quadratic, but maybe even cubic in some cases, because the effects of thinning on performance are very likely to increase faster at higher levels of thinning. If I try to model this non-linear relationship via the equation above by adding quadratic and cubic terms for X1, I am then unsure how to model the interaction terms: am I supposed to include every possible product of X1, (X1)^2, and (X1)^3 with each of X2, X3, and X4? That seems like a lot of parameters to estimate, even with the number of data points I have, and I'm unsure how I would interpret the results. That said, I have no biological reason to think this would be an imprudent way to model the situation.
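
Written out in R formula notation, that full expansion would look something like the sketch below (lme4, the data frame d, and the grouping factor block are again illustrative assumptions; raw = TRUE keeps the polynomial terms on their original scale):

# Cubic polynomial in X1 crossed with each of the other treatments:
# 3 polynomial terms + 3 treatment main effects + 3 x 3 = 9 interaction
# coefficients, plus the intercept.
fit_full <- lmer(Y ~ poly(X1, 3, raw = TRUE) * (X2 + X3 + X4) + (1 | block),
                 data = d)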

So, I have three thoughts for how to address this issue:

  1. Fit a smaller model first, e.g. Y ~ X1 + X1^2 + X1^3 + Random effects, with the sole aim of figuring out whether the relationship between thinning and Y is linear, quadratic, or cubic, and then transform thinning via a square- or cube-root to linearize the relationship appropriately. From there, the interaction terms can be modeled as above with the transformed variable (see the sketch after this list).
  2. Assume that significant interactions, if they occur, affect only one of the X1 terms (i.e. only the linear, quadratic, or cubic term), and model the interactions accordingly. I'm not even sure if this approach makes sense.
  3. Just fit the "full model" with every possible interaction term between the thinning terms and the other treatments as discussed above. Then, prune out insignificant interaction terms and use graphs and other techniques to interpret the results.
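
As an illustration of option 1, the two-step approach might look something like this (same illustrative lme4 setup as above; I() protects the power terms inside the formula):

# Step 1: does Y need quadratic or cubic terms in thinning?
fit_poly <- lmer(Y ~ X1 + I(X1^2) + I(X1^3) + (1 | block), data = d)

# Step 2: refit with a transformed thinning variable and the
# interactions of interest.
d$X1t <- sqrt(d$X1)   # illustrative; use whatever transform step 1 suggests
fit_lin <- lmer(Y ~ X1t*X2 + X1t*X3 + X1t*X4 + (1 | block), data = d)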

Which of these approaches, if any, makes the most sense and why, given that I'm interested in hypothesis testing and not model selection? In particular, if #1 above doesn't make sense to do, why is that? I have read this article and this article and have tried to digest what they might mean for me, but any sources for further reading would also be much appreciated!

Best Answer

None of those approaches will work properly. Approach 3 came close, but then you said you would prune out insignificant terms. This is problematic because collinearities make it impossible to determine which terms to remove, and because pruning would give you the wrong degrees of freedom in subsequent hypothesis tests if you want to preserve the type I error rate.

Depending on the effective sample size and signal:noise ratio in your problem I'd suggest fitting a model with all product and main effect terms, and interpreting the model using plots and "chunk tests" (multiple d.f. tests of related terms, i.e., a test for overall interaction, test for nonlinear interaction, test for overall effect including main effect + interaction, etc.). The R rms package makes this easy to do for standard univariate models and for longitudinal models when $Y$ is multivariate normal. Example:

library(rms)

# Predict() and the plotting functions need variable ranges, which rms
# stores via datadist:
dd <- datadist(x1, x2, x3); options(datadist = 'dd')

# Fit a model with restricted cubic splines in x1 and x2 and a tensor
# spline interaction surface for the two.  Model is additive and linear
# in x3.  Note that splines typically fit better than ordinary polynomials.
f <- ols(y ~ rcs(x1, 4) * rcs(x2, 4) + x3)
anova(f)   # get all meaningful hypothesis tests that can be inferred
           # from the model formula
bplot(Predict(f, x1, x2))     # show joint effects of x1 and x2
plot(Predict(f, x1, x2 = 3))  # vary x1 while holding x2 constant at 3

In the anova table you'll see lines labeled All Interactions, which for the whole model test the combined influence of all interaction terms. For an individual predictor this is helpful only when the predictor interacts with more than one variable. There is an option in the print method for anova.rms to show, next to each line in the table, exactly which parameters are being tested against zero. All of this works with mixtures of categorical and continuous predictors.
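
A sketch of that print option (the which argument and its 'subscripts' value are my recollection of the rms interface; see ?print.anova.rms to confirm):

a <- anova(f)
print(a, which = 'subscripts')   # annotate each row with the parameters tested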

If you want to use ordinary polynomials use pol instead of rcs.
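
For instance, an ordinary-polynomial analogue of the spline model above might look like this sketch:

# Cubic polynomials in x1 and x2 with all cross-product terms
f2 <- ols(y ~ pol(x1, 3) * pol(x2, 3) + x3)
anova(f2)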

Unfortunately I haven't implemented mixed-effects models in rms.
