Regression Strategies – Understanding Interaction with %ia% in rms and Three-Way Interactions

interaction, regression, regression-strategies, rms, splines

In his very illustrative post on evaluating the added value of predictors, Frank Harrell codes a logistic regression model as follows:

lrm(sigdz ~ rcs(age,4) * sex + rcs(choleste,4) + rcs(age,4) %ia%
         rcs(choleste,4), data=acath)

The "%ia%" expression is new to me. He justifies its use:

The nonlinear interaction between age and cholesterol is a restricted
one such that terms that are nonlinear in both predictors are
excluded. This is to save degrees of freedom.

Although I do not quite understand what's behind the explanation, it makes sense to me that we might want a less demanding interaction, since coding continuous-by-continuous interactions of spline variables does seem to make the model heavier.

Q1: What does %ia% mean, in coding terms? I couldn't find many references on the expression.

Q2: If I wanted to code a three-way interaction between two continuous predictors modelled with rcs() and a categorical one, how could I do so in this df-sparing %ia% manner (in the post's example, involving cholesterol, age and sex)?

I know coding questions are better suited to SO, but this one has more of a statistical background.

Thanks

Best Answer

This would be better posted in https://stackoverflow.com or even better at https://discourse.datamethods.org/t/rms-discussions but here goes:

Suppose you had two predictors $a, b$ that are modeled as quadratic effects using the rms regular polynomial function pol(). With full interactions the model would be specified as y ~ pol(a,2) * pol(b,2), and you'd have these terms in the model: $a, b, a^{2}, b^{2}, ab, a^{2}b, ab^{2}, a^{2}b^{2}$. If you instead use the restricted interaction operator %ia% in the R rms package, following the pattern of the model in the question (main effects plus pol(a,2) %ia% pol(b,2)), you'd drop $a^{2}b^{2}$, the doubly nonlinear term, and keep the rest. %ia% does not extend to three-way interactions.
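
To make the bookkeeping explicit, here is a sketch of the two linear predictors for the quadratic example above. The intercept $\beta_0$ and the coefficient labels $\beta_1, \dots, \beta_8$ are notation introduced here for illustration only; they are not rms output or rms naming.

$$
\begin{aligned}
\text{full:} \quad & \beta_0 + \beta_1 a + \beta_2 a^{2} + \beta_3 b + \beta_4 b^{2} + \beta_5 ab + \beta_6 a^{2}b + \beta_7 ab^{2} + \beta_8 a^{2}b^{2} \\
\text{restricted:} \quad & \beta_0 + \beta_1 a + \beta_2 a^{2} + \beta_3 b + \beta_4 b^{2} + \beta_5 ab + \beta_6 a^{2}b + \beta_7 ab^{2}
\end{aligned}
$$

The interaction therefore costs 3 degrees of freedom instead of 4. By the same counting, and assuming each rcs(x,4) term contributes one linear and two nonlinear columns, a full product of two such terms would have $3 \times 3 = 9$ interaction terms, while the restricted version drops the $2 \times 2 = 4$ nonlinear-by-nonlinear products and keeps 5.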
