Solved – Interaction term in multivariate polynomial regression

multiple regression

I'm looking for answer for the question about multivariate polynomial regression.
I can't find a clear explanation of when an interaction term is necessary.

Some sources say that the estimated model of a complete second degree polynomial regression model in two variables $x_{1}$, $x_{2}$ may be expressed as

$$\hat{y} = b_{0} + b_{1}x_{1} + b_{2}x_{2} + b_{3}x_{1}^{2} + b_{4}x_{2}^{2} + + b_{5}x_{1}x_{2}$$

and others don't consider interaction term $x_{1}x_{2}$…

And how does it look when I have for example $d=4$ and $p=3$, a third degree polynomial of four independent variables?

Best Answer

Following the guidelines of Blackwell ("all models are wrong, but some are useful "), you have no obligation of including an interaction term, since:

  1. you may not afford including an interaction term due to a small number of observations in your dataset;
  2. you may have too much noise in your phenomenon so that including interaction terms can be futile, since you seldom will get them statistically significant; and
  3. higher degree interaction terms (like $x_1^3 x_5^2 x_9^5$, for instance) will very, very, very rarely have any meaningful interpretation in terms of your real phenomenon (and it only gets worse if your $y$ is multivariate).

OK, in strict mathematical speak, you can (perhaps ought to) include them, but considering strictly modeling issues, most of time it will not make sense for your first approach to a phenomenon, albeit it can and will make more sense in the course of a series of improvements of an initial model, like, for instance, in the regression of the concentration of one product in function of the concentration of several reagents and catalysts in a very complex chemical reaction.

But please notice that all of it will make sense only if your experiments are extremely well controlled and if you have a horribly awful lot of observations, both conditions more compatible with a rather long series of experiments in a well consolidated research, not in a very first attempt to get to understand, say, a social phenomenon.

And, last but not least, as asked, $$ y = \sum_{i_1=0}^{3} \sum_{i_2=0}^{3-i_1} \sum_{i_3=0}^{3-i_1-i_2} \sum_{i_4=0}^{3-i_1-i_2-i_3} \beta_{i_1i_2i_3i_4} x_1^{i_1}x_2^{i_2}x_3^{i_3}x_4^{i_4} +\varepsilon, $$ which, according to my calculations, will be a hell of a long expression. =)

Something like $$ y = \beta_{0000} + \beta_{1000} x_1+ \beta_{0100} x_2+ \beta_{0010} x_3+ \beta_{0001} x_4+ \beta_{2000} x_1^2+ \beta_{0200} x_2^2+ \beta_{0020} x_3^2+ \beta_{0002} x_4^2+ \beta_{1100} x_1x_2+ \beta_{1010} x_1x_3+ \beta_{1001} x_1x_4+ \beta_{0110} x_2x_3+ \beta_{0101} x_2x_4+ \beta_{0011} x_3x_4+ \beta_{3000} x_1^3+ \beta_{0300} x_2^3+ \beta_{0030} x_3^3+ \beta_{0003} x_4^3+ \beta_{1110} x_1x_2x_3+ \beta_{1101} x_1x_2x_4+ \beta_{0111} x_2x_3x_4+ \ldots+ \varepsilon. $$


Edit: I took the time to write a lazy script in R to generate an expression in LaTeX with all the terms of the sum above, which output this: $$ y = \beta_{0000} + \beta_{0001} x_4 + \beta_{0002} x_4^2 + \beta_{0003} x_4^3 + \beta_{0010} x_3 + \beta_{0011} x_3 x_4 + \beta_{0012} x_3 x_4^2 + \beta_{0020} x_3^2 + \beta_{0021} x_3^2 x_4 + \beta_{0030} x_3^3 + \beta_{0100} x_2 + \beta_{0101} x_2 x_4 + \beta_{0102} x_2 x_4^2 + \beta_{0110} x_2 x_3 + \beta_{0111} x_2 x_3 x_4 + \beta_{0120} x_2 x_3^2 + \beta_{0200} x_2^2 + \beta_{0201} x_2^2 x_4 + \beta_{0210} x_2^2 x_3 + \beta_{0300} x_2^3 + \beta_{1000} x_1 + \beta_{1001} x_1 x_4 + \beta_{1002} x_1 x_4^2 + \beta_{1010} x_1 x_3 + \beta_{1011} x_1 x_3 x_4 + \beta_{1020} x_1 x_3^2 + \beta_{1100} x_1 x_2 + \beta_{1101} x_1 x_2 x_4 + \beta_{1110} x_1 x_2 x_3 + \beta_{1200} x_1 x_2^2 + \beta_{2000} x_1^2 + \beta_{2001} x_1^2 x_4 + \beta_{2010} x_1^2 x_3 + \beta_{2100} x_1^2 x_2 + \beta_{3000} x_1^3 + \varepsilon $$

The script itself is

d <- 3
formula <- "$$y = "  
`%$%` <- paste0
for (i1 in 0:d) {  
    for (i2 in 0:(d-i1)) {  
        for (i3 in 0:(d-i1-i2)) {  
            for (i4 in 0:(d-i1-i2-i3)) {  
                formula <-   
                    formula %$%   
                    "\\beta_{" %$%   
                    i1 %$% i2 %$% i3 %$% i4 %$%   
                    "}" %$%   
                    (if (i1>0) (" x_1" %$% if (i1>1) ("^" %$% i1))) %$%   
                    (if (i2>0) (" x_2" %$% if (i2>1) ("^" %$% i2))) %$%   
                    (if (i3>0) (" x_3" %$% if (i3>1) ("^" %$% i3))) %$%   
                    (if (i4>0) (" x_4" %$% if (i4>1) ("^" %$% i4))) %$%  
                    " + "  
            }  
        }  
    }  
}  
formula <- formula %$% "\\varepsilon$$"  
cat(formula)  

At last, notice that it generated no fourth degree terms, like $\beta_{1111} x_1 x_2 x_3 x_4$, since you required a third degree polynomial ($d=3$) with four variables ($p=4$).