Solved – Partial residual plot with interactions

data visualization, descriptive statistics, multiple regression, regression

The NIST website's description of the partial residual plot says that it plots

$$
\text{Res}+\hat\beta_iX_i\text{ versus } X_i
$$

where

  • $\text{Res}$ = residuals from the full model
  • $\hat\beta_i$ = regression coefficient of $X_i$
  • $X_i$ = the $i$th independent variable in the full model

This description is clear enough when the full model is of the form

$$
Y=\beta_0+\beta_1X_1+\beta_2X_2+\cdots+\beta_iX_i+\cdots
$$
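For concreteness, the quantity being plotted in this no-interaction case can be sketched as follows (Python with NumPy rather than R, on simulated data; the variable names are my own):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 + 2.0 * x1 - 1.5 * x2 + rng.normal(scale=0.5, size=n)

# Fit the full model Y = b0 + b1*X1 + b2*X2 by least squares.
X = np.column_stack([np.ones(n), x1, x2])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
res = y - X @ beta

# Partial residual for X1: Res + b1*X1, to be plotted versus X1.
pres_x1 = res + beta[1] * x1
```

Because the residuals are orthogonal to every column of the design matrix, a simple regression of these partial residuals on $X_1$ recovers $\hat\beta_1$ exactly, which is why the plot's reference line has slope $\hat\beta_1$.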

However, consider the following full models, which include polynomial and interaction terms:

$$
(1)\qquad\qquad Y=\beta_0+\beta_1X_1+\cdots+\beta_iX_i+\cdots+\beta_kX_i^2+\cdots
$$
$$
(2)\qquad\qquad Y=\beta_0+\beta_1X_1+\cdots+\beta_iX_i+\cdots+\beta_kX_iX_j+\cdots
$$
$$
(3)\qquad\qquad Y=\beta_0+\beta_1X_1+\cdots+\beta_iX_i+\cdots+\beta_kX_iD+\cdots
$$

where $D$ is a categorical dummy variable (i.e. it takes the value 0 or 1) in the third model. Would the partial residual plot in these cases neglect the $\beta_k$ (i.e. interaction) term, or would you plot (these are my own suggestions):

$$
(1)\qquad\qquad \text{Res}+\hat\beta_iX_i+\hat\beta_kX_i^2\text{ versus }X_i
$$
$$
(2)\qquad\qquad \text{Res}+\hat\beta_iX_i+\hat\beta_kX_iX_j\text{ versus }X_i
$$
$$
(3)\qquad\qquad \text{Res}+\hat\beta_iX_i+\hat\beta_kX_iD\text{ versus }X_i
$$

Suggestion (1) makes sense to me, but (2) and (3) do not, since what would you take as the values of $X_j$ and $D$ in the plot? The observed values that correspond to each $X_i$ in the data set?
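For case (1), where every term involves only $X_i$, the suggested partial residual is unambiguous to compute. A minimal NumPy sketch on simulated data (my own variable names, with $X_2$ playing the role of $X_i$):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 + 0.5 * x1 + 2.0 * x2 + 1.0 * x2**2 + rng.normal(scale=0.3, size=n)

# Full model (1): Y = b0 + b1*X1 + b2*X2 + bk*X2^2
X = np.column_stack([np.ones(n), x1, x2, x2**2])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
res = y - X @ beta

# Suggested partial residual for X2: Res + b2*X2 + bk*X2^2, plotted versus X2.
pres = res + beta[2] * x2 + beta[3] * x2**2
```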

As far as I know, partial residual plots in R do not even support interaction terms. So I'd like someone to confirm how cases (1), (2) and (3) should be properly handled for a partial residual plot.

Best Answer

There is a simple procedure for your case (2) illustrated at https://stackoverflow.com/a/24964685, but there the x-axis shows the product $X_iX_j$, not $X_i$. The underlying logic, I assume, is that if you are concerned about the linearity of the relationship captured by the $\beta_k$ term, then you construct the partial residual with that term's regressor, which in (2) is the product $X_iX_j$. I guess the same approach could also be used for case (1); interestingly, plotting $e + \hat{b_2} x_2 + \hat{b_{22}} x_2^2$ against $x_2$ is already discussed in R. Dennis Cook's paper "Exploring partial residual plots" (see Section 3, p. 354).
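The Stack Overflow approach for case (2) can be sketched as follows, treating the product $X_iX_j$ as its own regressor (a NumPy illustration on simulated data, not the code from that answer):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 + x1 + x2 + 0.8 * x1 * x2 + rng.normal(scale=0.3, size=n)

# Full model (2): Y = b0 + b1*X1 + b2*X2 + bk*(X1*X2)
X = np.column_stack([np.ones(n), x1, x2, x1 * x2])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
res = y - X @ beta

# Treat the product X1*X2 as regressor k:
# plot Res + bk*(X1*X2) against X1*X2 (not against X1).
pres = res + beta[3] * (x1 * x2)
```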

I think a more general way to deal with interactions in partial residual plots can be found in John Fox and Sanford Weisberg's Visualizing Lack of Fit in Complex Regression Models: Adding Partial Residuals to Effect Displays (see pp. 19 ff.). These ideas are implemented in the effects CRAN package by John Fox and collaborators. In a question I asked on Stack Overflow about one of its functions, I provide code that shows how to compute and plot those partial residuals for your cases (2) and (3).

Your case (3) is handled in the effects package by evaluating that sum as you indicate (each case at its observed values of $X_i$ and $D_i$) and then plotting the partial residual vs $X_i$ conditioning on $D$ (i.e., with different panels for different values of $D$).
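That evaluation for case (3) can be sketched as follows (a NumPy illustration on simulated data of the same idea, not the effects package's actual code):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 200
x = rng.normal(size=n)
d = rng.integers(0, 2, size=n)  # dummy variable D
y = 1.0 + 2.0 * x + 1.5 * d + 0.7 * x * d + rng.normal(scale=0.3, size=n)

# Full model (3): Y = b0 + b1*X + b2*D + bk*(X*D)
X = np.column_stack([np.ones(n), x, d, x * d])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
res = y - X @ beta

# Evaluate the partial residual at each case's observed X and D,
# then split by D to mimic the separate conditioning panels.
pres = res + beta[1] * x + beta[3] * x * d
pres_d0, pres_d1 = pres[d == 0], pres[d == 1]
```

Within each panel the reference line then has slope $\hat\beta_1$ (for $D=0$) or $\hat\beta_1+\hat\beta_k$ (for $D=1$).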

With interactions between continuous predictors (case (2)), one must slice one of the predictors, and that can introduce some bias (see slides 21 and 34 of Fox and Weisberg).
