Solved – Interpreting partial dependence plots (marginal effects) using plotmo

boosting, machine learning, r, random forest

I'm computing partial dependence plots to visualize the relationship between predictors (including interactions) and the outcome. I study patients with brain tumours, where mortality is very high (90%). I use a random forest (randomForest package) for binary classification (dead: yes/no), and the plotmo package, which appears to be outstanding for this purpose, to visualize the relationships. Below is the output from plotmo, showing the interaction between age and metastasis:

[Image: plotmo output for the interaction between age and metastasis, with the questions annotated on the plot]

My questions are annotated on the image above. The built-in partial dependence plots in the randomForest package do not clarify these points, and neither do the package tutorials on CRAN.

I'm deeply grateful for any thoughts you guys may offer.

Best Answer

About the vertical axis:

The plotmo function calls predict internally to generate the graph. So the vertical axis will be plotted in whatever units predict returns for your model.

In your example for a randomForest model, the default prediction type is "response" (see the help page for predict.randomForest). Change this by passing type="prob" to plotmo, which plotmo will pass on internally to predict.randomForest.

For example (see the two graphs on the left):

library(rpart.plot); data(ptitanic)  # for ptitanic data
library(randomForest)
library(plotmo)
dat <- ptitanic[, c("survived", "sex", "pclass")]  # classic example
mod <- randomForest(survived ~ ., data = dat)
plotmo(mod)                 # default predict type is "response"
plotmo(mod, type = "prob")  # plot predicted probabilities instead

[Image: plotmo graphs produced by the examples in this answer]
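To see why the vertical axis changes, it can help to call predict directly and compare what the two prediction types return. The following is a minimal sketch that reuses the ptitanic example; the seed value and the names resp and prob are illustrative choices, not part of the original example.

```r
library(rpart.plot)    # provides the ptitanic data
library(randomForest)

data(ptitanic)
dat <- ptitanic[, c("survived", "sex", "pclass")]

set.seed(1)  # random forests are stochastic; the seed is an arbitrary choice
mod <- randomForest(survived ~ ., data = dat)

# Default type = "response": a factor of predicted classes
resp <- predict(mod, newdata = dat)
class(resp)
levels(resp)

# type = "prob": a numeric matrix with one column of probabilities per class
prob <- predict(mod, newdata = dat, type = "prob")
colnames(prob)
range(prob)
```

With type="response" plotmo has only class labels to work with, so the vertical axis cannot be read as a probability; with type="prob" the plotted values lie between 0 and 1.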

About the horizontal axes:

Since the persp function accepts only numeric (not factor) arguments, plotmo converts factor variables to numeric before invoking persp internally. Thus a factor with levels "blue", "green", "red" will be plotted as 1, 2, 3 in plotmo's persp plot (1, 2, 3 are the integer codes R uses internally to represent the factor levels).
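The integer codes behind a factor can be inspected in base R, which makes it possible to map a tick value on the persp axis back to its level. A small sketch with a made-up factor (the variable name colour is hypothetical):

```r
colour <- factor(c("red", "blue", "green", "blue"))

levels(colour)      # levels are sorted alphabetically by default
as.integer(colour)  # the internal integer code of each observation

# Map an axis value back to its label, e.g. the value 2 on the axis
levels(colour)[2]
```

The same lookup applies to any factor predictor in the model: the axis value k corresponds to the k-th entry of levels() for that variable.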

Axis labels:

Get more information on the axes by invoking persp with ticktype="detailed". To do this, pass persp.ticktype="detailed" to plotmo. Any plotmo argument prefixed by persp. is passed on internally to the persp function, as described near the bottom of the plotmo help page.

For example (see the two graphs on the right):

plotmo(mod, type="prob", persp.ticktype="detailed")
plotmo(mod, type="prob", persp.ticktype="detailed", persp.nticks=3)

Bear in mind that plotmo determines the response axis range automatically, which can sometimes give surprising results. See the ylim argument of plotmo and Section 5.4 of the vignette for the plotmo package.
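As a sketch of how the axis range can be controlled, the calls below set ylim explicitly, again on the ptitanic example. The seed is an arbitrary choice, and the interpretation of ylim=NA follows the plotmo help page as I understand it, so check the help page for your installed version.

```r
library(rpart.plot)    # provides the ptitanic data
library(randomForest)
library(plotmo)

data(ptitanic)
dat <- ptitanic[, c("survived", "sex", "pclass")]

set.seed(1)  # arbitrary seed, for repeatability only
mod <- randomForest(survived ~ ., data = dat)

# Force the response axis to span the full probability range
plotmo(mod, type = "prob", ylim = c(0, 1))

# Let each graph choose its own response axis range
plotmo(mod, type = "prob", ylim = NA)
```

Fixing the range to c(0, 1) makes small effects look small, whereas the automatic range can visually exaggerate a difference of a few percentage points.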

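Note that the graphs above are not partial dependence plots: by default, plotmo holds the variables that are not being plotted at fixed values (medians for numeric variables, the first level for factors). To plot partial dependence graphs, pass pmethod="partdep" to plotmo. (The type argument is passed on to predict, so it cannot be used for this.) See Chapters 1 and 9 of the plotmo vignette.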
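A minimal sketch of a partial dependence call on the same ptitanic example follows. It assumes a plotmo version that provides the pmethod argument; the seed is an arbitrary choice.

```r
library(rpart.plot)    # provides the ptitanic data
library(randomForest)
library(plotmo)

data(ptitanic)
dat <- ptitanic[, c("survived", "sex", "pclass")]

set.seed(1)  # arbitrary seed, for repeatability only
mod <- randomForest(survived ~ ., data = dat)

# Partial dependence: average the predicted probability over the
# observed values of the variables that are not being plotted
plotmo(mod, type = "prob", pmethod = "partdep")

# Approximate partial dependence: averages over a subset of the data,
# which is faster on large datasets
plotmo(mod, type = "prob", pmethod = "apartdep")
```

Partial dependence plots can be slow for large datasets or slow predict methods, which is why the approximate variant is worth trying first.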