Solved – Difficulty calculating predictors' relative importance for a GAM

generalized-additive-model, importance, nonlinear regression, regression

Although there is no agreed-upon definition of "relative importance of predictors" even for linear models (one possible definition is the lmg method), I would still like to know whether there are acceptable ways to compute it for a Generalized Additive Model.

It's natural to ask which predictor is more important or useful (quantitatively, e.g., as a percentage), isn't it?

I found that the relaimpo package can calculate several relative importance metrics for linear models, but it cannot handle GAMs.
Here is an example:

library(relaimpo)
library(mgcv)
gam1 <- gam(mpg ~ s(drat) + s(wt) + s(qsec), data = mtcars, method = "REML")
summary(gam1)

From the summary() output, we can see which predictors are "significant" by p-value:

Approximate significance of smooth terms:
          edf Ref.df      F  p-value    
s(drat) 1.000  1.000  0.523 0.476069    
s(wt)   2.487  3.028 21.950 1.59e-08 ***
s(qsec) 1.000  1.000 15.241 0.000545 ***

But we don't know their "relative importance". For example, can we get information like the following?

`wt` has a relative importance of 60%, 
`qsec` has a relative importance of 30%, 
`drat` has a relative importance of 10%. 

What's worse, because a GAM doesn't have a true R-squared, I suppose the lmg method cannot be applied.
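
Note that summary() on an mgcv fit does report an adjusted R-squared (r.sq) and the proportion of deviance explained (dev.expl). As a rough sketch rather than an established importance measure, one could compare the full model against refits that each leave out one smooth and look at the drop in deviance explained. The names full, reduced and drop_in_dev are made up for illustration, and because the smoothing parameters are re-estimated in every refit, the drops are only indicative and need not add up to the full model's deviance explained.

```r
library(mgcv)

# Full model from the question.
full <- gam(mpg ~ s(drat) + s(wt) + s(qsec), data = mtcars, method = "REML")
s_full <- summary(full)
s_full$r.sq      # adjusted R-squared reported by mgcv
s_full$dev.expl  # proportion of deviance explained

# Refit with one smooth removed at a time (illustrative names).
reduced <- list(
  drat = mpg ~ s(wt) + s(qsec),
  wt   = mpg ~ s(drat) + s(qsec),
  qsec = mpg ~ s(drat) + s(wt)
)

# Drop in deviance explained when each predictor is left out.
drop_in_dev <- sapply(reduced, function(f) {
  fit <- gam(f, data = mtcars, method = "REML")
  s_full$dev.expl - summary(fit)$dev.expl
})
drop_in_dev
```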

Best Answer

The caret package provides one answer. With the default tuneGrid and trainControl,

library(caret)
data("mtcars")
gam1 <- train(
  mpg ~ drat + wt + qsec, 
  data = mtcars, 
  method = "gam"
)

and you can then apply varImp.

varImp(gam1)
## gam variable importance
##      Overall
## wt     100.0
## qsec    26.4
## drat     0.0

To get something like the percentages you asked for, you can rescale the returned object:

library(dplyr)
x <- varImp(gam1)
x$importance %>%
  mutate(
    Variable = rownames(.), Overall = Overall / sum(Overall) * 100
  ) %>% 
  arrange(desc(Overall)) %>%
  select(Variable, Overall)
##   Variable Overall
## 1       wt   79.09
## 2     qsec   20.91
## 3     drat    0.00
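
The same rescaling can be sketched in base R without dplyr. The snippet hard-codes the rounded importances from the varImp() printout above, purely for illustration, so its percentages differ slightly in the second decimal from those computed on the unrounded values.

```r
# Rounded importances copied from the varImp() printout (illustrative only).
imp <- c(wt = 100.0, qsec = 26.4, drat = 0.0)

# Rescale so the values sum to 100, then sort from largest to smallest.
pct <- sort(imp / sum(imp) * 100, decreasing = TRUE)
round(pct, 2)
```

Keep in mind that varImp() rescales the scores to a 0-100 range by default, so the least important predictor always ends up at 0; varImp(gam1, scale = FALSE) returns the unscaled scores instead.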

Because the default setup does not tune the splines or their degrees of freedom, you should check how to do this in the caret package. method = 'gam' calls the mgcv package, but there are plenty of other options. For instance, method = 'gamSpline' tunes over the degrees of freedom and gives a different varImp result.
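
To check which parameters caret tunes for a given method, modelLookup() lists them. A minimal sketch, assuming caret is installed; the object names lookup_gam and lookup_spline are only illustrative.

```r
library(caret)

# Tuning parameters caret exposes for each method.
lookup_gam    <- modelLookup("gam")        # fitted via mgcv
lookup_spline <- modelLookup("gamSpline")  # fitted via the gam package

lookup_gam[, c("parameter", "label")]
lookup_spline[, c("parameter", "label")]
```

Swapping method = "gamSpline" into the train() call above should then tune over df, provided the gam package is installed.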

Be wary of what caret is doing under the hood, however: if a predictor has few distinct values, caret may enter that term as linear rather than as a smooth.
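
One way to anticipate this is to count the distinct values of each predictor before fitting. A minimal sketch in base R; the cutoff of 10 is an assumption about caret's internal rule, not something stated above, so check it against the caret version you have installed.

```r
predictors <- c("drat", "wt", "qsec")

# Number of distinct values per predictor.
n_distinct <- sapply(mtcars[, predictors], function(x) length(unique(x)))
n_distinct

# Hypothetical cutoff: assumed for illustration, not confirmed.
cutoff <- 10

# Predictors that might be entered as linear terms under that assumption.
names(n_distinct)[n_distinct <= cutoff]
```

After fitting, formula(gam1$finalModel) should show which terms were actually wrapped in s().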
