For tree-based models, I've used varImp in caret to extract feature importances; however, this doesn't work with KNN. Can someone explain why that is, and whether it is possible at all? Thanks!
Solved – Is it possible to extract feature importances from a KNN model
Tags: caret, k-nearest-neighbour, machine-learning, r
Related Solutions
Alright, looking into the code of varImp.train I see that, in the case of a classification problem, the variable importance is just computed via the filterVarImp function. So the variables are simply ranked by their AUC, as stated in the varImp documentation under model-independent metrics.
I tested this by calling varImp on each of my models and comparing the variable importance values with the ones computed via filterVarImp on the training data.
## Compute variable importance via the filter approach
varImps.filtered <- filterVarImp(trainData, trainClasses)

## Model-specific importance as reported by varImp
varImps <- list(knn = varImp(models$knn, scale = FALSE),
                pda = varImp(models$pda, scale = FALSE))

## Sort the filtered importances by their average value
## over all classes, in decreasing order
varImps.filtered$Mean <- apply(varImps.filtered, 1, mean)
varImps.filtered <- varImps.filtered[with(varImps.filtered, order(-Mean)), ]
varImps.filtered$Mean <- NULL
... and surprise surprise, the values are exactly the same; varImp merely rounds them and sorts by maximum importance across classes rather than by mean.
> varImps$knn
ROC curve variable importance
variables are sorted by maximum importance across the classes
Class_1 Class_2 Class_3 Class_4 Class_5
V5 0.7094 0.9912 0.9431 0.9231 0.9912
V3 0.3706 0.5631 0.9744 0.9744 0.7831
V9 0.9725 0.9619 0.9725 0.8125 0.8988
V8 0.6887 0.6644 0.8650 0.9531 0.9531
V4 0.9325 0.9194 0.9325 0.6044 0.3138
V10 0.7250 0.8119 0.8544 0.8544 0.8331
V7 0.8169 0.7606 0.8244 0.7025 0.8244
V6 0.3650 0.5775 0.7838 0.8081 0.8081
V11 0.6194 0.7662 0.7662 0.6000 0.6506
V2 0.5138 0.7412 0.7412 0.5938 0.4031
U5 0.5609 0.5731 0.5731 0.4944 0.4834
U4 0.5259 0.5531 0.5531 0.5103 0.5109
U3 0.5134 0.5134 0.5103 0.5384 0.5384
U2 0.5384 0.5203 0.5216 0.5384 0.5219
U1 0.4853 0.5312 0.5312 0.5238 0.4872
> varImps.filtered
Class_1 Class_2 Class_3 Class_4 Class_5
V9 0.9725000 0.9618750 0.9725000 0.8125000 0.8987500
V5 0.7093750 0.9912500 0.9431250 0.9231250 0.9912500
V8 0.6887500 0.6643750 0.8650000 0.9531250 0.9531250
V10 0.7250000 0.8118750 0.8543750 0.8543750 0.8331250
V7 0.8168750 0.7606250 0.8243750 0.7025000 0.8243750
V4 0.9325000 0.9193750 0.9325000 0.6043750 0.3137500
V3 0.3706250 0.5631250 0.9743750 0.9743750 0.7831250
V11 0.6193750 0.7662500 0.7662500 0.6000000 0.6506250
V6 0.3650000 0.5775000 0.7837500 0.8081250 0.8081250
V2 0.5137500 0.7412500 0.7412500 0.5937500 0.4031250
U5 0.5609375 0.5731250 0.5731250 0.4943750 0.4834375
U4 0.5259375 0.5531250 0.5531250 0.5103125 0.5109375
U2 0.5384375 0.5203125 0.5215625 0.5384375 0.5218750
U3 0.5134375 0.5134375 0.5103125 0.5384375 0.5384375
U1 0.4853125 0.5312500 0.5312500 0.5237500 0.4871875
My goal was to come up with a model-specific, stable feature selection method. The only way I see to achieve this now is to use caret's built-in feature selection methods, such as Recursive Feature Elimination (RFE) and Selection by Filter (SBF). As far as I understand it, however, RFE in caret only supports a handful of models out of the box, so I might be forced to implement the rfeControl$functions myself.
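For what it's worth, caret ships a generic helper set, caretFuncs, that lets rfe refit an arbitrary train() model at each subset size, so you may not need to write all of the rfeControl$functions from scratch. A rough sketch for KNN, reusing the trainData/trainClasses objects from above (the subset sizes and tuning settings are illustrative assumptions):

```r
library(caret)

## Sketch: RFE wrapped around caret's train(), here with KNN.
## caretFuncs refits a full train() model for each candidate subset size.
ctrl <- rfeControl(functions = caretFuncs,
                   method    = "cv",
                   number    = 5)

set.seed(42)
rfeFit <- rfe(x          = trainData,
              y          = trainClasses,
              sizes      = c(2, 5, 10),   # candidate subset sizes (illustrative)
              rfeControl = ctrl,
              method     = "knn",         # passed through to train()
              tuneLength = 5)

predictors(rfeFit)   # the selected variables
```

One caveat: with caretFuncs the per-iteration variable ranking again comes from varImp, which for KNN falls back to the filter/AUC ranking discussed above, so the ranking step itself is still not model-specific.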
Best Answer
I don't know of a canned command, but you could always measure how much the mean-squared error (or misclassification rate) increases when a variable is either removed or permuted.
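A minimal sketch of that permutation idea for a caret KNN classifier, assuming the fitted models$knn from the question and some held-out testData/testClasses (those names are assumptions carried over from above):

```r
## Permutation importance: shuffle one predictor at a time and measure
## how much the accuracy drops relative to the unpermuted baseline.
permImportance <- function(fit, x, y, nrep = 10) {
  baseAcc <- mean(predict(fit, x) == y)
  sapply(names(x), function(v) {
    drops <- replicate(nrep, {
      xPerm <- x
      xPerm[[v]] <- sample(xPerm[[v]])   # break the predictor-response link
      baseAcc - mean(predict(fit, xPerm) == y)
    })
    mean(drops)   # average accuracy drop = importance of variable v
  })
}

imp <- sort(permImportance(models$knn, testData, testClasses),
            decreasing = TRUE)
```

Permuting (rather than removing) a column avoids refitting the model for every variable, at the cost of some Monte Carlo noise, which the nrep repetitions average out.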