R Metric – How to Compare Models Using Metrics?

mape  metric  r

Using R, I have developed three models:

  • linear regression using lm();
  • decision tree using rpart();
  • k-nearest neighbor using kknn().

I would like to run leave-one-out cross-validation (LOOCV) and compare these models. Which error metric would represent their performance best? Would mean absolute percentage error (MAPE) or symmetric MAPE (sMAPE) be appropriate? Please suggest a metric.

For example, when I ran leave-one-out CV on the linear regression (LR) and decision tree (DT) models, the sMAPE values were 0.16 and 0.20, while the R-squared values of LR and DT were 0.85 and 0.92 respectively. Here sMAPE is computed as sum(abs(predicted - actual) / ((predicted + actual) / 2)) / (number of data points). DT is a pruned regression tree. The R^2 values are computed on the full data set, which contains 60 data points in total.

Model  R^2   sMAPE
 LR    0.85   0.16
 DT    0.92   0.20
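For reference, the sMAPE formula quoted above can be written as a small R function. This is only a sketch: the function names smape and mape and the toy vectors are made up for illustration, and it assumes that actual and predicted values are positive so that no denominator is zero.

```r
# sMAPE exactly as described in the question:
# mean of |predicted - actual| / ((predicted + actual) / 2)
smape <- function(actual, predicted) {
  mean(abs(predicted - actual) / ((predicted + actual) / 2))
}

# Plain MAPE for comparison (undefined when any actual value is 0)
mape <- function(actual, predicted) {
  mean(abs((actual - predicted) / actual))
}

# Toy values, purely illustrative (not the data from the question)
actual    <- c(100, 200, 300)
predicted <- c(110, 180, 330)

smape(actual, predicted)
mape(actual, predicted)
```

Both functions return a fraction rather than a percentage, which matches the scale of the 0.16 and 0.20 values reported in the table.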

Best Answer

Many metrics exist and none of them is the best in general; the right choice depends on your problem and your data. Often several metrics can be used. I find it useful to compute both hypothesis tests and several different metrics (RMSE, MAPE, ...) and check whether they lead to similar results. That way your conclusions are not based on a single metric.
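One possible way to put this into practice is sketched below: compute several metrics from the same leave-one-out predictions, then run a paired test on the per-point errors. Everything in it is an assumption made for illustration: the simulated data stand in for the 60-point data set (which is not shown), the helper names loocv_predict and metrics are invented, the second model is an intercept-only baseline rather than the rpart or kknn models, and the Wilcoxon signed-rank test is just one example of a hypothesis test. Note that R^2 is also computed from the held-out predictions here, so it is on the same footing as the other metrics, whereas an R^2 computed on the full data set tends to favor flexible models such as trees.

```r
# Simulated data (an assumption; not the data from the question)
set.seed(1)
n <- 60
dat <- data.frame(x = runif(n, 1, 10))
dat$y <- 5 + 2 * dat$x + rnorm(n)

# Leave-one-out CV: refit on n - 1 rows, predict the single held-out row
loocv_predict <- function(fit_fun, formula, data) {
  vapply(seq_len(nrow(data)), function(i) {
    fit <- fit_fun(formula, data = data[-i, , drop = FALSE])
    unname(predict(fit, newdata = data[i, , drop = FALSE]))
  }, numeric(1))
}

# Several metrics computed from the same out-of-sample predictions
metrics <- function(actual, predicted) {
  c(RMSE  = sqrt(mean((actual - predicted)^2)),
    MAPE  = mean(abs((actual - predicted) / actual)),
    sMAPE = mean(abs(predicted - actual) / ((predicted + actual) / 2)),
    R2_cv = 1 - sum((actual - predicted)^2) / sum((actual - mean(actual))^2))
}

pred_lr   <- loocv_predict(lm, y ~ x, dat)  # linear regression
pred_base <- loocv_predict(lm, y ~ 1, dat)  # intercept-only baseline

results <- rbind(LR       = metrics(dat$y, pred_lr),
                 Baseline = metrics(dat$y, pred_base))
print(round(results, 3))

# Paired test on absolute errors: do the two models differ systematically?
test <- wilcox.test(abs(dat$y - pred_lr), abs(dat$y - pred_base), paired = TRUE)
print(test$p.value)
```

Any fitting function that accepts a formula and a data argument and has a predict method returning numeric values should be usable in place of lm here; rpart::rpart follows that pattern, while other packages may need a thin wrapper.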
