Solved – Why does Lasso do better than SVM

cross-validation, lasso, random forest, rms, svm

I have been evaluating various regression techniques on a regression dataset. I am surprised that the cross-validated RMSE of Lasso is better than that of SVM and Random Forest in my case.

Can this happen? I believed that a non-linear modelling technique like random forest or SVM would do better than a linear model like Lasso.

Is that really possible!?

Best Answer

There is no perfect algorithm. I believe Loess, at least as implemented in R, is limited to ~4 features. Given so few features, the overhead of RandomForests or SVM-regression is likely wasted. It might be that the intrinsic scaling of the data is important and the RandomForest loses that in its trees. For the SVM, it could easily be the difficulty of tuning it properly or choosing the right kernel. If the relationship is simple enough, you don't need to expand into the faux-infinite dimensions of kernel space to understand it.
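To make the "simple relationship" point concrete, here is a minimal sketch on made-up data, not on the asker's dataset. It assumes a single feature with a truly linear signal plus Gaussian noise, uses the closed-form one-feature Lasso solution (soft-thresholding), and swaps in an untuned 1-nearest-neighbour regressor as a stand-in for a flexible non-linear model. The 1-NN model is not SVM or Random Forest, and the names `fit_lasso_1d`, `fit_knn_1d`, and `cv_rmse` are hypothetical helpers written for this illustration.

```python
import math
import random


def fit_lasso_1d(xs, ys, alpha=0.1):
    """One-feature Lasso with an unpenalised intercept (closed form).

    Minimises (1/(2n)) * sum((y - a - b*x)^2) + alpha * |b|.
    """
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    cov = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / n
    var = sum((x - x_mean) ** 2 for x in xs) / n
    # Soft-thresholding: shrink the covariance towards zero by alpha.
    shrunk = math.copysign(max(abs(cov) - alpha, 0.0), cov)
    slope = shrunk / var
    intercept = y_mean - slope * x_mean
    return lambda x: intercept + slope * x


def fit_knn_1d(xs, ys, k=1):
    """k-nearest-neighbour regressor: a flexible, non-linear stand-in."""
    train = list(zip(xs, ys))

    def predict(x):
        nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
        return sum(y for _, y in nearest) / k

    return predict


def cv_rmse(fit, xs, ys, folds=5):
    """K-fold cross-validated RMSE, pooling squared errors over all folds."""
    n = len(xs)
    squared_error = 0.0
    for f in range(folds):
        test_idx = set(range(f, n, folds))
        train_x = [xs[i] for i in range(n) if i not in test_idx]
        train_y = [ys[i] for i in range(n) if i not in test_idx]
        model = fit(train_x, train_y)
        squared_error += sum((model(xs[i]) - ys[i]) ** 2 for i in test_idx)
    return math.sqrt(squared_error / n)


# Synthetic data (an assumption for illustration): y = 2x + 1 + noise.
rng = random.Random(0)
xs = [rng.uniform(0.0, 10.0) for _ in range(300)]
ys = [2.0 * x + 1.0 + rng.gauss(0.0, 1.0) for x in xs]

lasso_rmse = cv_rmse(fit_lasso_1d, xs, ys)
knn_rmse = cv_rmse(fit_knn_1d, xs, ys)
print(f"Lasso CV RMSE: {lasso_rmse:.3f}")
print(f"1-NN  CV RMSE: {knn_rmse:.3f}")
```

When the true signal is linear, the flexible model mostly ends up fitting noise, so the linear model's cross-validated RMSE should come out lower here. A larger `k` or a carefully tuned non-linear model would narrow the gap, which fits the tuning caveat above.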

Having said that, just because Loess does better on this particular training set under cross-validation doesn't mean it will always be better. All models are just approximations.