How well do Multivariate Adaptive Regression Splines work in high dimensional settings

Tags: mars, nonparametric, nonparametric-regression, splines

I have been reading the Hastie and Tibshirani book again lately, and I noticed in Chapter 9 that they mention the MARS algorithm: Multivariate Adaptive Regression Splines, a nonparametric method for fitting a curve to some data.

My question is: how well does this technique work on high-dimensional data? If I remember correctly, one of the key arguments against nonparametric or local regression models is that they do not work very well on high-dimensional problems. In high dimensions, the points are already so far away, that you end up creating very wide bins for the data, and this defeats the purpose of local regression or nonparametric methods.
This problem was one of the reasons why tree-based methods became so popular.

However, the Hastie and Tibshirani book, Elements of Statistical Learning, introduces the MARS algorithm in the same chapter as tree-based methods. So I can't tell whether MARS overcomes the dimensionality problem that other similar methods have. Does anyone know the answer?

Note that I checked Wikipedia and the HT book itself, but they don't address this question.

Best Answer

"In high dimensions, the points are already so far away, that you end up creating very wide bins for the data, and this defeats the purpose of local regression or nonparametric methods."

This 'curse of dimensionality' mostly occurs when the predictors are considered jointly, not when variables are considered one at a time, which is essentially what tree-based methods do, and MARS as well. Within each separate dimension spanned by a predictor variable, the distance between points does not increase.
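To make the distinction concrete, here is a small NumPy sketch of my own (not from the book). The function name `nearest_neighbor_distances` and the uniform test data are assumptions chosen for illustration. It compares the average nearest-neighbour distance in the full joint space with the same quantity along a single coordinate.

```python
import numpy as np

def nearest_neighbor_distances(n_points, n_dims, seed=0):
    """Return (joint, marginal) mean nearest-neighbour distances.

    joint    : Euclidean distance using all n_dims coordinates together
    marginal : distance along the first coordinate only
    """
    rng = np.random.default_rng(seed)
    X = rng.uniform(size=(n_points, n_dims))

    # Pairwise squared distances in the joint space.
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    np.fill_diagonal(d2, np.inf)          # ignore each point's distance to itself
    joint = np.sqrt(np.maximum(d2, 0.0)).min(axis=1).mean()

    # In one dimension the nearest neighbour is an adjacent point after sorting.
    gaps = np.diff(np.sort(X[:, 0]))
    left = np.concatenate(([np.inf], gaps))
    right = np.concatenate((gaps, [np.inf]))
    marginal = np.minimum(left, right).mean()
    return joint, marginal

if __name__ == "__main__":
    for d in (2, 10, 50):
        joint, marginal = nearest_neighbor_distances(500, d)
        print(f"dims={d:3d}  joint={joint:.3f}  marginal={marginal:.4f}")
```

With the sample size held fixed, the joint distance should grow with the number of dimensions while the single-coordinate distance stays at roughly the same small value. That is the sense in which one-variable-at-a-time methods sidestep the problem.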

I expect MARS can do well in high dimensions if the interaction depth is kept low; some sparsity should probably be enforced as well. The latter can be done by not allowing the forward pass to select a large number of terms, and/or not allowing the backward pass to retain a large number of terms.
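As a rough illustration of what "low interaction depth plus a cap on terms" looks like, here is a deliberately simplified forward pass in NumPy. It is a sketch under stated assumptions, not Friedman's full algorithm: it uses additive hinge pairs only (no products, so interaction depth 1), takes candidate knots at a fixed grid of quantiles, and has no backward pruning. The names `forward_pass_additive`, `make_sparse_data`, `max_terms` and `n_knots` are hypothetical, as is the synthetic data.

```python
import numpy as np

def forward_pass_additive(X, y, max_terms=5, n_knots=9):
    """Greedy forward pass with mirrored hinge pairs, no interactions.

    Simplified illustration only: real MARS also allows products of hinges
    and follows the forward pass with a GCV-based backward pruning pass.
    """
    n, p = X.shape
    basis = np.ones((n, 1))              # start from the intercept only
    chosen = []                          # (variable index, knot) per added pair
    # Each step adds two columns, so stop before max_terms would be exceeded.
    while basis.shape[1] + 2 <= max_terms:
        best = None
        for j in range(p):               # variables are scanned one at a time
            knots = np.quantile(X[:, j], np.linspace(0.1, 0.9, n_knots))
            for t in knots:
                cand = np.column_stack([
                    basis,
                    np.maximum(X[:, j] - t, 0.0),   # hinge max(x - t, 0)
                    np.maximum(t - X[:, j], 0.0),   # hinge max(t - x, 0)
                ])
                coef, *_ = np.linalg.lstsq(cand, y, rcond=None)
                rss = float(np.sum((y - cand @ coef) ** 2))
                if best is None or rss < best[0]:
                    best = (rss, j, float(t), cand)
        _, j, t, basis = best
        chosen.append((j, t))
    return chosen, basis

def make_sparse_data(n=400, p=30, seed=0):
    """Synthetic data: 30 predictors, only columns 3 and 7 affect y."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(size=(n, p))
    y = 3.0 * np.maximum(X[:, 3] - 0.5, 0.0) - 2.0 * np.maximum(0.3 - X[:, 7], 0.0)
    return X, y + 0.05 * rng.normal(size=n)

if __name__ == "__main__":
    X, y = make_sparse_data()
    chosen, basis = forward_pass_additive(X, y, max_terms=5)
    print(chosen)                        # list of (variable index, knot) pairs
```

Here `max_terms` plays the role of the cap on the forward pass: with a budget of five columns the fit can afford only two hinge pairs, so it has to spend them on the variables that reduce the residual sum of squares the most. Every candidate is evaluated along a single coordinate, which is why the search does not suffer from sparse neighbourhoods in the joint space.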
