Solved – What are the differences between generalized additive model, basis expansion and boosting

boosting, machine learning, splines, terminology

I am confused by these terms:

  • Generalized additive model
  • Basis expansion
  • Boosting

If we fit data with a "spline basis", is it a "generalized additive model"? To me it is just a linear model with a different basis; we could equally use a polynomial basis, a Fourier basis, etc.

Also, there is the notion of "additive" in "generalized additive model". How is that different from the additivity in boosting?

Best Answer

Basis expansion implies a basis function. In mathematics, a basis function is an element of a particular basis for a function space. For example, sines and cosines form a basis for Fourier analysis and can reproduce any waveform shape (square waves, sawtooth waves, etc.) simply by adding enough basis functions together. From Basis (linear algebra): "In mathematics, a set of elements (vectors) in a vector space V is called a basis, or a set of basis vectors, if the vectors are linearly independent and every vector in the vector space is a linear combination of this set." The object of finding basis functions is to create a spanning set. For example, "The real vector space R$^3$ has {(-1,0,0), (0,1,0), (0,0,1)} as a spanning set. This particular spanning set is also a basis. If (-1,0,0) were replaced by (1,0,0), it would also form the canonical basis of R$^3$."
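As a concrete illustration (a minimal numpy sketch; the sample size and number of basis functions are made up for the example), a square wave can be approximated by an ordinary least-squares fit on a sine basis, which is exactly the "adding enough basis functions together" idea:

```python
import numpy as np

# Square wave target on [0, 2*pi)
t = np.linspace(0, 2 * np.pi, 500, endpoint=False)
square = np.sign(np.sin(t))

# Design matrix whose columns are the basis functions sin(t), sin(2t), ..., sin(Kt)
K = 15
B = np.column_stack([np.sin(k * t) for k in range(1, K + 1)])

# Fit the coefficients by ordinary least squares
coef, *_ = np.linalg.lstsq(B, square, rcond=None)
approx = B @ coef

# Odd harmonics dominate, approaching the Fourier series coefficients 4/(pi*k);
# even harmonics are fitted to roughly zero.
print(coef[:4].round(3))
```

Adding more sine terms (larger K) sharpens the approximation of the square wave, at the cost of more coefficients to estimate.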

For machine learning, "basis expansion" is merely a fancy way of saying "adding more linear terms to the model": the inputs are passed through a set of basis functions, and the model remains linear in the resulting coefficients. The term is used precisely once, for example, in "Boosting Algorithms: Regularization, Prediction and Model Fitting" by Peter Bühlmann and Torsten Hothorn, so that if you missed the meaning entirely, you would not be out by much.
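To make that concrete, here is a sketch (plain numpy; the knots and data are invented for the example) of a cubic spline basis expansion fitted as an ordinary linear model. The fit is still nothing more than least squares on an expanded design matrix:

```python
import numpy as np

def truncated_power_basis(x, knots, degree=3):
    """Truncated-power spline basis: 1, x, ..., x^degree, (x - k)_+^degree."""
    cols = [x ** p for p in range(degree + 1)]
    cols += [np.clip(x - k, 0.0, None) ** degree for k in knots]
    return np.column_stack(cols)

rng = np.random.default_rng(0)
x = np.sort(rng.uniform(0, 10, 200))
y = np.sin(x) + rng.normal(scale=0.2, size=x.size)

# "Basis expansion" = new columns in the design matrix; the model stays linear.
B = truncated_power_basis(x, knots=[2.5, 5.0, 7.5])
coef, *_ = np.linalg.lstsq(B, y, rcond=None)
fitted = B @ coef
```

Swapping the spline columns for polynomial or Fourier columns changes only the design matrix, not the fitting procedure; that is the questioner's point.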

Generalized additive model, in that same Bühlmann and Hothorn paper (which I would recommend reading), is just a way of saying that we can add more terms to the model, each a function of a single predictor (e.g., as used for AdaBoost), to get an improvement in algorithm performance. This is plain arithmetic addition of terms, so that covariance, interdependence, or interaction between predictors is ignored. That has its limitations: when random variables combine, their probability densities combine by convolution, which is not arithmetic addition.
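For reference, the standard form of a generalized additive model (Hastie and Tibshirani's formulation, with $g$ a link function and each $f_j$ a smooth function of a single predictor) makes the additive structure explicit:

$$ g\big(\mathbb{E}[Y \mid X]\big) = \beta_0 + f_1(x_1) + f_2(x_2) + \cdots + f_p(x_p) $$

Each $f_j$ is typically represented with a basis expansion (for example, splines), which is precisely the connection between the first two terms in the question. No cross-terms $f_{jk}(x_j, x_k)$ appear, which is the sense in which interaction is ignored.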

Boosting is a machine learning ensemble meta-algorithm, primarily for reducing bias but also variance, in supervised learning; it is a family of algorithms that convert weak learners into strong ones. Boosting is based on the question posed by Kearns and Valiant (1988, 1989): "Can a set of weak learners create a single strong learner?" A weak learner is defined to be a classifier that is only slightly correlated with the true classification (it can label examples better than random guessing), whereas a strong learner is a classifier that is arbitrarily well-correlated with the true classification. Mathematically, the boosted model is a weighted sum of the weak classifiers, $F(x) = \sum_m \alpha_m h_m(x)$; AdaBoost is the best-known example, and other ensemble techniques include random forests and bagging.
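As an illustration of that weighted sum, here is a from-scratch sketch of AdaBoost with decision stumps (minimal and unoptimized, written for clarity rather than production use):

```python
import numpy as np

def fit_stump(X, y, w):
    """Best single-feature threshold classifier under sample weights w; y in {-1, +1}."""
    best = (0, 0.0, 1, np.inf)                    # (feature, threshold, polarity, error)
    for j in range(X.shape[1]):
        for thr in np.unique(X[:, j]):
            for pol in (1, -1):
                pred = pol * np.where(X[:, j] <= thr, 1, -1)
                err = w[pred != y].sum()          # weighted misclassification error
                if err < best[3]:
                    best = (j, thr, pol, err)
    return best

def adaboost(X, y, n_rounds=20):
    """Returns a list of (alpha_m, stump_m) pairs; F(x) = sum_m alpha_m * h_m(x)."""
    w = np.full(len(y), 1.0 / len(y))             # uniform initial sample weights
    ensemble = []
    for _ in range(n_rounds):
        j, thr, pol, err = fit_stump(X, y, w)
        err = max(err, 1e-10)                     # guard against a perfect stump
        alpha = 0.5 * np.log((1 - err) / err)     # the weak learner's vote weight
        pred = pol * np.where(X[:, j] <= thr, 1, -1)
        w *= np.exp(-alpha * y * pred)            # up-weight misclassified points
        w /= w.sum()
        ensemble.append((alpha, (j, thr, pol)))
    return ensemble

def predict(ensemble, X):
    """Sign of the weighted sum of the weak classifiers."""
    F = np.zeros(len(X))
    for alpha, (j, thr, pol) in ensemble:
        F += alpha * pol * np.where(X[:, j] <= thr, 1, -1)
    return np.sign(F)

# Tiny usage example
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([-1, -1, 1, 1])
print(predict(adaboost(X, y, n_rounds=5), X))     # -> [-1. -1.  1.  1.]
```

The $\alpha_m$ values are exactly the "weightings of the classifiers" mentioned above: more accurate weak learners receive larger votes in the final sum.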
