Solved – Linear vs polynomial regression

machine-learning, mathematical-statistics, regression

If your data has so many dimensions that you cannot visualize them together,
how do you decide whether your model should be linear or polynomial?

Best Answer

If you took the same course I did, Andrew Ng's machine learning course on Coursera, you will remember that it suggested splitting the data into 3 parts:

  1. Training set
  2. Cross-validation set
  3. Test set

Briefly, you fit each candidate model to the training set. However, you can't use the training error to decide which model is best: adding terms to a model can only make it fit the training set as well or better, so a more complex model never looks worse on the training data even when it is overfitting (that is, when the regression coefficients are "tuned" to noise in the training data). Instead, you choose the model with the lowest error on the cross-validation set. Finally, you check how well the chosen model predicts by evaluating it on the test set, which played no part in fitting or in choosing the model.
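
As a rough sketch of that procedure (my own illustration in Python with NumPy, not code from the course): the helper names `poly_features`, `mse` and `select_degree`, the 60/20/20 split, the candidate degrees and the synthetic data are all assumptions made for the example.

```python
import itertools
import numpy as np

def poly_features(X, degree):
    """Bias column plus every product of input features up to `degree`."""
    n, d = X.shape
    cols = [np.ones(n)]
    for deg in range(1, degree + 1):
        for idx in itertools.combinations_with_replacement(range(d), deg):
            # e.g. idx == (0, 0, 2) gives the column x0 * x0 * x2
            cols.append(np.prod(X[:, list(idx)], axis=1))
    return np.column_stack(cols)

def mse(A, y, w):
    """Mean squared error of the linear model A @ w against y."""
    return float(np.mean((A @ w - y) ** 2))

def select_degree(X, y, degrees=(1, 2, 3, 4), seed=0):
    """Fit one model per degree on the training set, pick the degree with the
    lowest cross-validation error, then score that model once on the test set."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(X))
    n_train = int(0.6 * len(X))   # assumed 60/20/20 split
    n_val = int(0.2 * len(X))
    train, val, test = np.split(order, [n_train, n_train + n_val])

    results = {}
    for degree in degrees:
        A_train = poly_features(X[train], degree)
        # Ordinary least squares on the training set only
        w, *_ = np.linalg.lstsq(A_train, y[train], rcond=None)
        results[degree] = {
            "w": w,
            "train_mse": mse(A_train, y[train], w),
            "val_mse": mse(poly_features(X[val], degree), y[val], w),
        }

    # Model selection uses the cross-validation error, not the training error
    best = min(results, key=lambda deg: results[deg]["val_mse"])
    test_err = mse(poly_features(X[test], best), y[test], results[best]["w"])
    return best, results, test_err

# Synthetic 3-feature data with a quadratic relationship (made up for the demo)
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(300, 3))
y = 1 + 2 * X[:, 0] - X[:, 1] * X[:, 2] + 0.5 * X[:, 0] ** 2 + rng.normal(0, 0.1, 300)

best, results, test_err = select_degree(X, y)
for degree, r in results.items():
    print(degree, r["train_mse"], r["val_mse"])
print("chosen degree:", best, "test MSE:", test_err)
```

Degree 1 here is the plain linear model, so the same loop answers the "linear or polynomial" question without any plotting. The training error should shrink (or at least not grow) as the degree goes up, which is exactly why it can't be used for the choice.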

If your data is very high-dimensional, then it might be best to consider other approaches.
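
One likely reason (my reading, the course may have had others in mind) is that the number of terms in a full polynomial model grows very quickly with the number of features. A full polynomial of degree k in d features has C(d + k, k) terms including the intercept; the helper name `n_poly_terms` below is just for illustration.

```python
import math

def n_poly_terms(n_features, degree):
    """Number of terms (including the intercept) in a full polynomial
    of the given degree in n_features variables: C(n_features + degree, degree)."""
    return math.comb(n_features + degree, degree)

for d in (3, 10, 100):
    print(d, [n_poly_terms(d, k) for k in (1, 2, 3)])
```

With many features, even a quadratic model can have more coefficients than you have training examples.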