Regression – Importance of Optimizing the Correct Loss Function Explored

loss-functions, measurement-error, optimization, regression

I want to understand the importance of optimizing the correct loss function. Say that I am building a linear regression model $p$ for predicting some values $y_1,\ldots,y_n$.

I choose to fit my linear model so that it minimizes the mean squared error (MSE). Now I send my model off to a statistical prediction competition where, instead of MSE, the error metric is the mean absolute error (MAE, so no squares). How will this affect my model's predictive power? In general, what can be said about the importance of optimizing the "correct" loss function?

Edit: To put the question in a more specific context, that context would be predictive modelling competitions like those featured on Kaggle.com. I want to understand the importance of choosing models and loss functions that correspond to the evaluation metric for the competition. One reason for asking is this comment by the winner of a Kaggle competition.
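
For concreteness, here is a minimal sketch of the mismatch I am asking about (made-up data with heavy-tailed noise to produce outliers; the particular numbers and seed are only illustrative): fit a line by minimizing MSE and by minimizing MAE, then score both fits with MAE.

    import numpy as np
    from scipy.optimize import minimize

    rng = np.random.default_rng(0)
    x = rng.uniform(0, 10, 200)
    y = 2.0 * x + 1.0 + rng.standard_t(df=1, size=200)  # heavy-tailed noise => occasional big outliers

    def mse(params):
        a, b = params
        return np.mean((y - (a + b * x)) ** 2)

    def mae(params):
        a, b = params
        return np.mean(np.abs(y - (a + b * x)))

    fit_mse = minimize(mse, x0=[0.0, 0.0]).x                        # "train" on MSE
    fit_mae = minimize(mae, x0=[0.0, 0.0], method="Nelder-Mead").x  # "train" on MAE

    print("MAE of the MSE-optimized line:", mae(fit_mse))
    print("MAE of the MAE-optimized line:", mae(fit_mae))
    # With outliers present, the MSE-optimized line typically scores worse on MAE.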

Best Answer

Say that I am building a linear regression model $p$ for predicting some values $y_1,\ldots,y_n$.

If the data contain a few extreme outliers in the response - or even just one - the MSE-fitted equation can be pulled arbitrarily far away from the MAE-fitted one.

Consider the simplest regression model (just an intercept, $\alpha$), and the following data:

  0.0003 0.0001 0.0002 0.0004 50000 0.0002 0.0004 0.0003 0.0001 0.0003

The MAD solution is $\alpha = 0.0003$ (the median of the data). The MSE solution is $\alpha = 5000.00023$ (the mean).

Taking MAD here to be the median absolute deviation of the residuals, the MAD of the minimum-MAD solution is about 0.0001, while the MAD of the minimum-MSE solution is about 5000. You can potentially do very badly if you use MSE when the criterion is MAD.
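
To reproduce these figures, here is a short NumPy sketch (reading MAD as the median absolute deviation of the residuals, the interpretation under which the numbers above work out):

    import numpy as np

    y = np.array([0.0003, 0.0001, 0.0002, 0.0004, 50000,
                  0.0002, 0.0004, 0.0003, 0.0001, 0.0003])

    alpha_mse = y.mean()      # the mean minimizes mean squared error
    alpha_mad = np.median(y)  # the median minimizes mean absolute error (and attains the minimum MAD for this data)

    def mad(alpha):
        # median absolute deviation of the residuals from a constant fit alpha
        return np.median(np.abs(y - alpha))

    print("MSE solution:", alpha_mse)                  # ~5000.00023
    print("MAD solution:", alpha_mad)                  # 0.0003
    print("MAD at the MAD solution:", mad(alpha_mad))  # ~0.0001
    print("MAD at the MSE solution:", mad(alpha_mse))  # ~5000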
