Solved – Mean Squared Error changes according to the scale of values in a machine learning regression problem

machine-learning, measurement-error, model-evaluation, python, regression

I am working on a machine learning regression problem and I have chosen the metrics Mean Absolute Error (MAE) and Mean Squared Error (MSE). I have 3 features: two of them have values in the range 0.007–0.009 and the third feature's values range from 1.18 to 1.19. The predicted value/output should also lie in the range 0.007–0.009. When I implement a simple linear regression model using scikit-learn in Python, I get an MSE of about 2.037727147668752e-07. However, I noticed that if I multiplied all my features and the value to be predicted by, say, 100, the MSE changed to 0.0024. Can someone please explain:

  1. If the MSE is a metric that is to be used on a relative scale, how do I interpret it? Does an MSE of 0.002 mean that if my actual value is 0.008, my predicted value is 0.008 +/- 0.002, i.e. 0.006 or 0.010?

  2. This seems like a large difference between the actual and predicted values. Are there any specific regression models that work well for this kind of problem? Will normalizing or scaling the data help improve performance, and if so, why?

  3. I noticed that MAE remained constant regardless of the scale. Is this an absolute error measure?

  4. What other metric can I use to evaluate the performance of my model?

Best Answer

When I implement a simple linear regression model using scikit-learn in Python, I get an MSE of about 2.037727147668752e-07. However, I noticed that if I multiplied all my features and the value to be predicted by, say, 100, the MSE changed to 0.0024.

When you multiply your training data by 100, your predictions will also be multiplied by (about) 100. The MSE is the mean of the squared differences between actuals and predictions. If you scale both the actuals and (roughly) the predictions by a factor of 100, each difference is also scaled by 100, so each squared difference is scaled by 100^2 = 10,000. That is consistent with what you observe: 2.04e-07 × 10,000 ≈ 0.002, close to your 0.0024. The features themselves have nothing to do with this effect; it is the scale of the target that matters.
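As a quick sanity check, here is a minimal sketch (using made-up synthetic data in roughly the ranges you describe, not your actual data) showing that refitting a scikit-learn linear regression on data multiplied by 100 multiplies the MSE by 100^2 = 10,000:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)
n = 200
X = np.column_stack([
    rng.uniform(0.007, 0.009, n),   # made-up feature 1
    rng.uniform(0.007, 0.009, n),   # made-up feature 2
    rng.uniform(1.18, 1.19, n),     # made-up feature 3
])
# Made-up target in roughly the 0.007-0.009 range.
y = 0.5 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(0, 1e-4, n)

mse_original = mean_squared_error(y, LinearRegression().fit(X, y).predict(X))

# Multiply features and target by 100 and refit.
factor = 100
mse_scaled = mean_squared_error(
    factor * y,
    LinearRegression().fit(factor * X, factor * y).predict(factor * X),
)

print(mse_scaled / mse_original)  # ~10000, i.e. factor**2
```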

If the MSE is a metric that is to be used on a relative scale, how do I interpret it? Does an MSE of 0.002 mean that if my actual value is 0.008, my predicted value is 0.008 +/- 0.002, i.e. 0.006 or 0.010?

The MSE is not a relative measure. It is simply the mean of the squared errors, so it is expressed in the squared units of your target, which is indeed hard to interpret. You may want to look at the question "Mean absolute error OR root mean squared error?"
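One way to make it more interpretable is to take the square root: the RMSE is on the same scale as the target. Applying that to the MSE you report (this is just arithmetic on the number in your question, not a new model result):

```python
import math

mse = 2.037727147668752e-07
rmse = math.sqrt(mse)
print(rmse)  # ~0.00045: a "typical" prediction error of about 0.00045
             # for target values that lie in the 0.007-0.009 range
```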

This seems like a large difference between the actual and predicted values. Are there any specific regression models that work well for this kind of problem? Will normalizing or scaling the data help improve performance, and if so, why?

Scaling and normalizing will usually not help (except that scaling will scale the MSE, as above, which is not helpful in itself). Without knowing much more about your data, the best we can do is point you to "How to know that your machine learning problem is hopeless?"
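To make this concrete for plain linear regression: standardizing the features changes the fitted coefficients but not the fitted predictions, so the MSE and MAE are unchanged. A small sketch with the same kind of made-up synthetic data as above (the pipeline and scaler choice are just an illustration):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)
X = rng.uniform(0.007, 0.009, size=(200, 3))                   # made-up features
y = X @ np.array([0.3, 0.3, 0.4]) + rng.normal(0, 1e-4, 200)   # made-up target

plain = LinearRegression().fit(X, y)
standardized = make_pipeline(StandardScaler(), LinearRegression()).fit(X, y)

# The predictions (and therefore MSE/MAE) agree up to floating-point noise.
print(np.allclose(plain.predict(X), standardized.predict(X)))  # True
```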

I noticed that MAE remained constant regardless of the scale. Is this an absolute error measure?

This should not happen. The MAE is the mean of the absolute errors, so scaling the actuals (and therefore also the predictions) should scale the MAE by the same factor.
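If you recompute the MAE on the rescaled data, it should move by the same factor of 100; a quick check with arbitrary made-up actuals and predictions:

```python
import numpy as np
from sklearn.metrics import mean_absolute_error

actuals = np.array([0.0081, 0.0076, 0.0089])   # made-up values
preds = np.array([0.0079, 0.0078, 0.0086])

factor = 100
print(mean_absolute_error(actuals, preds))                    # ~0.00023
print(mean_absolute_error(factor * actuals, factor * preds))  # ~0.023, i.e. 100x larger
```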

What other metric can I use to evaluate the performance of my model?

This may be helpful - it's written in the context of time series forecasting, but you can apply it in other contexts, too.
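In case it is useful, scikit-learn already provides several regression metrics you can report side by side, for example MAE, RMSE, MAPE (a relative/percentage error, which may suit your question about scale), and R^2. A minimal sketch with placeholder arrays; substitute your own test-set actuals and predictions (mean_absolute_percentage_error requires a reasonably recent scikit-learn version):

```python
import numpy as np
from sklearn.metrics import (
    mean_absolute_error,
    mean_absolute_percentage_error,
    mean_squared_error,
    r2_score,
)

# Placeholder arrays; replace with your test-set actuals and predictions.
y_true = np.array([0.0081, 0.0076, 0.0089, 0.0084])
y_pred = np.array([0.0079, 0.0078, 0.0086, 0.0085])

print("MAE :", mean_absolute_error(y_true, y_pred))
print("RMSE:", np.sqrt(mean_squared_error(y_true, y_pred)))
print("MAPE:", mean_absolute_percentage_error(y_true, y_pred))
print("R^2 :", r2_score(y_true, y_pred))
```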