Standard Deviation and Bias – Understanding Root Mean Squared Error and Mean Bias Deviation

Tags: bias, standard deviation

I would like to gain a conceptual understanding of Root Mean Squared Error (RMSE) and Mean Bias Deviation (MBD). Having calculated these measures for comparisons of my own data, I've often been perplexed to find that the RMSE is high (for example, 100 kg) while the MBD is low (for example, less than 1%).

More specifically, I am looking for a reference (a book or paper rather than a website) that presents and discusses the mathematics of these measures. What is the normally accepted way to calculate them, and how should I report them in a journal article?

It would also be helpful, in the context of this post, to have a "toy" dataset that can be used to illustrate the calculation of these two measures.

For example, suppose that I measure the mass (in kg) of 200 widgets produced by an assembly line. I also have a mathematical model that attempts to predict the mass of these widgets; the model need not be empirical, and it could just as well be physically based. I compute the RMSE and the MBD between the actual measurements and the model predictions, finding that the RMSE is 100 kg and the MBD is 1%. What does this mean conceptually, and how should I interpret this result?

Now suppose instead that the experiment yields an RMSE of 10 kg and an MBD of 80%. What does this mean, and what can I say about this experiment?

What is the meaning of these measures, and what do the two of them (taken together) imply? What additional information does the MBD give when considered with the RMSE?

Best Answer

I think these concepts are easy to explain, so I will describe them here rather than point you to a reference. Many elementary statistics books cover this, including my own, "The Essentials of Biostatistics for Physicians, Nurses and Clinicians."

Think of a target with a bulls-eye in the middle. The mean squared error represents the average squared distance between where an arrow lands and the center of the target. If your arrows scatter evenly around the center, the shooter has no aiming bias and the mean squared error equals the variance.

In general, however, the arrows scatter around some point away from the center of the target. The average squared distance of the arrows from their own center is the variance; that center can be thought of as the shooter's aim point. The distance from the aim point to the center of the target is the absolute value of the bias.
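In symbols, using sample versions of the standard definitions (a sketch of the analogy, not a derivation): if the arrows land at points $x_1, \dots, x_n$, their aim point is the mean $\bar{x}$, and the bulls-eye is at $\theta$, then

$$
\text{variance} = \frac{1}{n}\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^2,
\qquad
\text{bias} = \bar{x} - \theta,
\qquad
\text{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(x_i - \theta\right)^2 .
$$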

Now think of a right triangle, where the square of the hypotenuse is the sum of the squares of the two sides. The squared distance from an arrow to the center of the target is the square of the distance from the arrow to the aim point plus the square of the distance from the aim point to the center of the target. Averaging these squared distances over all the arrows gives the mean squared error as the sum of the variance and the squared bias.
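To connect this back to the question, here is a minimal numerical sketch with a made-up toy dataset of widget masses. It assumes the common conventions that RMSE is the square root of the mean squared prediction error and that MBD is the mean of (predicted − observed), reported as a percentage of the mean observed value; conventions for MBD vary, so check the one used in your field. The script also checks the decomposition MSE = bias² + variance numerically.

```python
import numpy as np

# Hypothetical toy data: observed widget masses (kg) and model predictions (kg).
# These numbers are made up purely for illustration.
observed  = np.array([100.0, 102.0, 98.0, 105.0, 95.0, 101.0, 99.0, 103.0])
predicted = np.array([101.0, 100.0, 99.5, 107.0, 93.0, 102.5, 98.0, 104.0])

errors = predicted - observed                      # signed prediction errors (kg)

mse  = np.mean(errors ** 2)                        # mean squared error (kg^2)
rmse = np.sqrt(mse)                                # root mean squared error (kg)

bias = np.mean(errors)                             # mean bias (kg); sign shows over/under-prediction
mbd_percent = 100.0 * bias / np.mean(observed)     # one common convention: bias as % of mean observed

variance = np.var(errors)                          # variance of the errors around their own mean (kg^2)

print(f"RMSE     = {rmse:.3f} kg")
print(f"Bias     = {bias:.3f} kg  (MBD = {mbd_percent:.2f} %)")
print(f"Variance = {variance:.3f} kg^2")
print(f"bias^2 + variance = {bias**2 + variance:.3f}  vs  MSE = {mse:.3f}")
```

In a situation like the one in the question (large RMSE, MBD under 1%), the decomposition says the errors are large but roughly symmetric: positive and negative errors cancel in the mean, so nearly all of the mean squared error comes from the variance term rather than from the bias term.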