Solved – Why do we use standard deviation in a machine learning context? What is it used for?

machine learning, python, standard deviation

I have a small question. What is the point of calculating the standard deviation in a machine learning context? I know what it is:

A low standard deviation means that most of the numbers are close to the mean (average) value.

A high standard deviation means that the values are spread out over a wider range.

But what does a high or a low standard deviation actually tell us? Why does it matter for a machine learning model?

Best Answer

The standard deviation can be used to identify outliers in your data. In a simple analysis, it is not uncommon to ignore data points that are, say, 2 or 3 standard deviations from the mean. In unsupervised learning, the standard deviation (or variance) can be used for anomaly detection. For example, if a measurement of a manufactured part lies several standard deviations away from the mean of your cluster of "good" parts, you could flag the part as possibly defective.
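
As a rough illustration of the manufactured-part example, here is a minimal sketch using only Python's standard library. The function name `flag_outliers`, the measurements, and the threshold of 3 standard deviations are all made up for illustration; a real threshold depends on your data and on how costly a false alarm is.

```python
from statistics import mean, stdev

def flag_outliers(values, reference, threshold=3.0):
    """Return the values that lie more than `threshold` standard
    deviations from the mean of the `reference` sample."""
    mu = mean(reference)
    sigma = stdev(reference)  # sample standard deviation of the "good" parts
    return [v for v in values if abs(v - mu) > threshold * sigma]

# Hypothetical measurements (e.g. a diameter in mm) of parts known to be good
good_parts = [10.0, 10.1, 9.9, 10.2, 9.8, 10.0, 10.1, 9.9]

# New parts to check against the "good" cluster
new_parts = [10.05, 9.7, 11.5]

flagged = flag_outliers(new_parts, good_parts)
print(flagged)  # only 11.5 is far enough from the mean to be flagged
```

Note that 9.7 is not flagged here: it is about 2.3 standard deviations from the mean, so it would be caught with a threshold of 2 but not 3. That is the trade-off behind the "2 or 3 standard deviations" rule of thumb.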