Solved – Outlier detection function in R for known distributions

outliers, r

Assuming that I have a one-dimensional data set with a known distribution (e.g. normal, gamma, Weibull), is there an R function that I can call on the data set that will return the anomalies?

I know that anomaly detection in a known normally distributed data set is pretty straightforward, but I couldn't even find an R function for this, let alone for some of the more complicated distributions.

Best Answer

Although no doubt functions exist that may be helpful, this is just to underline that what you want is more difficult to do definitively than you imply.

Even for the case of a normal (Gaussian) distribution as the reference distribution:

  • If a data set is known to be normally distributed, there can't be anomalies.

  • If it is thought to be normally distributed except for the anomalies, telling the two apart requires estimation of the mean and standard deviation of the normal component, or something that is equivalent. There are numerous ways to do that, which means that the problem is very open-ended.
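As one illustration of the second point, here is a minimal sketch that estimates the normal component robustly with the median and the MAD and then flags points far from it. This is only one of the numerous possible choices, not a definitive method. The helper name `flag_normal_outliers` and the cutoff `k = 3` are assumptions made for this example; `median()` and `mad()` are base R (`stats`), and `mad()` already rescales by 1.4826 so that it estimates the standard deviation for normal data.

```r
# Hypothetical helper (not from any package): flag points that lie far from
# a robustly estimated normal "core".
flag_normal_outliers <- function(x, k = 3) {
  centre <- median(x)  # robust estimate of the mean of the normal component
  spread <- mad(x)     # robust estimate of its sd (default constant 1.4826)
  if (spread == 0) stop("MAD is zero; robust z-scores are undefined")
  abs((x - centre) / spread) > k  # TRUE where the robust z-score exceeds k
}

# Simulated example: 200 normal values plus three planted anomalies.
set.seed(1)
x <- c(rnorm(200, mean = 10, sd = 2), 25, 30, -8)
flags <- flag_normal_outliers(x)
x[flags]  # the planted values, possibly with a few genuine normal draws
```

A different estimator or a different cutoff will flag a different set of points, which is exactly why the problem is open-ended.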

The other examples you give (gamma, Weibull) are, loosely speaking, more difficult still.
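For a skewed reference distribution, a symmetric z-score rule no longer makes sense, but if the parameters really were known one could flag points by their tail probability under the reference CDF. The sketch below does that for a gamma distribution. The helper `tail_prob`, the parameters `shape = 2, rate = 1`, and the 0.001 threshold are all assumptions for illustration. It deliberately sidesteps the hard part, which is estimating those parameters when anomalies are present.

```r
# Hypothetical helper: two-sided tail probability of each point under a
# fully specified reference distribution, supplied as its CDF (e.g. pgamma).
tail_prob <- function(x, cdf, ...) {
  p <- cdf(x, ...)      # P(X <= x) under the reference distribution
  2 * pmin(p, 1 - p)    # small values mean the point is far out in either tail
}

# Example with assumed-known gamma parameters (shape = 2, rate = 1).
x <- c(1.2, 0.8, 2.5, 1.9, 40)
tp <- tail_prob(x, pgamma, shape = 2, rate = 1)
x[tp < 0.001]  # points that are very improbable under the reference
```

The same helper works with `pweibull` or `pnorm`, given their parameters, but the result is only as trustworthy as the assumed parameter values.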