Are there terms to distinguish between the two types of outliers?

outliers, terminology

I hate talking about "outliers," because I view that term as encompassing two entirely different concepts. The first refers to data that were incorrectly recorded or measured: for instance, a 400,000-bedroom house for 4 dollars, or a person who is 180 inches tall. There are various techniques to handle these that are outside the scope of my question.

The second type of outlier refers merely to extreme points in the distribution. These might include things like Babe Ruth's home run totals or the net worth of Bill Gates. Although these can have adverse effects on certain types of analyses, they are nevertheless legitimate data points. They also have their own techniques that I'd rather not get into here.

Are there any terms that are used to distinguish between the two cases of outliers?

Best Answer

I've come across three terms: outlier, leverage point, and influential point.

A reference for you to investigate is Chapter 11 of Applied Regression Analysis, Linear Models, and Related Methods by John Fox. This chapter discusses unusual and influential data. The figure on page 268 is rather instructive.
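
In regression diagnostics these three terms are usually tied to three quantities: leverage to the hat value (how unusual a point's predictor value is), outlyingness to the studentized residual (how unusual the response is given the predictor), and influence to something like Cook's distance (how much the fit changes without the point). Here is a minimal sketch of that distinction in Python with NumPy. The data, the added points, and the helper name `diagnostics` are all made up for illustration and are not taken from Fox's book.

```python
import numpy as np

def diagnostics(x, y):
    """Hat values, internally studentized residuals and Cook's distances
    for the simple linear regression y ~ 1 + x (illustrative helper)."""
    X = np.column_stack([np.ones_like(x), x])   # design matrix with intercept
    H = X @ np.linalg.inv(X.T @ X) @ X.T        # hat (projection) matrix
    h = np.diag(H)                              # leverage of each point
    resid = y - H @ y                           # ordinary residuals
    n, p = X.shape
    s2 = resid @ resid / (n - p)                # residual variance estimate
    r = resid / np.sqrt(s2 * (1.0 - h))         # internally studentized residuals
    cooks = r**2 * h / (p * (1.0 - h))          # Cook's distance
    return h, r, cooks

# Invented baseline: eight points lying close to the line y = 2x.
base_x = np.arange(1.0, 9.0)
noise = np.array([0.1, -0.2, 0.15, -0.1, 0.05, -0.15, 0.2, -0.05])
base_y = 2.0 * base_x + noise

# One extra point per scenario (all values are assumptions for the demo).
extra_points = {
    "outlier (unusual y, typical x)": (4.5, 30.0),
    "leverage (unusual x, on line)": (20.0, 40.0),
    "influential (unusual x and y)": (20.0, 5.0),
}

results = {}
for label, (x_new, y_new) in extra_points.items():
    x = np.append(base_x, x_new)
    y = np.append(base_y, y_new)
    h, r, cooks = diagnostics(x, y)
    # Keep only the diagnostics of the added (last) point.
    results[label] = (h[-1], r[-1], cooks[-1])
    print(f"{label:32s} hat={h[-1]:.2f}  r={r[-1]:+.2f}  cook={cooks[-1]:.2f}")
```

The first added point has a large studentized residual but low leverage, the second has high leverage but hardly moves the line, and the third combines both and dominates the fit. Note that none of these quantities tells you whether a point is a recording error or a legitimate extreme value; they only describe how the point relates to the fitted model.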
