Solved – Bonferroni for outlier detection

multiple-comparisons, outliers, time series

I am reading a book on time series analysis and I am having problems understanding the section about outlier detection.

The authors say that when you want to know whether there was an outlier at a particular time $T$, you should use a certain test statistic and a test with size less than $\alpha$. But when you don't know where an outlier could be and you have a time series of length $n$, you should apply the same test statistic to each point, using tests of size $\alpha/n$. They say that this is an application of the conservative Bonferroni correction.

I just don't understand this. Doesn't this mean that you will detect lots of outliers in short time series but miss them in long ones? After all, spam filters don't have stronger spam criteria for people with more incoming email, right?

Best Answer

Try generating some data from a normal distribution. First generate a small sample and look at the spread of the points, then add a few more points, then more, then more. You will notice that as the sample size gets bigger, you see more extreme values (potential outliers) by chance alone. If you don't adjust for multiple comparisons, you will flag many more points as significant in large samples simply because of the sample size, even when the underlying process is stable and all the data points are legitimate (inliers?).
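Here is a rough sketch of that experiment in Python. It is only an illustration under assumptions of my own: the points are independent standard normal noise with no real outliers, each point is tested two-sided against a normal critical value (the book's actual test statistic may differ), and the name `false_alarm_rate` and the trial count are made up for the example. It estimates how often a series of length $n$ gets at least one point flagged, first with every point tested at size $\alpha$ and then at size $\alpha/n$.

```python
import random
from statistics import NormalDist


def false_alarm_rate(n, per_point_alpha, trials=1000, seed=0):
    """Estimate the fraction of pure-noise series of length n in which
    at least one point is flagged as an outlier.

    Assumption for illustration: points are i.i.d. standard normal and
    each one is tested two-sided at size per_point_alpha.
    """
    rng = random.Random(seed)
    # Two-sided critical value for a standard normal test of the given size.
    crit = NormalDist().inv_cdf(1 - per_point_alpha / 2)
    flagged_series = 0
    for _ in range(trials):
        # any() stops at the first point that exceeds the critical value.
        if any(abs(rng.gauss(0.0, 1.0)) > crit for _ in range(n)):
            flagged_series += 1
    return flagged_series / trials


if __name__ == "__main__":
    alpha = 0.05
    for n in (10, 100, 1000):
        unadjusted = false_alarm_rate(n, alpha)
        bonferroni = false_alarm_rate(n, alpha / n)
        print(f"n={n:5d}  unadjusted={unadjusted:.3f}  bonferroni={bonferroni:.3f}")
```

Under these assumptions the unadjusted rate should track $1 - (1 - \alpha)^n$, which is already about 0.99 for $n = 100$, while the Bonferroni rate should stay at or below roughly $\alpha$ for every $n$. So the $\alpha/n$ rule is not meant to find fewer real outliers in long series; it keeps the chance of at least one false alarm per series roughly constant as the series grows.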
