Outliers – What to Do with Outliers? Use Capping, Removal, or Non-Parametric Tests?

Tags: anova, outliers, t-test

This is my first question on Cross Validated, and I should add that no one has ever taught me statistics; I am completely self-taught in this regard. So please forgive me if my question seems trivial.

I would like to ask about the handling of outliers. I have a table of values for different variables collected in three different trials. I wanted to use ANOVA followed by t-tests to compare the results obtained in the three trials. Unfortunately, for some of the variables I identify one or more outliers. Below is an example diagram where the outliers are marked with larger points.

[figure: boxplots of the example data, with outliers marked as larger points]

But I already know that when there are outliers in a variable, I shouldn't use ANOVA. Fine. I can reach for a non-parametric test instead, e.g., Kruskal-Wallis followed by Wilcoxon tests.

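For concreteness, this is roughly what the non-parametric route looks like in R. I'm assuming a long-format data frame `df` with a numeric column `value` and a factor `trial` (both are placeholder names for my actual columns):

```r
# Kruskal-Wallis test across the three trials
kruskal.test(value ~ trial, data = df)

# Pairwise Wilcoxon (Mann-Whitney) comparisons with a multiplicity correction
pairwise.wilcox.test(df$value, df$trial, p.adjust.method = "holm")
```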

However, I was wondering whether removing the outliers would be a better way to handle these data. My concern is that if I remove measurements from one sample, I would also have to remove the corresponding results from the other two samples, so that each vector has the same length for the ANOVA.
So I looked for a different solution. I found that in this situation one of three methods can be used: imputation, capping, or prediction. I applied capping, and the picture looks much better now.

[figure: the same data after capping the outliers]

And now my question: which path should I take? Should I use non-parametric tests? Or is it better to remove the outlier values and analyse the shortened data set? Or is it better to use capping (or another method of adjusting the data)? Which solution would be best in this case?

Any suggestions will be appreciated.

An update for everyone

To start this little update, I would like to thank everyone who took an interest in my question. Rather than writing several separate comments, I decided to put everything together in one update to the question.

Now a little more about my data. The data come from a medical experiment; I do not know the details, I only have the data. Each patient underwent three trials (NEG-CRT, NEG-NOC and NEU-NOC). In each trial, a medical device took 25 measurements of several different signals. The data shown in the graphs above are for a signal called LPP2FCz. The most important question I have to answer is whether the responses in these three trials differ in a statistically significant way.

One-way ANOVA followed by t-tests seemed like a good approach here. Unfortunately, for about 30% of the results I cannot confirm that the data come from a normal distribution (tested with the Shapiro-Wilk test), and outliers appear in some cases. Because of this, my first step was to move towards non-parametric tests.
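
For reference, this is essentially the parametric pipeline I had in mind, again assuming a long-format data frame `df` with placeholder columns `value` and `trial`:

```r
# Shapiro-Wilk normality check within each trial
by(df$value, df$trial, shapiro.test)

# One-way ANOVA across the three trials
summary(aov(value ~ trial, data = df))

# Pairwise t-tests with a multiplicity correction
pairwise.t.test(df$value, df$trial, p.adjust.method = "holm")
```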

However, when analysing the results (whether from ANOVA or Kruskal-Wallis), it is hard to speak of statistically significant differences. As @Michael Lew rightly pointed out, the NEU-NOC trial has the most outliers; but can that alone be the basis for inferring that the results of this trial differ from the other two?

Answering the question from @Sal Mangiafico: the outliers were determined automatically by the geom_boxplot function in R, and as far as I know they are the values that fall outside the range (quantile(x, .25) - 1.5*IQR(x), quantile(x, .75) + 1.5*IQR(x)).
As for the capping procedure, all lower outliers were replaced with quantile(x, .05) and all upper outliers with quantile(x, .95).
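
In code, the flagging rule and the capping I applied look roughly like this, sketched for a single numeric vector `x` (note that geom_boxplot's exact hinge calculation may differ slightly from plain quantiles):

```r
# Boxplot (Tukey) rule: flag points outside Q1 - 1.5*IQR and Q3 + 1.5*IQR
q1  <- quantile(x, 0.25)
q3  <- quantile(x, 0.75)
iqr <- IQR(x)
lower <- q1 - 1.5 * iqr
upper <- q3 + 1.5 * iqr
is_outlier <- x < lower | x > upper

# Capping: replace low outliers with the 5th percentile,
# high outliers with the 95th percentile
x_capped <- x
x_capped[x < lower] <- quantile(x, 0.05)
x_capped[x > upper] <- quantile(x, 0.95)
```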

Summing up, and taking into account all the valuable comments, my conclusions as of today are as follows.

  1. I will definitely not remove any outlier values; that would be the wrong way to go.
  2. I will stay with non-parametric tests. Artificially "forcing" the data to look as if they came from a normal distribution is not a good move.
  3. I am still undecided whether to keep the capping, or to drop it and simply draw my conclusions from the data with the outliers left in.

I am eagerly awaiting any further suggestions.

Best Answer

Generally, outliers should only be removed if there is a clear suspicion (with reasons) that they are actually wrong. The outliers don't seem particularly harmful to me here. Kruskal-Wallis makes some sense, and its p-value is about the same as from the t-test, so the overall message is pretty clear: there is no evidence of differences.

As there are several ways to deal with outliers, one can try out many things, and at some point one may find something that gives a significant p-value; but note that p-values are generally invalid if they rely on some data-dependent selection of results. As far as I'm aware, there is no basis for manipulating the outliers by capping, imputation, or "prediction" before running a t-test, so I'd leave my hands off these. The point is: if the outliers are incorrect observations, they should be removed; if they are correct, you are not going to improve your analysis by replacing them with something else that is in fact incorrect (which follows directly from the outliers being correct).

I'm not sure whether by "better" in "I did Capping and it's much better now" you mean that a significant result is better. That would be dangerous, because it means you may tend to keep on analysing until you find significance, which is then a result of wishful thinking. The fewer tests you run, the more appropriate they will be (I wouldn't quite go as far as saying that not running any test is the most appropriate thing, but some would...). If the data don't show evidence of differences, you shouldn't try to wrestle a low p-value out of them.