Solved – Some of the data is not normally distributed; what test should I use?

anova, descriptive statistics, mathematical-statistics, normality-assumption, statistical significance

When testing for normality and homogeneity of variance in SPSS, it showed this:

[Image: SPSS output showing the tests of normality and the test of homogeneity of variance]

If I go by Kolmogorov-Smirnov, then the 'M' data are not normal, but if I go by Shapiro-Wilk, all groups are normally distributed.
However, the test of homogeneity of variance shows that, based on the mean and on the trimmed mean, equal variances cannot be assumed. I know this would change which post hoc test I use for a between-groups ANOVA, but I'm not sure how it would affect my analysis if I use a Kruskal-Wallis ANOVA.

Should I use a Kruskal-Wallis ANOVA or a 1-way Between Groups ANOVA?

Thank you!

Best Answer

With such relatively small samples, I would not expect definitive results from either the Shapiro-Wilk or the Kolmogorov-Smirnov tests. Usually, the latter has poorer power than the former so I wonder why K-S (alone) finds group M data non-normal. Even though all six of the P-values for normality tests are about the same, I would want to see whether there are far outliers in any of the three groups; if not, I would not worry much about nonnormality.
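One way to look for far outliers is sketched below in R (the language used for the simulation later in this answer). The data frame `d`, its columns `y` and `grp`, and the group labels are hypothetical stand-ins for the real data, and `coef = 3` is only one common convention for "far" outliers (values more than 3 IQRs beyond the box), not a formal test.

```r
# Hypothetical stand-in data: three groups of 20 observations each
set.seed(1)
d <- data.frame(
  y   = c(rnorm(20, 50, 8), rnorm(20, 55, 4), rnorm(20, 60, 12)),
  grp = factor(rep(c("G1", "G2", "G3"), each = 20))
)

# Shapiro-Wilk P-value for each group
sw_p <- tapply(d$y, d$grp, function(v) shapiro.test(v)$p.value)

# Far outliers in each group: values more than 3 * IQR beyond the box
far_out <- tapply(d$y, d$grp, function(v) boxplot.stats(v, coef = 3)$out)

print(round(sw_p, 3))
print(far_out)
```

If `far_out` is empty for every group, modest departures from normality are unlikely to be the main concern.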

I think your main problem may be heteroscedasticity, and I would use an ANOVA procedure designed to take possibly-unequal group variances into account. You may be familiar with the Welch two-sample t test, which does not assume equal variances of the two groups. In its procedure 'oneway.test', R implements a one-way ANOVA that does not assume equal variances. (Adjustments for unequal variances are similar to those of the Welch t test.) I would use this test in preference to a Kruskal-Wallis test because that test explicitly requires populations to be of the 'same shape', which implies 'equal variances'.
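To illustrate the connection with the Welch t test, the sketch below uses hypothetical two-group data (the names `y1`, `y2`, `y`, and `grp` are made up for this example). With only two groups, the F statistic from 'oneway.test' should equal the square of the Welch t statistic, with matching denominator df and P-value.

```r
# Hypothetical two-group data with unequal variances and unequal sizes
set.seed(42)
y1  <- rnorm(15, 100, 10)
y2  <- rnorm(25, 108, 20)
y   <- c(y1, y2)
grp <- factor(rep(c("g1", "g2"), times = c(15, 25)))

welch_t <- t.test(y1, y2)        # Welch t test (var.equal = FALSE by default)
welch_F <- oneway.test(y ~ grp)  # Welch one-way ANOVA (var.equal = FALSE by default)

# Compare the two procedures side by side
print(c(t_squared = unname(welch_t$statistic)^2,
        F         = unname(welch_F$statistic)))
print(c(t_df  = unname(welch_t$parameter),
        F_ddf = unname(welch_F$parameter[2])))
print(c(t_p = welch_t$p.value, F_p = welch_F$p.value))
```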

I do not know whether SPSS has implemented a one-way ANOVA procedure that does not require homoscedasticity.

The following normal data are simulated (in R) to have relatively modest differences among group means and markedly different group variances.

set.seed(2020)  # for reproducibility
a = rnorm(20, 100, 10)
b = rnorm(20, 105, 5)
c = rnorm(20, 112, 15)
x = c(a,b,c)
g = as.factor(rep(1:3, each=20))

boxplot(x ~ g, col="skyblue2")

[Image: side-by-side boxplots of x for groups 1, 2, and 3]
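As a numerical companion to the boxplots, the sketch below rebuilds the same simulated data and tabulates group sizes, means, and standard deviations. The data frame name `group_summary` is introduced here for illustration, and Bartlett's test is a swapped-in check of equal variances (appropriate here only because the data are normal by construction); it is not the Levene-type test that SPSS reports.

```r
# Same simulation as above, generated in the same order
set.seed(2020)
x <- c(rnorm(20, 100, 10), rnorm(20, 105, 5), rnorm(20, 112, 15))
g <- as.factor(rep(1:3, each = 20))

# Per-group sample size, mean, and standard deviation
group_summary <- data.frame(
  n    = as.vector(tapply(x, g, length)),
  mean = as.vector(tapply(x, g, mean)),
  sd   = as.vector(tapply(x, g, sd)),
  row.names = levels(g)
)
print(round(group_summary, 2))

# Bartlett's test of homogeneity of variances (sensitive to non-normality)
bt <- bartlett.test(x ~ g)
print(bt)
```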

The "Welchified" one-way ANOVA test finds significant differences among groups at about the 2% level of significance. (In a standard one-way ANOVA the denominator df would be 57; here ddf are about 31, adjusting for heteroscedasticity.)

oneway.test(x ~ g)

        One-way analysis of means (not assuming equal variances)

data:  x and g
F = 4.5939, num df = 2.000, denom df = 31.383, p-value = 0.01779
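To show where the adjusted denominator df come from, the sketch below recomputes the Welch F statistic by hand from the group sizes, means, and variances, and compares it with 'oneway.test'. The variable names (`w`, `mw`, `lam`, `F_welch`, `ddf`) are introduced here for illustration.

```r
# Same simulation as above, generated in the same order
set.seed(2020)
x <- c(rnorm(20, 100, 10), rnorm(20, 105, 5), rnorm(20, 112, 15))
g <- as.factor(rep(1:3, each = 20))

k <- nlevels(g)
n <- tapply(x, g, length)   # group sizes
m <- tapply(x, g, mean)     # group means
v <- tapply(x, g, var)      # group variances

w   <- n / v                                # weights: precision of each group mean
mw  <- sum(w * m) / sum(w)                  # precision-weighted grand mean
lam <- sum((1 - w / sum(w))^2 / (n - 1))    # term driving the df adjustment

F_welch <- (sum(w * (m - mw)^2) / (k - 1)) /
           (1 + 2 * (k - 2) * lam / (k^2 - 1))
ddf     <- (k^2 - 1) / (3 * lam)            # adjusted denominator df
p_val   <- pf(F_welch, k - 1, ddf, lower.tail = FALSE)

print(c(F = F_welch, num_df = k - 1, denom_df = ddf, p = p_val))

ow <- oneway.test(x ~ g)    # should agree with the hand computation
print(ow)
```

The less equal the group variances (relative to group sizes), the further `ddf` falls below the 57 of the standard ANOVA.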

Ad hoc Welch two-sample t tests show groups A and B to differ at the 2% level (so, of course, A and C differ also). There is no significant difference between B and C. According to the Bonferroni method of protecting against false discovery, it is reasonable to conclude that A differs from B and C.
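One way to run all three comparisons at once is R's 'pairwise.t.test'. In the sketch below, `pool.sd = FALSE` makes each comparison a separate Welch t test, and the Bonferroni adjustment multiplies each P-value by the number of comparisons (capped at 1), so adjusted values can be compared directly with 0.05. The object name `pw` is introduced here for illustration, and groups A, B, C appear as levels 1, 2, 3.

```r
# Same simulation as above, generated in the same order
set.seed(2020)
x <- c(rnorm(20, 100, 10), rnorm(20, 105, 5), rnorm(20, 112, 15))
g <- as.factor(rep(1:3, each = 20))

# Pairwise Welch t tests with Bonferroni-adjusted P-values
pw <- pairwise.t.test(x, g, pool.sd = FALSE, p.adjust.method = "bonferroni")
print(pw)

# Unadjusted Welch P-value for groups 1 and 2, for comparison
raw_12 <- t.test(x[g == 1], x[g == 2])$p.value
print(raw_12)
```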

Perhaps your data are sufficiently similar to my simulated data that your data can be profitably analyzed using the methods I show above.
