Solved – One-Way ANOVA – Interpreting the results


I'm trying to better understand ANOVA. I ran a simple one-way ANOVA in R, as follows:

Year represents the year a student is in college (Freshman, Sophomore or Junior). Score represents their score on a test. Is the following interpretation correct?

Since the p-value of the ANOVA is .756, we can conclude there is no difference in the means of the three groups of the Year factor and Year does not significantly impact the mean Score.

This sounds correct to me but I'm a little unsure how to fully interpret one-way results.

Best Answer

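In a one-way ANOVA, the hypotheses being tested are:

H0: b.Freshman = b.Sophomore = b.Junior = 0

H1: at least one b differs from 0

(b stands for the group effect, i.e. the deviation of that group's mean from the overall mean, so H0 is equivalent to saying that all three group means are equal.)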

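So, basically, your result means that the variance between groups is small relative to the variance within groups, and hence group membership cannot be a good explanation of the overall variance in the dataset. Strictly speaking, a p-value of 0.756 means you fail to reject H0: the data give no evidence of a difference in mean Score between years, which is not the same as showing that the means are equal.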

ANOVA stands for analysis of variance. Unlike a regression model, it does not focus on estimating the coefficients, but rather gives a simple answer to the question "is there any significant difference between the groups?" or, in other words, "how much of the total variance in the dataset can be explained by dividing the data into the given groups?"
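To make the variance-partitioning idea concrete, here is a minimal sketch in R. The data are simulated (the original data and code are not shown in the question), so the names dat, Year and Score and all numbers are assumptions for illustration only. It fits a one-way ANOVA with aov() and then recomputes the F statistic by hand from the between-group and within-group sums of squares.

```r
# Simulated, hypothetical data: 3 groups of 20 students with the same true mean
set.seed(1)
dat <- data.frame(
  Year  = factor(rep(c("Freshman", "Sophomore", "Junior"), each = 20)),
  Score = rnorm(60, mean = 70, sd = 10)
)

# One-way ANOVA and its table (Df, Sum Sq, Mean Sq, F value, Pr(>F))
fit <- aov(Score ~ Year, data = dat)
tab <- summary(fit)[[1]]

# Manual decomposition of the total sum of squares
grand_mean  <- mean(dat$Score)
group_means <- ave(dat$Score, dat$Year)            # group mean for each row
ss_between  <- sum((group_means - grand_mean)^2)   # explained by Year
ss_within   <- sum((dat$Score - group_means)^2)    # left over within groups

k <- nlevels(dat$Year)   # number of groups
n <- nrow(dat)           # number of observations

# F = between-group mean square / within-group mean square
f_manual  <- (ss_between / (k - 1)) / (ss_within / (n - k))
p_manual  <- pf(f_manual, k - 1, n - k, lower.tail = FALSE)

# Share of the total variance explained by the grouping
r_squared <- ss_between / (ss_between + ss_within)

print(tab)
print(c(F = f_manual, p = p_manual, R2 = r_squared))
```

The hand-computed F and p-value should match the first row of the aov() table. A small F (and a small explained share) corresponds to the situation described above: the grouping accounts for little of the overall variance.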

Related Solutions

Solved – Interpreting 5-way Mixed Model ANOVA

Generally, you should start from the highest order interactions. You are probably aware that it is usually not sensible to interpret a main effect A when that effect is also involved in an interaction A:B. This is because the interaction tells you that the effect of A actually depends on the level of B, rendering any simple main effect interpretation of A impossible. In the same way, if you have factors A, B, C, then A:B should not be interpreted if A:B:C is significant.

Thus, when you have a 5-way interaction, none of the lower-order interactions can be sensibly interpreted. Therefore, if I understand you correctly and you have interpreted your lower order interactions, you should probably not continue along those lines.

Rather, what you can do is to split up your data set and continue to analyze factor levels of your data set separately. Which of the factors you use to split up the dataset is arbitrary, but often it is very useful to split up the data for each variable and assess what you see. In your example, you might start with sex, and calculate an ANOVA for males, and another one for females (each ANOVA contains the 4 remaining factors). Just as well, you could split up the data according to ethnicity (one ANOVA for Asian, one for Caucasian). You could also split up by one of the within-subject factors.

I will assume that you have decided to split the data by sex (just to continue with the example here). Then, assume that for males, you get a 4-way interaction. You would then go on to split up the male data by one of the remaining variables (say, ethnicity). You would then calculate ANOVAs for male Asians (over the remaining 3 factors), and for male Caucasians.
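The split-and-follow-up idea can be sketched in R roughly as below. This is a simplified illustration, not the original poster's analysis: the data are simulated, the names d, y and cond are hypothetical, and only between-subject factors are used. A real mixed design with within-subject factors would additionally need an Error() term in aov() or a mixed model.

```r
# Simulated, hypothetical data: 2 x 2 x 2 between-subject design, 10 per cell
set.seed(2)
d <- expand.grid(
  sex       = c("male", "female"),
  ethnicity = c("Asian", "Caucasian"),
  cond      = c("a", "b"),
  rep       = 1:10
)
d$y <- rnorm(nrow(d))

# Split the data by one factor (here: sex) ...
by_sex <- split(d, d$sex)

# ... and run a follow-up ANOVA over the remaining factors in each subset
follow_up <- lapply(by_sex, function(sub) {
  summary(aov(y ~ ethnicity * cond, data = sub))[[1]]
})

print(follow_up$male)
print(follow_up$female)
```

Each element of follow_up is the ANOVA table for one level of the splitting factor, so the decision about which further split to make can be taken separately for each subset.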

Importantly, if you get only a lower-order interaction, then you are only "allowed" to analyze these further. This is because the other factors did not show significant differences. Thus, if your males ANOVA gives you only a 2-way interaction, then you would average over the other factors and calculate only an ANOVA over the 2 interacting factors (and, because we are in the male part of the ANOVAs, this would be for the males alone).

For the females, everything may look different, and so the decision which follow-up ANOVAs to calculate is separate for this group. So, what you did for males should be done for females in the same way ONLY if you got the same interactions.

Thus, you will potentially have a lot of ANOVAs, and it might not be easy to decide which ones to report. You should report 1 complete line down from the highest interaction to the last effects (possibly t-tests to compare only 1 of your factors at the end). You should not usually report several lines (e.g., one starting the split-up by sex, then another one starting by ethnicity). However, you must report a complete line, and cannot simply choose to report only some of the ANOVAs of that line. So, you report one complete analysis, not more, not less. Which way to go in terms of splitting up / follow-up ANOVA is a subjective decision (unless you have clear hypotheses you can follow), and might depend on which results can be understood best etc.

Solved – Running several one-way ANOVA tests on different groups of the same data without inflating type I error

Running the full model with the interaction will be informative, as it will tell you whether the performance across the three industries differs between the three countries. This, together with plots of the data, will tell you whether it would be interesting to do post-hoc tests/contrasts, which need to be corrected to adjust for the additional error.

You could do this in R as follows:

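library(effects)  # provides effect() for plotting fitted effects
# Full model with the industry-by-location interaction
lm1 <- lm(performance ~ industry * location, data = DATA)
# Reduced model with main effects only
lm2 <- lm(performance ~ industry + location, data = DATA)
# F test of the interaction: compares the two nested models
anova(lm1, lm2)
plot(effect("industry*location", lm1))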

The anova and the plot suggest that the pattern of performance across the industries does not differ between the three countries, i.e. there is no significant interaction (for this random data example):

Model 1: performance ~ industry * location
Model 2: performance ~ industry + location
  Res.Df    RSS Df Sum of Sq      F Pr(>F)
1    111 104.08                           
2    115 108.19 -4   -4.1008 1.0933 0.3635

Testing the three countries separately is easy with the phia package, which will automatically adjust the p-values for doing multiple tests. For example, to determine whether industry differs within each country, you can do:

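library(phia)  # provides contrastCoefficients() and testInteractions()
# One contrast per country, plus one contrast across the industries
custom.contr <- contrastCoefficients(location ~ USA, location ~ Mexico,
                location ~ Canada, industry ~ Basics - Financials - Energy,
                data=DATA, normalize=TRUE)
names(custom.contr$location) <- c("USA", "Mexico","Canada")
names(custom.contr$industry) <- c("industry")
# Test the industry contrast within each country (Holm-adjusted p-values)
testInteractions(lm1,custom=custom.contr)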

This shows that the industry contrast is not significant in any of the three countries:

F Test: 
P-value adjustment method: holm
                     Value  Df Sum of Sq      F Pr(>F)
   USA : industry  0.53426   1     1.713 1.5093 0.6655
Mexico : industry -0.04385   1     0.035 0.0305 0.8617
Canada : industry  0.29764   1     1.063 0.9369 0.6703
Residuals                  111   125.949
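For comparison, the same idea of running one test per country and then correcting for multiple testing can be sketched with base R only, using p.adjust() with the Holm method. This is an illustration on simulated data (DATA here is generated on the spot, not the data from the question), and it differs from the phia approach above in that each country gets its own one-way model with its own residual variance instead of sharing the residuals of the full model.

```r
# Simulated, hypothetical data: 3 industries x 3 countries, 13 per cell
set.seed(3)
DATA <- expand.grid(
  industry = c("Basics", "Financials", "Energy"),
  location = c("USA", "Mexico", "Canada"),
  rep      = 1:13
)
DATA$performance <- rnorm(nrow(DATA))

# One one-way ANOVA of performance on industry per country;
# keep only the p-value of the industry effect
raw_p <- sapply(split(DATA, DATA$location), function(sub) {
  anova(lm(performance ~ industry, data = sub))[["Pr(>F)"]][1]
})

# Holm correction keeps the family-wise type I error at the nominal level
adj_p <- p.adjust(raw_p, method = "holm")

print(rbind(raw = raw_p, holm = adj_p))
```

The adjusted p-values are never smaller than the raw ones, which is how the correction guards against an inflated type I error when several tests are run on the same data.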
