ANOVA Analysis – Understanding the Significance of P-Values in Python

anova, f-test, p-value, python, scipy

I am working with sample data for three different groups and the time each spent in line at Store B. The groups are A, B, and C, and I want to determine whether those three groups' mean time spent in line is equal or not. I believe this calls for an ANOVA, and below is my thought process for how I implemented it in Python.

import numpy as np
import pandas as pd
from scipy.stats import f_oneway 
Null Hypothesis

𝐻0 : The mean times spent in line at Store B for A, B, and C are equal.

Alternative Hypothesis

𝐻𝑎 : At least one of the mean times spent in line at Store B for A, B, and C differs from the others.

Significance level: alpha = 0.05

Values for my samples:
A = [7.13, 8.73, 3.65, 7.02, 4.39, 9.49, 5.41, 7.16, 3.91, 10.50, 5.65, 5.08, 7.46, 6.71, 8.47, 5.86]

B = [5.25, 10.71, 6.03, 7.81, 5.37, 8.30, 6.01, 6.79, 7.27, 5.42, 9.12, 4.68, 5.26, 3.68, 3.30, 5.40, 4.94]

C = [4.40, 4.75, 5.86, 6.27, 6.18, 1.65, 7.16, 7.23, 8.08, 6.47, 6.41, 6.70, 3.88, 5.74, 5.15, 7.07, 6.20]
# Calculating the test statistic and p-value
test_statistic, p_value = f_oneway(A, B, C)
print("tstat = ", test_statistic, ", p-value = ", p_value)
tstat =  0.8543992770006822 , p-value =  0.43204138694325955
# How does p_value compare to alpha?
alpha = 0.05  # significance level chosen above
if p_value < alpha:
    print("The p-value is less than alpha, so the null hypothesis IS rejected.")
else:
    print("The p-value is greater than alpha, so the null hypothesis IS NOT rejected")
The p-value is greater than alpha, so the null hypothesis IS NOT rejected

As far as I can tell, I implemented everything correctly, but I don't understand how the p-value ended up being so high. When you take the means of A, B, and C, they aren't equal to each other (a quick check is sketched below). I'm not sure where the gap in my understanding of ANOVA is. Would anyone be able to explain why it makes sense that the null hypothesis IS NOT rejected?
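For reference, here is a quick sketch (using only numpy, which is already imported above) of how I compared the sample means and spreads:

# Sample mean and standard deviation for each group
for name, group in [("A", A), ("B", B), ("C", C)]:
    print(name, "mean =", round(np.mean(group), 2),
          "std =", round(np.std(group, ddof=1), 2))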

Best Answer

It looks like the means of your data are different, just not different enough to be unlikely under the null hypothesis. I can confirm that R returns the same results with your data, and eyeballing the boxplots below shows that the variance between groups is about the same as the variance within each group, producing an intermediate p-value (a worked calculation of the mean squares is sketched after the ANOVA table further down):

# Convert raw data to R format
A = c(7.13, 8.73, 3.65, 7.02, 4.39, 9.49, 5.41, 7.16, 3.91, 10.50, 5.65, 5.08, 7.46, 6.71, 8.47, 5.86)

B = c(5.25, 10.71, 6.03, 7.81, 5.37, 8.30, 6.01, 6.79, 7.27, 5.42, 9.12, 4.68, 5.26, 3.68, 3.30, 5.40, 4.94)

C = c(4.40, 4.75, 5.86, 6.27, 6.18, 1.65, 7.16, 7.23, 8.08, 6.47, 6.41, 6.70, 3.88, 5.74, 5.15, 7.07, 6.20)
boxplot(A, B, C)

[Boxplots of the waiting times for groups A, B, and C]
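If you would rather stay in Python, a minimal matplotlib sketch (matplotlib is not used anywhere in the question, so this is an assumed dependency) produces the equivalent plot from your original lists:

import matplotlib.pyplot as plt

# Side-by-side boxplots of the three groups' waiting times
plt.boxplot([A, B, C], labels=["A", "B", "C"])
plt.ylabel("Time in line at Store B")
plt.show()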

Here's the R code I used:

library(tidyverse)
rbind(
  cbind("A", A),
  cbind("B", B),
  cbind("C", C)
) %>%
  as.data.frame() %>%
  set_names(c("group", "value")) %>%
  mutate(value=as.numeric(value)) %>%
  aov(formula = value~group) %>%
  summary()

And the output I get:

            Df Sum Sq Mean Sq F value Pr(>F)
group        2   5.68   2.838   0.854  0.432
Residuals   47 156.10   3.321               
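To see where that F value comes from, here is a hand computation of the one-way ANOVA decomposition in Python (a sketch using numpy and the original lists A, B, and C; not part of the original post):

import numpy as np

# A, B, C are the Python lists from the question
groups = [A, B, C]
all_values = np.concatenate(groups)
grand_mean = all_values.mean()

# Between-group sum of squares: group means around the grand mean
ss_between = sum(len(g) * (np.mean(g) - grand_mean) ** 2 for g in groups)
# Within-group sum of squares: observations around their own group mean
ss_within = sum(((np.asarray(g) - np.mean(g)) ** 2).sum() for g in groups)

df_between = len(groups) - 1               # 2
df_within = len(all_values) - len(groups)  # 47

ms_between = ss_between / df_between       # about 2.84 ("Mean Sq" for group)
ms_within = ss_within / df_within          # about 3.32 ("Mean Sq" for Residuals)
f_stat = ms_between / ms_within            # about 0.85, the same F as f_oneway

print(ms_between, ms_within, f_stat)

Because the between-group mean square is roughly the same size as the within-group mean square, F is close to 1, which is entirely plausible under the null hypothesis and is why the p-value is so large.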