Reading box-and-whisker plots: possible to glean significant differences between groups?

anova, boxplot, data visualization

Suppose we're looking at this box-and-whisker plot:

[Figure: box-and-whisker plot of time slept, by day of the week]

Between Thursday and Friday, I think most would agree there seems to be a significant difference in time slept. Is that a statistically valid conjecture, though? Can we infer a significant difference from the fact that the interquartile ranges for Thursday and Friday do not overlap? What about the fact that Thursday's upper whisker and Friday's lower whisker do overlap? Does that affect our analysis?

Usually accompanying a chart like this would be some sort of ANOVA, but I'm just curious how much we can say about differences between groups simply by looking at a boxplot.

Best Answer

No, you can't. If you knew the sample sizes and had a lot of experience, you might be able to guess, but the accuracy of that guess would depend on the sample size as well as the effect size. With N = 1,000,000 per group, a difference like this would be overwhelmingly significant. With N = 10 per group, not so much. At 100 per group, it's harder to guess.
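To make the sample-size point concrete, here is a rough simulation sketch. Everything in it is an assumption for illustration: the means (7.0 and 7.5 hours), the standard deviation, the group sizes, and the choice of a permutation test on the difference in means rather than the ANOVA mentioned in the question. The two groups come from the same pair of distributions each time, so their box plots would look broadly alike, yet the p-value can change a lot with n.

```python
import random
import statistics


def permutation_p_value(a, b, n_perm=500, rng=None):
    """Two-sided permutation p-value for a difference in group means."""
    if rng is None:
        rng = random.Random(0)
    observed = abs(statistics.fmean(a) - statistics.fmean(b))
    pooled = list(a) + list(b)
    n_a = len(a)
    n_b = len(pooled) - n_a
    total = sum(pooled)
    hits = 0
    for _ in range(n_perm):
        # Reassign group labels at random and recompute the mean difference.
        rng.shuffle(pooled)
        sum_a = sum(pooled[:n_a])
        diff = abs(sum_a / n_a - (total - sum_a) / n_b)
        if diff >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)


def simulate(n, rng):
    """Draw n made-up sleep durations (hours) for each of two days."""
    thursday = [rng.gauss(7.0, 1.0) for _ in range(n)]
    friday = [rng.gauss(7.5, 1.0) for _ in range(n)]
    return thursday, friday


rng = random.Random(42)
p_values = {}
for n in (10, 100, 1000):
    thursday, friday = simulate(n, rng)
    p_values[n] = permutation_p_value(thursday, friday, rng=rng)
    print(f"n = {n:>4} per group: p = {p_values[n]:.3f}")
```

The underlying effect is identical in all three runs; only the amount of data differs. That is the information a box plot drawn without sample sizes leaves out.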

I'd argue that that is a good thing. The thing to do with a box plot is not to try to guess at statistical significance but to look at what's going on and try to reason about it. Hmm. More sleeping on weekends. That's interesting but not really surprising. We could model hours of sleep as a function of weekend vs. not. Or we could try to see if this pattern varied. Maybe retired people don't have this pattern? What about shift workers? People who work on the weekends? People who work 7 days a week?
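As a minimal sketch of the "weekend vs. not" model, assuming invented data and assuming that Saturday and Sunday count as the weekend: a least-squares fit of hours slept on a 0/1 weekend indicator. The records, the `WEEKEND` set, and the variable names are all hypothetical. With a single binary predictor, the intercept is the weekday mean and the slope is the weekend-minus-weekday difference in means.

```python
import statistics

# Hypothetical (day, hours slept) records; the numbers are invented.
records = [
    ("Mon", 6.8), ("Tue", 7.1), ("Wed", 6.9), ("Thu", 6.5), ("Fri", 7.0),
    ("Sat", 8.2), ("Sun", 8.0), ("Mon", 7.2), ("Thu", 6.7), ("Sat", 8.4),
]
WEEKEND = {"Sat", "Sun"}  # assumption: Friday is treated as a weekday

# Indicator predictor: 1.0 for weekend days, 0.0 otherwise.
x = [1.0 if day in WEEKEND else 0.0 for day, _ in records]
y = [hours for _, hours in records]

# Closed-form simple least squares: slope = cov(x, y) / var(x).
x_bar = statistics.fmean(x)
y_bar = statistics.fmean(y)
slope = (
    sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    / sum((xi - x_bar) ** 2 for xi in x)
)
intercept = y_bar - slope * x_bar

print(f"weekday mean (intercept): {intercept:.2f} hours")
print(f"weekend effect (slope):   {slope:+.2f} hours")
```

From here, the follow-up questions (retirees, shift workers, people who work weekends) would amount to adding further predictors or interactions to the same kind of model.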

As my favorite professor in grad school (Herman Friedman) used to say: "Stop p-ing on the research!"