Solved – Simpson’s Paradox vs. Berkson’s paradox

simpsons-paradox

Can someone explain the difference between the two? They seem identical to me. In both paradoxes you start from a narrow distribution and find that the correlation switches when you move to the full distribution. So what's the actual difference?

Current Answers Review:

  • Mickybo Yakari points out that Berkson's paradox relates to the (potentially wrongful) sampling of the data, while Simpson's paradox does not relate to sampling risks but rather to the analysis of the data (conditioning on some variable or not).

  • Acccumulation makes the same distinction between selection bias (Berkson's) and categorization bias (Simpson's), and claims that Berkson's can be viewed as a subset of Simpson's.

  • Noah introduces the notion of an underlying "truth". In Simpson's, conditioning (or categorizing) on a confounder variable reveals the truth, and failing to do so is confounding; in Berkson's, conditioning (or sampling) on a collider variable hides the truth.

Best Answer

Both Simpson's paradox and Berkson's paradox are statistical phenomena in which a surprising disparity is observed, but they differ in why that disparity arises. Let's describe each in a few words and determine how they differ.

Simpson's paradox is a statistical phenomenon in which a trend between two variables occurs in several different groups of data, formed according to the values taken on by a conditioning variable, but disappears or reverses when the groups are combined. The disparity lies between the disaggregation-based conclusion and the aggregation-based conclusion. It is not caused by a lack of data within any partition subset, but rather by the relative sizes of the partition subsets (a matter of calculus of proportions).
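To make the "calculus of proportions" point concrete, here is a minimal sketch using the kidney-stone success rates often quoted for this paradox (treatment and group labels are just illustrative): treatment A wins inside every subgroup, yet loses once the subgroups are pooled, purely because of the unequal subgroup sizes.

```python
# Simpson's paradox: A beats B in each stone-size group, B beats A overall.
# (successes, total) per treatment and stone size.
data = {
    "A": {"small": (81, 87),   "large": (192, 263)},
    "B": {"small": (234, 270), "large": (55, 80)},
}

def rate(successes, total):
    return successes / total

for size in ("small", "large"):
    a, b = rate(*data["A"][size]), rate(*data["B"][size])
    print(f"{size} stones: A={a:.2f}  B={b:.2f}  -> A better: {a > b}")

# Aggregating over stone size reverses the ranking, because A was mostly
# tried on the hard (large-stone) cases and B on the easy ones.
a_all = rate(81 + 192, 87 + 263)   # 273/350
b_all = rate(234 + 55, 270 + 80)   # 289/350
print(f"overall:      A={a_all:.2f}  B={b_all:.2f}  -> A better: {a_all > b_all}")
```

Nothing is mis-measured here; the reversal is driven entirely by how the 350 patients per treatment are split across the two subgroups.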

Berkson's paradox arises from the fact that the sample is collected in such a way that some individuals of the population (characterised by a conditioning variable) are less likely to be selected than others.

In

Pearl, J. (2013), Linear Models: A Useful "Microscope" for Causal Analysis, Journal of Causal Inference, 1(1), 155-170,

using the language of graphical models, the author explains:

Selection bias is symptomatic of a general phenomenon associated with conditioning on collider nodes[…] The phenomenon involves spurious associations induced between two causes upon observing their common effect, since any information refuting one cause should make the other more probable. It has been known as Berkson Paradox (Berkson, 1946), "explaining away" (Kim and Pearl, 1983) or simply "collider bias".

This is problematic because it may then turn out that, due to the conditioning variable and the entailed biased sampling, the sample accurately represents a certain subset of the population but not the whole population.
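A toy simulation can illustrate the collider effect described in the quote above (the variable names and the admission rule are made up for illustration): two independent traits become negatively correlated once we restrict the sample to their common effect.

```python
import numpy as np

# Berkson's paradox sketch: x and y are independent in the population,
# but conditioning on their common effect (being "admitted") induces a
# spurious negative association ("explaining away").
rng = np.random.default_rng(0)
n = 100_000
x = rng.normal(size=n)   # e.g. exam score
y = rng.normal(size=n)   # e.g. interview score, independent of x

full = np.corrcoef(x, y)[0, 1]   # near zero in the full population

admitted = x + y > 1.5           # selection on the common effect (a collider)
sel = np.corrcoef(x[admitted], y[admitted])[0, 1]   # clearly negative

print(f"full population: r = {full:+.3f}")
print(f"admitted only:   r = {sel:+.3f}")
```

Among the admitted, a low exam score can only be "explained" by a high interview score, which is exactly the spurious association Pearl describes; the admitted sample represents itself faithfully but not the whole population.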

Here is an additional paper to further one's understanding of both paradoxes:

Pearl, J. (2014), Understanding Simpson's Paradox, The American Statistician, 68(1), 8-13.

The author notes that Simpson himself observed that, depending on the story behind the data, the "more sensible" conclusion (Simpson's words) is sometimes the one supported by the disaggregated analysis and sometimes the one supported by the aggregated analysis. He provides Simpson's classical examples of both.