Solved – Do all observations arise from probability distributions

Tags: distributions, philosophical, probability

Below is a quote about Karl Pearson from the book “The Lady Tasting Tea: How Statistics Revolutionized Science in the Twentieth Century” by David Salsburg:

Over a hundred years ago, Karl Pearson proposed that all observations arise from probability distributions and that the purpose of science is to estimate the parameters of those distributions. Before that, the world of science believed that the universe followed laws, like Newton’s laws of motion, and that any apparent variations in what was observed were due to errors. Gradually, Pearson’s view has become the predominant one.

My question is about the use of the word “observation”. Does the above quote imply that any data we collect or observe in nature, physics, or experiments arise from a probability distribution? What about a deterministic process, which surely is not probabilistic? Any expansion of the above quote for a layperson would be very helpful.

Best Answer

Statistics is concerned with phenomena that can be considered random. Even if you are studying a deterministic process, measurement noise can make the observations random. We can simplify many problems by using simple models that treat all the unobserved factors as “random noise”. For example, the linear regression model

$$ \mathsf{height}_i = \alpha + \beta \,\mathsf{age}_i + \varepsilon_i $$

says that we model height as a function of age and treat whatever else could influence it as “random noise”. It doesn't say that we consider height to be completely “random” in the sense of “chaotic”, “unpredictable”, etc. For another example, if you toss a coin, the outcome is deterministic and depends only on the laws of physics, but so many factors influence it and contribute to its chaotic behavior that we may as well treat it as a random process.
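To make the regression example concrete, here is a minimal simulation sketch. The parameter values (`TRUE_ALPHA`, `TRUE_BETA`, `NOISE_SD`), the age range, and the helper names `simulate_heights` and `fit_line` are all made up for illustration; they do not come from any real data set. The noise is assumed to be Gaussian, which is a common modeling choice rather than something the model above requires.

```python
import random


def simulate_heights(ages, alpha, beta, noise_sd, rng):
    """Generate heights as alpha + beta * age plus Gaussian 'random noise'."""
    return [alpha + beta * age + rng.gauss(0.0, noise_sd) for age in ages]


def fit_line(x, y):
    """Ordinary least squares with one predictor; returns (intercept, slope)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical "true" parameters (assumptions for this sketch only).
TRUE_ALPHA, TRUE_BETA, NOISE_SD = 80.0, 6.0, 5.0

rng = random.Random(0)  # fixed seed so the run is reproducible
ages = [rng.uniform(2, 16) for _ in range(500)]
heights = simulate_heights(ages, TRUE_ALPHA, TRUE_BETA, NOISE_SD, rng)

alpha_hat, beta_hat = fit_line(ages, heights)
print(f"estimated alpha = {alpha_hat:.2f}, estimated beta = {beta_hat:.2f}")
```

The systematic part of the model (the line) is recovered approximately from the data, while everything the model does not explain is absorbed into the noise term. Nothing here claims that height is "really" random; the noise term is just a convenient way to summarize unobserved influences.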

If you have a deterministic process and noiseless measurements of all the relevant data, you don't need statistics for it. You need other mathematics, for example calculus, but not statistics. If you do need to account for noise and assume randomness, then statistics is the tool for that. Nothing “arises” from probability distributions; they are only mathematical tools we use to model real-world phenomena.
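The contrast between noiseless and noisy measurements of a deterministic process can be sketched with a toy example. The choice of free fall, $d = \tfrac{1}{2} g t^2$, the measurement times, and the noise level of 0.05 m are assumptions picked for illustration; they are not part of the original answer.

```python
import random

G = 9.81  # gravitational acceleration in m/s^2, used as the "true" value


def fall_distance(t, g=G):
    """Deterministic law: distance fallen from rest after t seconds."""
    return 0.5 * g * t ** 2


times = [0.5, 1.0, 1.5, 2.0]

# Noiseless measurements: algebra alone recovers g from any single observation.
exact = [fall_distance(t) for t in times]
g_from_exact = [2 * d / t ** 2 for t, d in zip(times, exact)]

# Noisy measurements (hypothetical Gaussian error, sd = 0.05 m):
# every single observation now implies a slightly different g.
rng = random.Random(1)
noisy = [d + rng.gauss(0.0, 0.05) for d in exact]
g_from_noisy = [2 * d / t ** 2 for t, d in zip(times, noisy)]

# Least squares through the origin for d = g * x with x = 0.5 * t^2
# combines all the noisy observations into one estimate.
x = [0.5 * t ** 2 for t in times]
g_hat = sum(xi * di for xi, di in zip(x, noisy)) / sum(xi * xi for xi in x)

print("g from exact data:", g_from_exact)
print("g from noisy data:", g_from_noisy)
print("least-squares estimate of g:", g_hat)
```

With exact data the law is simply inverted and no estimation is involved. Once measurement error enters, the underlying process is still deterministic, but the observations are modeled as random and a statistical estimate replaces the exact calculation.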
