What's the difference between probability and statistics, and why are they studied together?
Tags: mathematical-statistics, probability, teaching
Related Solutions
I have hesitated to wade into this discussion, but because it seems to have gotten sidetracked over a trivial issue concerning how to express numbers, maybe it's worthwhile refocusing it. A point of departure for your consideration is this:
A probability is a hypothetical property. Proportions summarize observations.
A frequentist might rely on laws of large numbers to justify statements like "the long-run proportion of an event [is] its probability." This supplies meaning to statements like "a probability is an expected proportion," which otherwise might appear merely tautological. Other interpretations of probability also lead to connections between probabilities and proportions but they are less direct than this one.
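The frequentist connection between proportions and probabilities can be seen in a minimal simulation (hypothetical code; the function name and seed are illustrative). The observed proportion is a summary of data, while the probability `p` is the hypothetical property it estimates; by the law of large numbers the gap between them tends to shrink as the number of trials grows.

```python
import random

def observed_proportion(p, n, seed=0):
    """Simulate n Bernoulli(p) trials and return the observed proportion
    of successes -- a definite, known summary of the observations."""
    rng = random.Random(seed)
    successes = sum(rng.random() < p for _ in range(n))
    return successes / n

# The probability p = 0.3 is a property of the model; each proportion
# below is a property of one particular finite sequence of observations.
for n in (10, 1000, 100000):
    print(n, observed_proportion(0.3, n))
```

Before the trials are run, the eventual proportion is uncertain; after they are run, it is a fixed number that merely approximates `p`.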
In our models we usually take probabilities to be definite but unknown. Due to the sharp contrasts among the meanings of "probable," "definite," and "unknown" I am reluctant to apply the term "uncertain" to describe that situation. However, before we conduct a sequence of observations, the [eventual] proportion, like any future event, is indeed "uncertain". After we make those observations, the proportion is both definite and known. (Perhaps this is what is meant by "guaranteed" in the OP.) Much of our knowledge about the [hypothetical] probability is mediated through these uncertain observations and informed by the idea that they might have turned out otherwise. In this sense--that uncertainty about the observations is transmitted back to uncertain knowledge of the underlying probability--it seems justifiable to refer to the probability as "uncertain."
In any event it is apparent that probabilities and proportions function differently in statistics, despite their similarities and intimate relationships. It would be a mistake to take them to be the same thing.
Reference
Huber, W.A. (2010). Ignorance is Not Probability. Risk Analysis, 30(3), 371–376.
A Probability Model consists of the triplet $(\Omega,{\mathcal F},{\mathbb P})$, where $\Omega$ is the sample space, ${\mathcal F}$ is a $\sigma$-algebra (events) and ${\mathbb P}$ is a probability measure on ${\mathcal F}$.
Intuitive explanation. A probability model can be interpreted as a known random variable $X$. For example, let $X$ be a Normally distributed random variable with mean $0$ and variance $1$. In this case the probability measure ${\mathbb P}$ is associated with the Cumulative Distribution Function (CDF) $F$ through
$$F(x)={\mathbb P}(X\leq x) = {\mathbb P}(\omega\in\Omega:X(\omega)\leq x) =\int_{-\infty}^x \dfrac{1}{\sqrt{2\pi}}\exp\left({-\dfrac{t^2}{2}}\right)dt.$$
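The integral above has no closed form in elementary functions, but it can be computed from the error function via the identity $F(x) = \tfrac{1}{2}\left(1 + \operatorname{erf}(x/\sqrt{2})\right)$. A minimal sketch using only the Python standard library (the function name is illustrative):

```python
import math

def std_normal_cdf(x):
    """F(x) = P(X <= x) for X ~ Normal(0, 1), computed via the
    error function: F(x) = (1 + erf(x / sqrt(2))) / 2."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

print(std_normal_cdf(0.0))   # 0.5, by symmetry of the density
print(std_normal_cdf(1.96))  # roughly 0.975
```

This is a fully specified probability model: every probability of the form ${\mathbb P}(X\leq x)$ is a known number once $x$ is given.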
Generalisations. The definition of Probability Model depends on the mathematical definition of probability, see for example Free probability and Quantum probability.
A Statistical Model is a set ${\mathcal S}$ of probability models, that is, a set of probability measures/distributions on the sample space $\Omega$.
This set of probability distributions is usually selected for modelling a certain phenomenon from which we have data.
Intuitive explanation. In a Statistical Model, both the parameters and the distribution that describe a certain phenomenon are unknown. An example of this is the family of Normal distributions with mean $\mu\in{\mathbb R}$ and variance $\sigma^2\in{\mathbb R_+}$, that is, both parameters are unknown, and you typically want to use the data set to estimate them (i.e. to select an element of ${\mathcal S}$). This set of distributions can be chosen on any $\Omega$ and ${\mathcal F}$, but, if I am not mistaken, in a real example only those defined on the same pair $(\Omega,{\mathcal F})$ are reasonable to consider.
Generalisations. This paper provides a very formal definition of Statistical Model, but the author mentions that "Bayesian model requires an additional component in the form of a prior distribution ... Although Bayesian formulations are not the primary focus of this paper". Therefore the definition of a Statistical Model depends on the kind of model we use: parametric or nonparametric. Also, in the parametric setting, the definition depends on how the parameters are treated (e.g. Classical vs. Bayesian).
The difference is: in a probability model you know the probability measure exactly, for example a $\mbox{Normal}(\mu_0,\sigma_0^2)$ where $\mu_0,\sigma_0^2$ are known parameters, while in a statistical model you consider sets of distributions, for example $\mbox{Normal}(\mu,\sigma^2)$ where $\mu,\sigma^2$ are unknown parameters.
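One standard way to select an element of ${\mathcal S}$ from data is maximum likelihood, which for the Normal family reduces to the sample mean and the mean squared deviation. A minimal sketch (hypothetical helper name; the data are made up for illustration):

```python
def fit_normal(data):
    """Select an element of the statistical model {Normal(mu, sigma^2)}
    by maximum likelihood: mu_hat is the sample mean and sigma2_hat is
    the mean of squared deviations from it."""
    n = len(data)
    mu_hat = sum(data) / n
    sigma2_hat = sum((x - mu_hat) ** 2 for x in data) / n
    return mu_hat, sigma2_hat

# Once the estimates are plugged in and treated as fixed, the result is
# a single probability model Normal(mu_hat, sigma2_hat).
mu_hat, sigma2_hat = fit_normal([1.2, 0.8, 1.0, 1.4, 0.6])
print(mu_hat, sigma2_hat)  # 1.0 and 0.08 for this sample
```

Estimation is thus the step that collapses a statistical model (a set of distributions) down to a probability model (one distribution).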
Neither requires a data set, but I would say that a Statistical Model is usually selected in order to model one.
Best Answer
The short answer to this, which I've heard from Persi Diaconis, is the following:
The problems considered by probability and statistics are inverse to each other. In probability theory we consider some underlying process which has some randomness or uncertainty modeled by random variables, and we figure out what happens. In statistics we observe something that has happened, and try to figure out what underlying process would explain those observations.
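The inverse relationship can be made concrete with a toy sketch (hypothetical code; the choice of a Bernoulli process and the seed are illustrative): probability runs a known process forward to generate observations, while statistics starts from the observations and infers the process.

```python
import random

rng = random.Random(42)

# Probability: start from a known underlying process (Bernoulli trials
# with p = 0.7) and figure out what happens by generating observations.
p_true = 0.7
observations = [rng.random() < p_true for _ in range(10000)]

# Statistics: start from the observations and try to recover the
# underlying process -- here, estimate p by the observed proportion.
p_hat = sum(observations) / len(observations)
print(p_hat)  # should land close to 0.7
```

In practice the statistician never sees `p_true`; the simulation only makes the two directions of the same relationship visible side by side.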