Solved – the difference between the theoretical distribution and the empirical distribution

distributions, mathematical-statistics

Right now I am totally confused about the difference between these two distributions. I think "theoretical" means a given distribution about which we already know everything. However, we also seem to know everything about an empirical distribution. What exactly is the difference between them?

For example:

In R, dnorm() obtains the density values for the theoretical normal distribution. Why isn't it an empirical normal distribution?

In R, density() fits an empirical density curve to a set of values. Why is it called "empirical" in this case?

Best Answer

In a nutshell, when you know what the distribution is and what its parameters are, you can construct the theoretical distribution.

So, in the case of R, the dnorm command evaluates the probability density function of a normal distribution, and with its default arguments (mean = 0, sd = 1) that is the Standard Normal distribution. The normal density is: $$ f(x|\mu, \sigma) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x - \mu)^2}{2 \sigma^2}} $$ and where we know $\mu = 0$ and $\sigma = 1$ we actually have $$ f(x) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{x^2}{2}} $$ and $$ P(X \leq x) = \int_{-\infty}^x \frac{1}{\sqrt{2\pi}}\, e^{-\frac{t^2}{2}}\; dt $$ Note that the integral starts at $-\infty$, not at 0, because the normal distribution is supported on the whole real line.
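As a minimal sketch of the formulas above in R: dnorm() evaluates the density and pnorm() evaluates the cumulative probability $P(X \leq x)$. The evaluation points and the non-standard parameters (mean = 2, sd = 3) are arbitrary choices made for illustration only.

```r
# Evaluation points (arbitrary, for illustration)
x <- c(-1, 0, 1)

# Standard normal density: dnorm() defaults to mean = 0, sd = 1
dens_default <- dnorm(x)

# The same density written out from the formula
dens_manual <- exp(-x^2 / 2) / sqrt(2 * pi)

# Cumulative probability P(X <= 1) from the closed-form CDF function
p_le_1 <- pnorm(1)

# The same probability by numerically integrating the density from -Inf to 1
p_integrated <- integrate(dnorm, lower = -Inf, upper = 1)$value

# A non-standard normal: the parameters are supplied, not estimated from data
dens_shifted <- dnorm(x, mean = 2, sd = 3)
```

Nothing here depends on observed data: every value follows from the chosen distribution and its parameters, which is what makes it "theoretical".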

That's because we start knowing everything.

With the EMPIRICAL distribution we start knowing nothing. What we have is a collection of observations, and we want to derive some knowledge from that collection. Perhaps we will fit a distribution to them; perhaps, if we have enough observations, we'll just compute probabilities directly from the observations themselves.

For example, if I have the following 10 numbers, I can create an empirical distribution: $\{1, 2, 3, 4, 4, 5, 8, 9, 9, 10\}$

Looking at just these numbers, the empirical probability of choosing a 5 or less is 60%, since I have 6 out of 10 observations of 5 or less.
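The 60% figure can be reproduced in R, either by counting directly or with the base function ecdf(), which builds the empirical cumulative distribution function from a sample. This is only a sketch; the variable names are illustrative.

```r
# The ten observations from the example
obs <- c(1, 2, 3, 4, 4, 5, 8, 9, 9, 10)

# Proportion of observations that are 5 or less, by direct counting
p_manual <- mean(obs <= 5)

# ecdf() returns a function Fn with Fn(t) = (number of obs <= t) / n
Fn <- ecdf(obs)
p_ecdf <- Fn(5)

print(c(manual = p_manual, ecdf = p_ecdf))  # both are 0.6
```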

What density() does is run through the collection of observations and fit a kernel-smoothed density to them. The result isn't necessarily normal, binomial, Poisson, Pareto, or any other named distribution. It is a smoothed version of a histogram which can be treated like a density for calculations relating to the observations. We can also try to fit theoretical distributions which are "close" in some way to the empirical one. These fitted theoretical distributions can then be used as a proxy, and we can use their properties for further fun and games.
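A rough sketch of both ideas in R, using the ten numbers from the example: density() produces the kernel-smoothed empirical density, and a normal distribution is then fitted as a proxy. Fitting the normal by plugging in the sample mean and sample standard deviation is an assumption made here for simplicity; it is one of several ways to choose a "close" theoretical distribution.

```r
# The ten observations from the example
obs <- c(1, 2, 3, 4, 4, 5, 8, 9, 9, 10)

# Kernel density estimate (Gaussian kernel and default bandwidth)
kde <- density(obs)

# kde$x is an equally spaced grid and kde$y the estimated density on it;
# a Riemann sum over the grid should come out close to 1
area <- sum(kde$y) * diff(kde$x[1:2])

# Fit a theoretical normal as a proxy (assumption: plug in sample mean and sd)
mu_hat <- mean(obs)
sigma_hat <- sd(obs)

# P(X <= 5) under the fitted normal versus the empirical proportion
p_fitted <- pnorm(5, mean = mu_hat, sd = sigma_hat)
p_empirical <- mean(obs <= 5)
```

The two probabilities need not agree: the fitted normal imposes a symmetric bell shape, while the empirical proportion reflects only the observed values.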
