Does “probability distribution” refer to the PDF or the CDF?

Tags: measure-theory, probability, probability-distributions, probability-theory, statistics

In my work and studies, I keep coming across statements that are similar to the following:

Quote from one source:

A better angle, at least from the perspective of GANs, is to define similarity in
the sense of probability distribution. Two data sets are considered similar if they
are samples from the same (or approximately same) probability distribution. Thus
more specifically we have our training data set X ⊂ ℝⁿ consisting of samples from a
probability distribution μ (with density p(x)), and we would like to find a probability
distribution ν (with density q(x)) such that ν is a good approximation of μ. By taking
samples from the distribution ν we obtain generated objects that are “similar” to the
objects in X.

Quote from a different source:

The graph of a continuous probability distribution is a curve. Probability is represented by area under the curve. We have already met this concept when we developed relative frequencies with histograms in Chapter 2. The relative area for a range of values was the probability of drawing at random an observation in that group. Again with the Poisson distribution in Chapter 4, the graph in Example 4.14 used boxes to represent the probability of specific values of the random variable. In this case, we were being a bit casual because the random variables of a Poisson distribution are discrete, whole numbers, and a box has width. Notice that the horizontal axis, the random variable x, purposefully did not mark the points along the axis. The probability of a specific value of a continuous random variable will be zero because the area under a point is zero. Probability is area. The curve is called the probability density function (abbreviated as pdf). We use the symbol f(x) to represent the curve. f(x) is the function that corresponds to the graph; we use the density function f(x) to draw the graph of the probability distribution.

I have been trying to understand the mathematics behind generative models at more than a surface level, but a roadblock has been working out what authors mean by "probability distribution". I am fairly certain that "probability distribution" in the first quote refers to the CDF, since the author separately denotes the density as p(x), which I think refers to the PDF. In the second quote, the author presents "the curve" as the probability distribution, and integrating it gives a probability, which clearly indicates the PDF.

Maybe "probability distribution" in fact has no concrete definition and the reader is left to figure out what it is referring to themselves, but it would make my life a lot easier if I knew it referred to one or the other.

Best Answer

A probability distribution is just about anything that defines the likelihood of certain outcomes from an experiment. That can be defined in different ways, including the probability density function (PDF) for continuous variables, or the probability mass function (PMF) for discrete variables, or the cumulative distribution function for either continuous or discrete variables. If it's scaled properly and can be used to determine the likelihood of any possible outcome, it is some kind of probability distribution.
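To make the "different representations, same distribution" point concrete, here is a minimal sketch in Python. It assumes a standard normal distribution and an interval [a, b] picked purely for illustration (neither comes from the question), and uses only the standard library's `statistics.NormalDist`. The probability of landing in the interval is computed once from the CDF and once by numerically integrating the PDF.

```python
from statistics import NormalDist

# Standard normal and the interval [a, b] are illustrative assumptions.
dist = NormalDist(mu=0.0, sigma=1.0)
a, b = -1.0, 2.0

# Route 1: difference of the CDF at the two endpoints.
p_from_cdf = dist.cdf(b) - dist.cdf(a)

# Route 2: integrate the PDF over [a, b] with the trapezoidal rule.
n = 10_000                      # number of subintervals
h = (b - a) / n                 # width of each subinterval
p_from_pdf = h * (
    0.5 * dist.pdf(a)
    + sum(dist.pdf(a + i * h) for i in range(1, n))
    + 0.5 * dist.pdf(b)
)

print(f"P(a < X < b) from the CDF: {p_from_cdf:.6f}")
print(f"P(a < X < b) from the PDF: {p_from_pdf:.6f}")
```

The two numbers agree up to the error of the numerical integration, which is the sense in which the PDF and the CDF carry the same information.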

In practice, I would typically read "probability distribution" as referring to the non-cumulative density or mass function, but it's not a precise term. One would be reasonably well understood when referring to a "normal probability distribution", although the more precise term would be a "normal probability density function". Density and mass functions often come in common shapes like normal, uniform, Weibull, binomial, and others which are easily referred to by name, and referring to those distributions typically calls to mind an image of the density or mass function rather than the cumulative distribution function. When someone refers to the "normal probability distribution", I'd wager that most people will think of a Gaussian bell curve PDF and not the sigmoidal CDF. Both are equally valid representations of the same probability distribution, though: the underlying distribution is not affected by how one chooses to represent it.
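In the notation of the first quote in the question, the relationship between these representations can be written out as follows. This is a sketch for the one-dimensional case, assuming μ has a density p with respect to Lebesgue measure; the symbol F for the CDF is introduced here and does not appear in the quoted source.

$$
\mu(A) = \int_A p(x)\,dx,
\qquad
F(t) = \mu\big((-\infty, t]\big) = \int_{-\infty}^{t} p(x)\,dx,
\qquad
p(x) = F'(x) \ \text{for almost every } x .
$$

Read this way, μ is the distribution itself (it assigns a probability to each set A), while p and F are two functions that each determine it.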