Solved – Why is it useful to sample from probability distribution models

distributions, machine learning, probability, sampling

I was reading the book Deep Learning by Goodfellow, Bengio & Courville, and it seemed to imply that samples are only useful for:

  1. approximating sums/integrals (why is this so important?)
  2. cases where generating samples is itself the goal (which is trivial, since of course it's important to sample if that is the goal)

However, I remain unable to appreciate why sampling is so important and why so much hard work has gone into studying it. Why is sampling important? Are there no other motivations? Are these two reasons really important enough on their own?

I'd love to be able to appreciate why sampling is an important topic.


My own thoughts

As someone inclined toward ML, minimizing the expected loss is my goal:

$$ E_{(X,Y) \sim p^*} \left[ \mathrm{Loss}(f(X), Y) \right] $$

where $p^*$ is the true unknown distribution.

So I guess, since this expectation is a sum or an integral, we could approximate our model's true generalization error if we could draw more samples, or build a model of the true distribution and sample from that. This seems important, though for some reason this does not seem to be the approach people take in ML…
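To make this concrete, here is a minimal sketch of that idea: Monte Carlo approximation of the expected loss, pretending (purely for illustration) that we could sample from $p^*$. The distribution, the model $f$, and the squared-error loss are all hypothetical choices of mine, not anything from the book:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical "true" distribution p*: X ~ N(0, 1), Y = 2X + noise.
# In reality p* is unknown; we pretend we can sample it to illustrate
# the Monte Carlo approximation of the expected loss.
def sample_p_star(n):
    x = rng.normal(0.0, 1.0, size=n)
    y = 2.0 * x + rng.normal(0.0, 0.5, size=n)
    return x, y

def f(x):            # some fitted model, e.g. f(x) = 1.9x
    return 1.9 * x

def loss(pred, y):   # squared-error loss
    return (pred - y) ** 2

# Monte Carlo: E[Loss(f(X), Y)] ~ (1/n) * sum of losses over n samples.
x, y = sample_p_star(100_000)
print(loss(f(x), y).mean())  # approaches the true expectation as n grows
```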


To provide further context, here is the exact extract I was reading from the Deep Learning book, in the chapter on sampling and Monte Carlo methods:

Why Sampling:

There are many reasons that we may wish to draw samples from a
probability distribution. Sampling provides a flexible way to
approximate many sums and integrals at reduced cost. Sometimes we use
this to provide a significant speedup to a costly but tractable sum, as
in the case when we subsample the full training cost with minibatches.
In other cases, our learning algorithm requires us to approximate an
intractable sum or integral, such as the gradient of the log partition
function of an undirected model. In many other cases, sampling is
actually our goal, in the sense that we want to train a model that can
sample from the training distribution. (Chapter 17)

For me, that section equates to "drawing samples (from a model) is only useful for approximating sums/integrals and when sampling is itself the goal". For someone with much less of a statistics background, this justification seems quite shallow. I have seen a lot of mathematics and textbooks (like Koller's PGM book) devoted to sampling from models. It seems like quite an important topic, and the book lacked a proper motivation for the why. This is where my question stems from.
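One case in that extract can at least be made tangible. The minibatch speedup it mentions is just Monte Carlo estimation of the training cost: the mean loss over a small random subsample is an unbiased estimate of the mean loss over the full training set. A toy sketch, with hypothetical per-example losses:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy per-example losses for a large training set (made-up numbers).
losses = rng.exponential(scale=1.0, size=1_000_000)

full_cost = losses.mean()                     # exact, but touches every example

batch = rng.choice(losses, size=256, replace=False)
minibatch_cost = batch.mean()                 # Monte Carlo estimate from 256 samples

print(full_cost, minibatch_cost)              # close, at a tiny fraction of the cost
```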

Best Answer

Often enough, we are not only interested in evaluating an integral, e.g., to calculate the expectation of a random variable, but in understanding the entire distribution, say for deriving quantiles, which matters, e.g., in inventory control. And often enough, the underlying distribution is not analytically tractable, so sampling is the easiest and fastest way of going about this.

A simple example from my daily life: suppose we want a quantile forecast for retail sales. We believe that each day's sales are negative binomially distributed with known (forecasted) parameters. However, we don't need quantiles per day, but across, say, three or five days (because the truck arrives to fill up the store shelves twice a week, so each delivery has to cover multiple days). The sum of negative binomials is not analytically tractable, but it's trivial to simulate from each day's negbin, add the simulated values, and take the appropriate quantile of the simulated sum to achieve our desired service level.
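A minimal sketch of that procedure, with made-up negative binomial parameters (numpy's parameterization takes the number of successes n and the success probability p; the daily parameters and the 95% service level below are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical forecast parameters for each of the next 3 days.
params = [(5, 0.5), (4, 0.45), (6, 0.55)]

n_sims = 100_000
# Simulate each day's sales, then sum across the delivery period.
total = sum(rng.negative_binomial(n, p, size=n_sims) for n, p in params)

# The 95% quantile of the simulated sum: stocking this much covers
# the three-day demand with 95% probability.
print(np.quantile(total, 0.95))
```

Each simulated draw plays out one possible three-day demand; the empirical quantile of the sums is the stock level that achieves the desired service level, with no need for a closed-form distribution of the sum.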

(Plus, there are lots of other applications, in cryptography etc., if you are really interested in why people invest so much effort in sampling.)
