Why does a probability function assign each event in the event space a probability? Shouldn’t it assign that probability to outcomes, not events?

probability, probability distributions, probability theory

A probability distribution describes how likely each possible value is for a given data-generating process.

An outcome is a possible result of an experiment or trial.

All of the possible outcomes of an experiment form the elements of a sample space.

This means when we look at a probability distribution (like Gaussian or Cauchy) we are looking at the shape made by the different outcome probabilities in the sample space (assuming our distribution captures the entire sample space).

A formal way to describe a random process (e.g. an experiment) is via a probability space.

A probability space consists of three elements:

  1. a sample space;
  2. an event space; and
  3. a probability function.

We already defined the sample space. An event space is a set of events, and an event is a set of outcomes in the sample space. A probability function assigns each event in the event space a probability, which is a number between 0 and 1.
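
To make the three pieces concrete, here is a minimal sketch of a finite probability space for two coin flips. It assumes two fair coins and takes the event space to be every subset of the sample space; the names `sample_space`, `event_space`, `weights` and `prob` are illustrative, not standard.

```python
from fractions import Fraction
from itertools import chain, combinations, product

# Sample space: every outcome of flipping two coins.
sample_space = ["".join(p) for p in product("HT", repeat=2)]  # HH, HT, TH, TT

# Event space: here, every subset of the sample space (its power set).
event_space = [
    frozenset(subset)
    for subset in chain.from_iterable(
        combinations(sample_space, r) for r in range(len(sample_space) + 1)
    )
]

# Assumption for this sketch: two fair coins, so each outcome has weight 1/4.
weights = {outcome: Fraction(1, 4) for outcome in sample_space}


def prob(event):
    """Probability function: maps an event (a set of outcomes) to a number in [0, 1]."""
    return sum((weights[outcome] for outcome in event), Fraction(0))


at_least_one_head = frozenset({"HH", "HT", "TH"})
print(len(event_space))          # number of events in the power set
print(prob(at_least_one_head))   # probability of a multi-outcome event
print(prob(frozenset({"HH"})))   # a single outcome, treated as a one-element event
```

In this finite case a single outcome is just the one-element event containing it, so assigning probabilities to events includes assigning them to outcomes. For continuous distributions such as the Gaussian, any single outcome has probability 0, so only sets of outcomes can carry useful probabilities.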

My question concerns that last line. Wouldn’t it make more sense to say that a probability function assigns each outcome in the sample space a number between 0 and 1? If we look at a probability distribution, aren’t we looking at many individual values, each between 0 and 1, which are the outcomes of an experiment (and which all sum to 1)?

Best Answer

We don't need a probability distribution on outcomes if we're only interested in the probabilities of certain events. Suppose we're flipping two coins and want to model the probability that both show heads (the events HH and not-HH). I can flip the coins many times, see that I get two heads in about 25% of cases, and conclude that there is a 25% chance of getting two heads and a 75% chance of not getting two heads. Note that this requires knowing nothing at all about the probability distribution over the full set of outcomes (HH, HT, TH, TT): it's possible that I'm flipping two fair coins, or that I'm flipping a two-headed coin and a biased coin that shows heads only 25% of the time.
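
The two scenarios can be checked with a short sketch. It assumes the two coins are flipped independently; `outcome_distribution` and `event_probabilities` are hypothetical helper names introduced only for this illustration.

```python
from fractions import Fraction
from itertools import product


def outcome_distribution(p_first, p_second):
    """Distribution over HH, HT, TH, TT for two independent coins,
    given each coin's probability of showing heads."""
    dist = {}
    for a, b in product("HT", repeat=2):
        pa = p_first if a == "H" else 1 - p_first
        pb = p_second if b == "H" else 1 - p_second
        dist[a + b] = pa * pb
    return dist


def event_probabilities(dist):
    """Collapse an outcome distribution to the two events HH and not-HH."""
    p_hh = dist["HH"]
    return {"HH": p_hh, "not-HH": 1 - p_hh}


# Scenario 1: two fair coins.
fair = outcome_distribution(Fraction(1, 2), Fraction(1, 2))
# Scenario 2: a two-headed coin and a coin that shows heads 25% of the time.
rigged = outcome_distribution(Fraction(1), Fraction(1, 4))

print(fair)
print(rigged)
print(event_probabilities(fair) == event_probabilities(rigged))
```

The outcome distributions differ (in the second scenario TH and TT can never occur), yet both assign the same probabilities to the events HH and not-HH.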

It may not be necessary to define the probability distribution over all outcomes.