[Math] Populations and Sample Spaces

statistics

I have been reading the book "Introduction To Statistics, A Customised Edition of Statistics For Business and Economics" by P. Newbold et al.

As I understand their statement of classical probability:

Classical probability is the proportion of times that an event will occur, assuming that all outcomes in a sample space are equally likely to occur…

So, as I understand it, we assume each outcome is equally likely. For a die roll, say, my sample space (the set of all possible outcomes of a random experiment) is {1, 2, 3, 4, 5, 6}.

I.e.

$$
P(A)=\frac{\text{number of outcomes relevant to event } A}{\text{total number of outcomes}} = \frac{N_A}{N}
$$

My first question had been:

  1. Is it correct to say that we have to assume each outcome is equally
    likely?

My thinking had been this: if I want the probability that I roll a 6, for example, the number of relevant outcomes is len({6}) = 1 and the total number of outcomes is len({1,2,3,4,5,6}) = 6, so I would say P(6) is 1/6.
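The counting argument above can be written out as a small Python sketch. The function name `classical_probability` is made up for illustration, and the whole thing only makes sense under the assumption that every outcome in the sample space is equally likely.

```python
from fractions import Fraction


def classical_probability(event, sample_space):
    """Return N_A / N, assuming all outcomes in sample_space are equally likely."""
    space = set(sample_space)
    # Only outcomes that are actually in the sample space can count towards N_A.
    favourable = set(event) & space
    return Fraction(len(favourable), len(space))


sample_space = {1, 2, 3, 4, 5, 6}  # fair six-sided die (assumed)

p_six = classical_probability({6}, sample_space)
p_even = classical_probability({2, 4, 6}, sample_space)

print(p_six)   # 1/6
print(p_even)  # 1/2
```

Nothing in this function can express "6 comes up more often than 1": the only inputs are the two sets, which is the limitation the next paragraphs are about.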

But what if my die is unfair? How can I represent this in classical probability? The number of outcomes remains the same… what tools does classical probability have to allow for this? Or is this where relative frequency probability is required?

The reason I was asking is that I couldn't see how we could, using classical probability, model a biased die, for example. I now think that we can't, and that to model outcomes of unequal probability we need relative frequency probability.
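To make the relative frequency idea concrete, here is a rough Python sketch that simulates a biased die and estimates each face's probability as (times the face came up) / (number of rolls). The weights, the seed and the function name `relative_frequencies` are all assumptions chosen for illustration; they do not come from the book.

```python
import random
from collections import Counter


def relative_frequencies(faces, weights, n_rolls, seed=0):
    """Estimate P(face) as the fraction of simulated rolls showing that face."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    rolls = rng.choices(faces, weights=weights, k=n_rolls)
    counts = Counter(rolls)
    return {face: counts[face] / n_rolls for face in faces}


faces = [1, 2, 3, 4, 5, 6]
# Hypothetical bias: a 6 is five times as likely as any other face.
weights = [0.1, 0.1, 0.1, 0.1, 0.1, 0.5]

freqs = relative_frequencies(faces, weights, n_rolls=60_000)
for face, freq in freqs.items():
    print(face, round(freq, 3))
```

The sample space is still {1, 2, 3, 4, 5, 6}, so the classical ratio would still give 1/6 for every face, while the observed frequencies should settle near the weights as the number of rolls grows.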

All I was asking for was confirmation that my thinking on this was correct. It seems there has been quite some disagreement on it, but the above is a verbatim quote from the book.

So… given this, the book also makes the following definitions:

Sample Space = The set of all possible basic outcomes from a random experiment.

Population = The complete set of items or "events" of interest. Size is very large, denoted N, possibly infinite.

So my next question had been,

  2. Is the "sample space" in this case (for classical probability) also the population?

I'm still not sure whether the sample space is the same as the population. I have understood "sample space" as "the set of all possible basic outcomes from a random experiment." I have understood "population" as "the complete set of items or "events" of interest". So surely all possible results of an experiment must be the complete set of items of interest, which would make the two terms equivalent? This seems wrong to me, but I can't put my finger on why.

In fact, even in relative frequency probability, how do these two terms really differ?

Best Answer

If we define the population as the complete set of items or "events" of interest, and we define the sample space as the set of all possible outcomes (exhaustively) from a random experiment, then I wondered this...

Take a die roll. The population is the complete set of possible items {1, 2, 3, 4, 5, 6}. The sample space is the set of all possible outcomes, also {1, 2, 3, 4, 5, 6}. So here the sample space and the population appear to be the same thing. When are they not, and what distinguishes the two?

The Wikipedia page on sample spaces caused the penny to drop for me:

...For many experiments, there may be more than one plausible sample space available, depending on what result is of interest to the experimenter. For example, when drawing a card from a standard deck of fifty-two playing cards, one possibility for the sample space could be the various ranks (Ace through King), while another could be the suits (clubs, diamonds, hearts, or spades)...

Aha! So my population is the set of all cards {ace_heart, 2_heart, ..., king_heart, ace_club, ...}, but the sample space may be, if we are looking at the suits, just {heart, club, diamond, spade}. So the population and sample space are different here.

In summary, the population is the set of items I'm looking at. The sample space may or may not be the population; that depends on what question about the population is being asked.
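As a small illustration of that distinction, the following Python sketch builds the 52-card population and derives two different sample spaces from it. The `rank_suit` naming simply follows the notation used above, and the variable names are made up for this example.

```python
from fractions import Fraction
from itertools import product

ranks = ["ace", "2", "3", "4", "5", "6", "7", "8", "9", "10",
         "jack", "queen", "king"]
suits = ["heart", "club", "diamond", "spade"]

# Population: every individual card in a standard deck.
population = [f"{rank}_{suit}" for rank, suit in product(ranks, suits)]

# Two sample spaces for "draw one card", depending on the question asked.
sample_space_suits = {card.split("_")[1] for card in population}
sample_space_ranks = {card.split("_")[0] for card in population}

# Counting over the population (each card assumed equally likely to be drawn)
# gives the probability of an outcome in the suit sample space.
n_hearts = sum(card.endswith("_heart") for card in population)
p_heart = Fraction(n_hearts, len(population))

print(len(population))             # 52
print(sorted(sample_space_suits))  # ['club', 'diamond', 'heart', 'spade']
print(p_heart)                     # 1/4
```

The same population supports a 4-element sample space (suits) and a 13-element one (ranks); neither is the population itself.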

This answers the latter half of my question... possibly I didn't ask it clearly enough, or it was just too obvious (it just took a while for it to sink into my head).

The other half of the question is answered by "Qwerty". In all the sources I've looked at, it appears that classical probability treats outcomes as equally likely [1] [2] (as does the book I referenced in the question). "Qwerty" has expanded on it slightly... but I believe this is where relative frequency probability comes into play and allows us to model "unfair" (not equally likely) outcomes. From [2]:

The probability of an event is the ratio of the number of cases favorable to it, to the number of all cases possible when nothing leads us to expect that any one of these cases should occur more than any other, which renders them, for us, equally possible.

And

The classical definition of probability was called into question by several writers of the nineteenth century, including John Venn and George Boole. The frequentist definition of probability became widely accepted as a result of their criticism.
