Aspects of probability space and random variable

I am trying to make sense of various aspects of a probability space and a random variable and how they relate, and would like some help tying it all together and verifying my understanding so far:

Consider a probability space $(\Omega, \mathcal{F}, P)$ that models a random experiment, for example: flipping a fair coin twice.

A sample space $\Omega$ contains objects (possibly "non-mathematical" objects), also referred to as the outcomes of the random experiment, for example: $\Omega = \{HH, HT, TH, TT\}$

A sigma algebra $\mathcal{F}$ (of our choice) contains subsets of $\Omega$ called events i.e. subsets of "non-mathematical" objects, for example: $\mathcal{F}=\{\varnothing,\{TT\},\{HT,TH,HH\},\{HH,HT,TH,TT\}\}$, or, another example: $\mathcal{F}=\mathscr{P}(\Omega)$ (i.e. the power set of $\Omega$).

A probability measure $P$ (of our choice) assigns probabilities to the events, i.e., assigns probabilities to the subsets of "non-mathematical" objects in $\mathcal{F}$. For a fair coin, for example, $P$ could be defined as: $P(\varnothing)=0$, $P(\{TT\})=0.25$, $P(\{HT,TH,HH\})=0.75$, $P(\{HH,HT,TH,TT\})=1$.
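
To make "a measure assigns numbers to events" concrete, here is a minimal Python sketch of the fair-coin case (my own illustration, not part of the formal setup; the helper name `prob` is hypothetical), with events represented as plain sets of outcomes:

```python
from fractions import Fraction

# Sample space for two flips of a fair coin.
omega = {"HH", "HT", "TH", "TT"}

def prob(event):
    """P(event) under the uniform (fair-coin) measure: all four outcomes
    are equally likely, so an event's probability is its size divided by
    the size of the sample space."""
    return Fraction(len(event), len(omega))

print(prob(set()))               # 0
print(prob({"TT"}))              # 1/4
print(prob({"HT", "TH", "HH"}))  # 3/4
print(prob(omega))               # 1
```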

A random variable $X$ (of our choice) maps some or all of the events in $\mathcal{F}$ to values in a measurable space $(E,\mathcal{E})$, for example: a random variable $X$ with the following mappings: $\{TT\}$ is mapped to $1$, $\{HT,TH,HH\}$ is mapped to $2$, and $\{HH,HT,TH,TT\}$ is mapped to $3$.

And the probabilities assigned to the events in $\mathcal{F}$ by $P$ are "transferred"/"pushed forward" to the values of $X$ in $(E, \mathcal{E})$, for example: $P(\{TT\})=0.25 = f(1)$, where $f$ is the pmf of $X$.
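
Here is a small sketch of that push-forward (again illustrative; since the $X$ above is described on events, this stand-in defines it on outcomes, with $X(TT)=1$ and $X=2$ otherwise, so that $\{X=1\}=\{TT\}$ and $\{X=2\}=\{HT,TH,HH\}$):

```python
from collections import defaultdict
from fractions import Fraction

omega = ["HH", "HT", "TH", "TT"]

def X(outcome):
    """Stand-in random variable: 1 on TT, 2 on the other outcomes."""
    return 1 if outcome == "TT" else 2

# Push the uniform measure on omega forward through X: each outcome
# contributes its probability 1/4 to the value X assigns it.
pmf = defaultdict(Fraction)
for outcome in omega:
    pmf[X(outcome)] += Fraction(1, len(omega))

print(dict(pmf))  # {2: Fraction(3, 4), 1: Fraction(1, 4)}, i.e. f(1) = 0.25
```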

Clarification 1: Given any collection of mutually exclusive events in $\mathcal{F}$, there should exist an associated probability distribution, which should be the "same" as the probability distribution of a random variable that assigns values to that specific collection of mutually exclusive events, right?

Clarification 2: And as such, the probability measure $P$ restricted to a collection of mutually exclusive events in $\mathcal{F}$ should be the "same" as the pmf/pdf of a random variable that maps those events to values, only the inputs to $P$ are events instead of values (as with the pmf/pdf). Correct?

Best Answer

You can specify the same event in multiple ways. For example, suppose the sample space is the set of outcomes of two flips of a coin:
$$\Omega = \{HH, HT, TH, TT\}$$
The sigma-algebra $\mathcal{F}$ is the set of all subsets of $\Omega$. Now define a random variable $Y:\Omega\rightarrow\mathbb{R}$ as the number of heads:
$$Y(HH) = 2, \quad Y(HT) = 1, \quad Y(TH)=1, \quad Y(TT)=0$$
The following are different ways of specifying the same event:
$$\{\mbox{there are 2 heads}\} = \{HH\} = \{\omega \in \Omega : Y(\omega)=2\} = \{Y=2\}$$
All of these are the same event and so of course:
$$P[\mbox{there are 2 heads}]= P[\{HH\}] = P[\{\omega \in \Omega : Y(\omega)=2\}] = P[Y=2]$$
Because these are all the same event, there is no need to "redefine" or "push forward" anything.
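
If it helps, here is a small Python check of that point (an illustration with the uniform measure on $\Omega$; `prob` is a hypothetical helper): the two descriptions pick out literally the same set, and hence the same probability.

```python
from fractions import Fraction

omega = ["HH", "HT", "TH", "TT"]

def Y(outcome):
    """Number of heads in the two flips."""
    return outcome.count("H")

def prob(event):
    """Uniform probability measure on omega."""
    return Fraction(len(event), len(omega))

# Two ways of specifying the same event:
by_listing = {"HH"}                     # {HH}
by_Y = {w for w in omega if Y(w) == 2}  # {omega in Omega : Y(omega) = 2}

print(by_listing == by_Y)            # True: literally the same set
print(prob(by_listing), prob(by_Y))  # 1/4 1/4
```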


If you want to define a function $f:\{0, 1,2\}\rightarrow\mathbb{R}$ by $f(y) = P[Y=y]$, then nobody will stop you. Since the events $\{Y=0\}$, $\{Y=1\}$, $\{Y=2\}$ partition the sample space (they are disjoint events and their union is the whole sample space), the third axiom of probability ensures $$P[Y=0]+P[Y=1]+P[Y=2]=P[\Omega]$$ and we also know $P[\Omega]=1$. If you want to write the above equation using your function $f$, then indeed $f(0)+f(1)+f(2)=1$.
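
Sketching that bookkeeping (illustrative; `f` is computed here from the uniform measure):

```python
from fractions import Fraction

omega = ["HH", "HT", "TH", "TT"]

def Y(outcome):
    return outcome.count("H")

def f(y):
    """f(y) = P[Y = y]: count the outcomes with Y = y, divide by |omega|."""
    return Fraction(sum(1 for w in omega if Y(w) == y), len(omega))

print(f(0), f(1), f(2))    # 1/4 1/2 1/4
# {Y=0}, {Y=1}, {Y=2} partition omega, so the values sum to P[Omega] = 1:
print(f(0) + f(1) + f(2))  # 1
```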


If for some reason you want to define a new sample space $\mathcal{Y} = \{0,1,2\}$, sigma algebra $\tilde{\mathcal{F}}$ being the set of all subsets of $\mathcal{Y}$, and probability measure $\tilde{P}[A]$ defined for all $A \subseteq \mathcal{Y}$ by $$\tilde{P}[A] = \sum_{y \in A} f(y)$$ then nobody will stop you (and you can verify this satisfies the three axioms of probability).
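
That construction, sketched (illustrative only; `P_tilde` is a hypothetical name for $\tilde{P}$):

```python
from fractions import Fraction

# pmf of Y (number of heads in two fair flips) on the new sample space.
f = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}

def P_tilde(A):
    """P~[A] = sum of f(y) over y in A, for A a subset of {0, 1, 2}."""
    return sum(f[y] for y in A)

print(P_tilde(set()))      # 0:   the empty set has measure zero
print(P_tilde({0, 1, 2}))  # 1:   total mass is 1
print(P_tilde({1, 2}))     # 3/4: equals f(1) + f(2), by additivity
```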

More generally, if $\mathcal{Z}$ is any finite or countably infinite set and if $g:\mathcal{Z}\rightarrow\mathbb{R}$ is a function that satisfies $g(z)\geq 0$ for all $z \in \mathcal{Z}$ and $\sum_{z \in \mathcal{Z}}g(z)=1$, then defining the sigma algebra as the set of all subsets of $\mathcal{Z}$ and defining $P:2^{\mathcal{Z}}\rightarrow \mathbb{R}$ by $$ P[A] = \sum_{z \in A} g(z) \quad \forall A \subseteq \mathcal{Z}$$ yields a valid probability measure, meaning that all three axioms of probability hold.
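
The same recipe with one arbitrary choice of $g$, purely as an example (here $\mathcal{Z}=\{1,\dots,6\}$ and $g$ the uniform pmf of a fair die):

```python
from fractions import Fraction

Z = range(1, 7)                     # a finite Z; any countable set works
g = {z: Fraction(1, 6) for z in Z}  # g >= 0 everywhere and sums to 1

def P(A):
    """P[A] = sum of g(z) over z in A, for A a subset of Z."""
    return sum(g[z] for z in A)

# Spot-check the axioms of probability on this example:
print(P(set()) == 0)                 # measure of the empty set
print(P(set(Z)) == 1)                # total mass
print(P({1, 2}) == P({1}) + P({2}))  # additivity on disjoint events
```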
