Discussion on the definition of independent random variables

Tags: independence, measure-theory, probability-theory

Definition of Random Variable: Suppose $(\Omega,\Sigma,\mathbb P)$ is a probability space. If $\mathbf Y : \Omega \to \mathbb R$ is measurable w.r.t. the Borel $\sigma$-algebra on $\mathbb R$, then $\mathbf Y$ is called a random variable. The $\sigma$-algebra generated by $\mathbf Y$ is $\sigma(\mathbf Y)=\{ \mathbf Y^{-1}(A):A\in\mathscr B(\mathbb R)\}\subset\Sigma.$
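For a finite sample space the generated $\sigma$-algebra can be listed explicitly. Here is a minimal sketch (the die, the parity variable, and all function names are my own illustrative choices, not part of the question):

```python
from itertools import chain, combinations

# A hypothetical finite probability space: one roll of a fair die.
omega = {1, 2, 3, 4, 5, 6}

# A random variable on it: the parity of the outcome.
def Y(w):
    return w % 2

def powerset(s):
    s = list(s)
    return chain.from_iterable(combinations(s, r) for r in range(len(s) + 1))

def generated_sigma_algebra(Z, omega):
    # sigma(Z) = { Z^{-1}(A) : A a subset of the range of Z };
    # for a finite range, the subsets play the role of the Borel sets.
    values = {Z(w) for w in omega}
    return {frozenset(w for w in omega if Z(w) in set(A))
            for A in powerset(values)}

for event in generated_sigma_algebra(Y, omega):
    print(sorted(event))   # prints the four events: {}, odds, evens, Omega
```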

Definition of Independent Random Variables: Two random variables $\mathbf X$ and $\mathbf Y$ are independent if the events $A$ and $B$ are independent (i.e. $\mathbb P(A\cap B)=\mathbb P(A)\,\mathbb P(B)$) whenever $A\in\sigma(\mathbf X)$ and $B\in\sigma(\mathbf Y)$.
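On a finite space this definition can be verified by brute force. A minimal sketch, assuming two independent fair coin flips as the underlying space (all names here are my own):

```python
from itertools import chain, combinations, product

# Hypothetical space: two fair coin flips, Omega = {0,1} x {0,1}, uniform P.
omega = list(product([0, 1], repeat=2))
P = lambda E: len(E) / len(omega)   # uniform probability of an event

X = lambda w: w[0]   # the first flip
Y = lambda w: w[1]   # the second flip

def powerset(s):
    s = list(s)
    return chain.from_iterable(combinations(s, r) for r in range(len(s) + 1))

def sigma(Z):
    # All preimages Z^{-1}(A), A ranging over subsets of Z's finite range.
    values = {Z(w) for w in omega}
    return [frozenset(w for w in omega if Z(w) in set(A))
            for A in powerset(values)]

# Independence: P(A n B) = P(A) P(B) for every A in sigma(X), B in sigma(Y).
print(all(P(A & B) == P(A) * P(B) for A in sigma(X) for B in sigma(Y)))  # True
```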

Question 1: Why do we only consider $\mathscr B(\mathbb R)$ here? The collection of measurable sets on $\mathbb R$ can be extended beyond the Borel sets once we fix a measure (e.g. Lebesgue measure). Although the extension only adjoins null sets, we gain a lot more sets (by intersections/unions of null sets with Borel sets), so why do we not require $\mathbf Y$ to be measurable with respect to them as well? Moreover, even if $N$ is a null set in $(\mathbb R,\mathscr A,\mu)$, that does not mean $\mathbf Y^{-1}(N)$ is a null set in $(\Omega,\Sigma,\mathbb P)$ – it could have positive probability, and thus is not boring/trivial.

Question 2: When it comes to independence of $\mathbf X$ and $\mathbf Y$, why do we only consider events in $\sigma(\mathbf X)$ and $\sigma(\mathbf Y)$? Why do we not care about events such as $L\in\Sigma\setminus\sigma(\mathbf X)$ and $M\in\Sigma\setminus\sigma(\mathbf Y)$? (That is, $\mathbb P(L\cap M)\neq\mathbb P(L)\,\mathbb P(M)$ does not affect the independence of $\mathbf X$ and $\mathbf Y$.) Note that for such $L$ and $M$ with $\mathbb P(L\cap M)\neq\mathbb P(L)\,\mathbb P(M)$, although $L\notin\sigma(\mathbf X)$, the set $L$ is still somehow relevant to $\mathbf X$, because $\mathbf X$ is defined at every point of $L$ and maps those points into $\mathbb R$. For example, there could be a set $P\in\sigma(\mathbf X)$ with $P\cap L\neq\emptyset$, so $L$ overlaps events described by $\mathbf X$, despite the fact that $L\notin\sigma(\mathbf{X})$.
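To make the question concrete, here is a small example of my own construction showing that such events $L$ and $M$ really do exist, even when $\mathbf X$ and $\mathbf Y$ are independent:

```python
from itertools import product

# Hypothetical space: two fair coin flips, uniform measure.
omega = list(product([0, 1], repeat=2))
P = lambda E: len(E & set(omega)) / len(omega)

# X(w) = w[0] and Y(w) = w[1] are independent, and
# sigma(X) = { {}, {(0,0),(0,1)}, {(1,0),(1,1)}, Omega }.
# Neither event below belongs to sigma(X) or sigma(Y):
L = {(0, 0)}            # "both flips are 0"
M = {(0, 0), (1, 1)}    # "the two flips agree"

print(P(L & M), P(L) * P(M))   # 0.25 vs 0.125 -- the product rule fails
```

The failure of the product rule for $L$ and $M$ says nothing about $\mathbf X$ and $\mathbf Y$, since neither event can be expressed as a statement about $\mathbf X$ or $\mathbf Y$ alone – which is exactly why the definition restricts attention to $\sigma(\mathbf X)$ and $\sigma(\mathbf Y)$.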

UPDATE: a summary of the answers, plus my own thoughts

Big thanks for both answers, very helpful!

For Question 1: To summarize the answers: examples were given of a mapping from a probability space (e.g. $(\mathbb R,\mathscr M,\mathcal{L})$) to $\mathbb R$ under which the inverse image of some $\mathcal{L}$-measurable set is not $\mathcal{L}$-measurable. For example (from Halmos's Measure Theory, §19, Problem 3): let $f(x) = \tfrac12(x + c(x))$, where $c(\cdot)$ is the Cantor function, and let $C$ be the Cantor set. Then there exists $A \subset C$ such that $A$ is $\mathcal L$-measurable but not a Borel set, and $f(A)$ – equivalently, the inverse image of $A$ under the continuous map $f^{-1}$ – is not $\mathcal L$-measurable (refer to this post). Extended discussion: the above issue could be resolved in two ways: 1) give up Lebesgue sets and go with Borel sets (which is exactly our definition); or 2) keep Lebesgue sets, but give up such mappings as valid random variables. As to why it is important to keep functions like the one above (whose inverse images of some Lebesgue measurable sets are non-measurable) as random variables, and why we therefore choose 1) over 2) – please see @zoli's comments below.
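The map in this counterexample is easy to compute, even though the pathology itself (a non-measurable image set) cannot be exhibited numerically. A sketch of $f$, using the standard ternary-digit description of the Cantor function (the code is mine and only approximates $c$ to finitely many digits):

```python
def cantor(x, depth=52):
    """Approximate the Cantor function c(x) on [0, 1]: expand x in base 3,
    truncate after the first digit equal to 1, replace every 2 by 1,
    and read the resulting digit string in base 2."""
    if x >= 1.0:
        return 1.0
    total, scale = 0.0, 0.5
    for _ in range(depth):
        x *= 3
        digit = int(x)
        x -= digit
        if digit == 1:                    # first ternary digit 1: stop here
            return total + scale
        total += scale * (digit // 2)     # a ternary 2 becomes a binary 1
        scale /= 2
    return total

def f(x):
    # The strictly increasing continuous map from the counterexample.
    return 0.5 * (x + cantor(x))

print(f(0.0), f(1/3), f(2/3), f(1.0))   # 0.0, ~0.4167, ~0.5833, 1.0
```

Note that $f$ sends the measure-zero Cantor set $C$ onto a set of Lebesgue measure $\tfrac12$, which is what makes room for a non-measurable subset of $f(C)$ and hence for the set $A$ above.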

For Question 2: To summarize the answers: The parity example by @zhoraster is a good one for illustrating "independence meaning that probabilities multiply" vs. "the everyday meaning of not depending on each other", but this is NOT the issue confusing me. What really bothers me is that the property (independence) of an r.v. is defined in terms of *subsets* of $\Omega$, yet the r.v. itself is defined on the *elements* of $\Omega$. @zoli mentions that the wiki does give a definition of independence via intervals and cdfs, but per @Did's comment – and I also personally feel this way – independence does not necessarily need to rely on those concepts; it can be phrased purely as independence of the $\sigma$-algebras generated by the r.v.'s. Extended discussion: I think @zoli's comment comes close: "Events not belonging to the $\sigma$-algebra generated by a random variable cannot be described using only statements about the random variable." My own thoughts on this: although we define $\mathbb P$ on $\Omega$ for all events in $\Sigma$, what is really carried through the r.v. $\mathbf X$ is $\mathbb P_X(A)=\mathbb P(\{\omega\mid\mathbf X(\omega)\in A\})$ for $A\in\mathscr{B}(\mathbb R)$; thus, although the sets $L$ and $M$ above have measures under $\mathbb P$, we cannot see those measures through $\mathbb{P}_X$ or $\mathbb{P}_Y$. Further, if we do care about those events, we can pick an r.v. $\mathbf Z$ such that $L\in\sigma(\mathbf Z)$. BTW, from the wiki: "the underlying probability space $\Omega$ is a technical device…In practice, one often…just puts a measure on ${\mathbb {R} }$…".
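(A quick illustration of the last point, my own example: if $L\in\Sigma$, take $\mathbf Z=\mathbf 1_L$, the indicator of $L$; then $\sigma(\mathbf Z)=\{\emptyset,L,L^c,\Omega\}$, so $L\in\sigma(\mathbf Z)$ and the event becomes visible through the r.v. $\mathbf Z$.)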

Best Answer

EDITED

On the definition of measurability of real valued functions

The serious question is related to the definition of measurability: let's consider the measurable space $[\Omega,\mathcal A]$ and a real-valued function $f$ on $\Omega$. Why do we demand only that the inverse images of Borel sets be in $\mathcal A$?

Answer:

Let $[\Omega,\mathcal A]=[\mathbb R, \mathcal L]$ and let $$f:[\mathbb R, \mathcal L]\to \mathbb R.$$

And let's define Lebesgue measurability of $f$ based on the inverse images of Lebesgue sets (rather than Borel sets).

There is a caveat: there exists at least one such function and at least one Lebesgue measurable set whose inverse image under that function is not Lebesgue measurable. (See also: Halmos's Measure Theory, §19, Problem 3, and this post.) With the inverse images of Borel sets such problems do not arise.

Independence of real random variables

The definition of independence of real-valued random variables does not depend on the choice between the Borel measurable sets and the Lebesgue measurable sets. It can be based on the inverse images of intervals like

$$(-\infty,x]\quad\text{or}\quad(-\infty,x).$$

This is because the independence of random variables depends only on the behavior of their cdfs (joint and individual).
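Concretely, this is the usual restatement (standard, though not spelled out above): $\mathbf X$ and $\mathbf Y$ are independent if and only if the joint cdf factors,

$$\mathbb P(\mathbf X\le x,\ \mathbf Y\le y)=\mathbb P(\mathbf X\le x)\,\mathbb P(\mathbf Y\le y)\quad\text{for all }x,y\in\mathbb R.$$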

(The theorem behind the statements above asserts the equivalence of the two definitions of measurability for real-valued functions: (1) via inverse images of Borel sets and (2) via inverse images of intervals like the ones above; the point is that such intervals generate $\mathscr B(\mathbb R)$. See Halmos, §18, Theorem A.)

Events not belonging to the $\sigma$-algebra generated by a random variable cannot be described using only statements about the random variable. So the probability of those events will not arise when talking about independence.
