Sigma algebras generated by two random variables

measure-theory, probability-theory

I am just a beginner in measure theory and probability, so apologies if the question is not well posed. Suppose we have two random variables $X$ and $Y$ where $Y=f(X)+\epsilon$. Now, what would the probability spaces $(\Omega,\mathcal{F},\mathbb{P})$ associated with the two random variables look like? I have trouble especially in defining the two sigma algebras, since I would guess they would be different, or at least that $\sigma(Y)$ would be a subset of $\sigma(X)$, but I am not sure of this, especially if we assume that $\Omega \in \mathbb{R}^n$. In this case would the sigma algebras both be the Borel sigma algebra?

Best Answer

In probability theory, there is one fact that appears to be somewhat weird at the beginning, but becomes more natural when working with it a lot: we do not really care about the probability space!

More precisely, there are many different probability spaces that can support the same two random variables $X$ and $Y$, and it does not really matter which one you pick, as long as it is "well-behaved" in some sense. You may think of this as randomness being something that you cannot grasp. The only thing you can actually observe (in the real world) is how this randomness is transported by some random variable. For example, if you toss a die, you cannot tell me where the randomness is, but you can observe how this randomness has turned into a number shown by the die. In this case, the probability space is some abstract object which might be more interesting for philosophers, while mathematicians are mainly interested in how the die (that is, the random variable) transports this randomness.

I will now comment more precisely on the question that you asked. Even though the probability space in itself is unimportant, this is not true for the $\sigma$-algebras that are induced by random variables! When doing probability theory, one should think of a $\sigma$-algebra as encoding the information that one obtains from an object. Let us come back to the example of the die: the $\sigma$-algebra generated by the die tells you which (abstract) outcomes in the probability space could have produced the result that you have observed. Now, if you observe two independent dice, then the outcome of one does not provide you with any information about the outcome of the other. In this sense, the two generated $\sigma$-algebras are not comparable: they still interact through the probability measure (via independence), but as collections of sets neither one contains the other in general.
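On a finite sample space one can actually enumerate a generated $\sigma$-algebra by hand: it consists of all unions of the preimage atoms of the random variable. Here is a small Python sketch (the helper `sigma_algebra` and all names are my own, purely for illustration) that does exactly this and confirms that the $\sigma$-algebras generated by two independent dice are not comparable as collections of sets:

```python
from itertools import combinations

def sigma_algebra(omega, rv):
    """Sigma-algebra on a finite sample space `omega` generated by the random
    variable `rv`: the collection of all unions of the preimage atoms rv^{-1}(v)."""
    atoms = {}
    for w in omega:
        atoms.setdefault(rv(w), set()).add(w)   # group outcomes by the value of rv
    atoms = list(atoms.values())
    events = set()
    for k in range(len(atoms) + 1):             # every union of atoms is an event
        for combo in combinations(atoms, k):
            events.add(frozenset().union(*combo))
    return events

# Two independent dice: Omega = {1,...,6} x {1,...,6}
omega = [(i, j) for i in range(1, 7) for j in range(1, 7)]
D1 = lambda w: w[0]   # first die
D2 = lambda w: w[1]   # second die

sig1, sig2 = sigma_algebra(omega, D1), sigma_algebra(omega, D2)
print(sig1 <= sig2, sig2 <= sig1)   # False False: neither contains the other
```

(Of course this brute-force enumeration only works on finite spaces; it is meant as a sanity check, not as a general construction.)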

Let us consider a more "dependent" situation: we have a die whose faces are coloured blue for even numbers and red for odd numbers. In particular, if I know the number $N$ shown by the die, I also know its colour $C$. The converse is not true, even though the colour gives me a bit of information. This means that $N$ contains strictly more information than $C$. As you pointed out correctly, this implies that $\sigma(N)$ must be larger than $\sigma(C)$. This can be seen very easily if we write $C = f(N)$ with $f(n) := \text{blue}\cdot\mathbf{1}_{\{n \text{ is even}\}} + \text{red}\cdot\mathbf{1}_{\{n \text{ is odd}\}}$. In particular, if we take any set $A\subseteq \{\text{red}, \text{blue}\}$, then $$ C^{-1}(A) = N^{-1}(f^{-1}(A)), $$ which lies in the $\sigma$-algebra generated by $N$.
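Reusing the `sigma_algebra` helper from the sketch above, the containment $\sigma(C)\subseteq\sigma(N)$ (and the fact that it is strict) can be checked directly for the coloured die:

```python
omega = range(1, 7)                                  # faces of the die
N = lambda w: w                                      # number shown
C = lambda w: "blue" if w % 2 == 0 else "red"        # colour of the face, C = f(N)

sig_N, sig_C = sigma_algebra(omega, N), sigma_algebra(omega, C)
print(len(sig_N), len(sig_C))   # 64 and 4 events
print(sig_C <= sig_N)           # True:  sigma(C) is contained in sigma(N)
print(sig_N <= sig_C)           # False: the containment is strict
```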

This observation is true in general: if $Y$ is a deterministic (measurable) function of $X$, then $\sigma(Y)\subseteq \sigma(X)$. One can even show the converse (this is known as the Doob–Dynkin lemma). If this is not clear to you, I strongly advise you to try proving it! In your setting, we are in a slightly different situation, as you introduce an error $\epsilon$. This notation is actually not fully specified and will mean different things in different situations. Most often, however, one takes $\epsilon\sim \mathcal{N}(0,\sigma^2)$ for some variance $\sigma^2$, with $\epsilon$ independent of $X$. This last bit is very important and changes the question dramatically: $Y$ is no longer a deterministic function of $X$, and the two $\sigma$-algebras can in general no longer be compared. This is because $\epsilon$ is also a random variable and has its own $\sigma$-algebra associated to it. The only thing that we may say now is that $Y$ is a deterministic measurable function of both $X$ and $\epsilon$, so that $$ \sigma(Y) \subseteq \sigma(X, \epsilon), $$ where $\sigma(X,\epsilon)$ is the smallest $\sigma$-algebra that contains both $\sigma(X)$ and $\sigma(\epsilon)$.
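Purely as an illustration on a toy finite space (taking $f$ to be the identity and letting $\epsilon$ take only the values $\pm 1$; the names are mine), the same helper shows that $\sigma(Y)$ is not contained in $\sigma(X)$, but is contained in $\sigma(X,\epsilon)$:

```python
# Toy model: X takes values 0 or 2, epsilon takes values -1 or +1, Y = X + epsilon.
omega = [(0, -1), (0, 1), (2, -1), (2, 1)]           # outcomes w = (x, e)
X   = lambda w: w[0]
eps = lambda w: w[1]
Y   = lambda w: X(w) + eps(w)

sig_X    = sigma_algebra(omega, X)
sig_Xeps = sigma_algebra(omega, lambda w: (X(w), eps(w)))   # sigma(X, epsilon)
sig_Y    = sigma_algebra(omega, Y)

print(sig_Y <= sig_X)      # False: Y is not a deterministic function of X alone
print(sig_Y <= sig_Xeps)   # True:  Y is a deterministic function of the pair (X, epsilon)
```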

Now, concerning your last question about the link with Borel $\sigma$-algebras. Usually, probability theory is done on measurable spaces $(\Omega, \mathcal{F})$ where the $\sigma$-algebra $\mathcal{F}$ is the Borel $\sigma$-algebra induced by some (fixed and chosen) topology [and one had better make sure that the topology is Polish, but that is a different story...]. There seems to be a bit of confusion in your question, as you state $\Omega \in \mathbb R^n$. However, $\Omega$ is the universe, i.e. the set containing all possible (abstract) outcomes. As such, it would be more appropriate to write $\Omega\subseteq \mathbb R^n$.

Next: even though $\mathcal{F}$ is the Borel $\sigma$-algebra, that does not mean that $\sigma(X)$ or $\sigma(Y)$ is the Borel $\sigma$-algebra. Consider for example $\Omega = \mathbb R$ and the (very simple) random variable $X: \Omega\rightarrow \mathbb R$ assigning to every $\omega$ the constant value $0$. (In particular, $X$ is deterministic and arguably a very boring random variable...) Then $$ \sigma(X) = \{ \emptyset, \Omega\} $$ is the trivial $\sigma$-algebra, which is clearly different from the Borel $\sigma$-algebra. (Again: if this is not clear, try to prove it!) You may also check that this argument does not depend on the (abstract) measurable space $\Omega$, which ties back to the beginning.
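With the same helper, on a finite stand-in for the abstract universe, the constant random variable indeed generates only the trivial $\sigma$-algebra:

```python
omega = range(1, 7)        # any finite stand-in for the abstract universe Omega
X = lambda w: 0            # the constant random variable X = 0

trivial = {frozenset(), frozenset(omega)}
print(sigma_algebra(omega, X) == trivial)   # True: sigma(X) = {emptyset, Omega}
```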