[Math] How can it be meaningful to add a discrete random variable to a continuous random variable when they are functions on different sample spaces

probability, probability theory

It seems that one usually uses the set of all possible values of $X$ as the sample space of the random variable $X$. (Therefore discrete random variables have countable sample spaces, and continuous random variables have uncountable ones.) However, I don't think this is right. Since random variables are defined as measurable functions on a sample space, the above convention would make the sum or product of a discrete random variable and a continuous random variable meaningless (because they are functions on different spaces).
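To make my worry concrete: say $X$ is a fair coin, so under this convention it lives on the sample space $\{0,1\}$ with $P(\{0\})=P(\{1\})=\tfrac12$ and $X(\omega)=\omega$, while $Y$ is standard normal, living on $\mathbb R$ with the Gaussian measure and $Y(\omega)=\omega$. Then the pointwise formula
$$(X+Y)(\omega)=X(\omega)+Y(\omega)$$
does not define a measurable function on any one probability space: $X$ and $Y$ are defined on different spaces carrying different measures, and there is no single measure against which $X+Y$ could be integrated.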

Well, this post says that

Every uncountable standard Borel space is isomorphic to [0,1] with the Borel σ-algebra. Moreover, every non-atomic probability measure on a standard Borel space is equivalent to Lebesgue-measure on [0,1].
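If I understand that claim correctly, a concrete instance is the probability integral transform: if a random variable $X$ has a continuous cdf $F$, then $F(X)$ is uniformly distributed on $[0,1]$; equivalently, $F$ pushes the law of $X$ forward to Lebesgue measure on $[0,1]$.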

However, this doesn't solve all the problems.

First, that claim holds only for uncountable sample spaces, so the problem of adding a discrete RV and a continuous RV is still unresolved.

Second, even with two continuous RVs, there are still problems once we consider expectations (integration). The quoted claim says that "every non-atomic probability measure on a standard Borel space is equivalent to Lebesgue-measure on [0,1]", but two different measures cannot in general be carried to Lebesgue measure on $[0,1]$ by the same isomorphism. So two RVs $X$ and $Y$ may correspond to different measures on $[0,1]$, which seems to make $\mathbb{E}XY$ meaningless.
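As I understand it, the expectation of a product is the integral
$$\mathbb{E}[XY]=\int_\Omega X(\omega)\,Y(\omega)\,P(d\omega),$$
so it only makes sense when $X$ and $Y$ are measurable functions on one and the same probability space $(\Omega,\mathcal F,P)$; two separate measures, one for $X$ and one for $Y$, are not enough, because they do not determine the joint behaviour of the pair.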

This really confuses me. Can people really define all the different random variables on one and the same sample space, with one and the same measure?

Best Answer

It seems that one usually uses the set of all possible values of $X$ as the sample space of the random variable $X$.

Well, your post is an excellent example of why one should not. Actually, the reason why some introductory probability courses in some countries focus so much on the sample space and (even worse) why they recommend taking as sample space the image set of $X$ is a real mystery, since:

  • this option is ludicrous in itself,
  • it leads to confusion for anybody actually thinking about the problem,
  • it is completely abandoned (rightly so) in later, more advanced courses (not to mention that probabilists themselves reject it).

The Terence Tao quotation in Qiaochu Yuan's answer (and t.b.'s very short comment on his own answer) on the page you link to both say it well, so I will be brief.

The take-home message is that, in most situations, there exists some probability space $(\Omega,\mathcal F,P)$ on which the whole family of random variables one is interested in can be defined simultaneously, as long as their joint distributions are compatible. Kolmogorov's consistency theorem gives you that, in a much wider setting than you would actually care for, and this is it. The nature of $(\Omega,\mathcal F,P)$ is left unspecified (although one knows that $[0,1]$ with its Borel sigma-algebra would fit an awful lot of situations) and, more importantly, it is irrelevant. All that counts is that some such probability space $(\Omega,\mathcal F,P)$ exists. Once one knows it does, one can turn to the actual probability questions one wants to solve.
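To address the specific example in your question (a minimal sketch, one standard construction among many): take $\Omega=[0,1]^2$ with the Borel sigma-algebra and Lebesgue measure $P$, and define
$$X(\omega_1,\omega_2)=\mathbf 1_{\{\omega_1\le 1/2\}},\qquad Y(\omega_1,\omega_2)=\omega_2.$$
Then $X$ is Bernoulli$(\tfrac12)$ (discrete), $Y$ is uniform on $[0,1]$ (continuous), they are independent, and $X+Y$, $XY$ and $\mathbb E[XY]=\mathbb E[X]\,\mathbb E[Y]=\tfrac14$ are all perfectly well defined, because everything lives on the single space $(\Omega,\mathcal F,P)$. Dependent pairs are handled the same way, by taking $P$ to be the desired joint law instead of a product measure.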
