I am not entirely convinced by the line "the sample space is also called the support of a random variable".
That looks quite wrong to me.
What is even more confusing is, when we talk about support, do we mean that of $X$ or that of the distribution function $Pr$?
In rather informal terms, the "support" of a random variable $X$ is defined as the support (in the function sense) of the density function $f_X(x)$.
I say "in rather informal terms" because the density function is an intuitive and practical concept for dealing with probabilities, but not so much when speaking of probability in general and formal terms. For one thing, it's not a proper function for "discrete distributions" (again, a practical but loose concept).
In more formal/strict terms, the comment of Stefan fits the bill.
Do we interpret the support to be
- the set of outcomes in Ω which have a non-zero probability,
- the set of values that X can take with non-zero probability?
Neither, actually. Consider a random variable that has a uniform density in $[0,1]$, with $\Omega = \mathbb{R}$.
Then the support is the full interval $[0,1]$, which is a subset of $\Omega$. In particular, $x=1/2$ belongs to the support, yet the probability that $X$ takes this value is zero.
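To make this concrete, here is a minimal sketch, assuming the usual definition of the support of a distribution as the set of points all of whose neighbourhoods have positive probability. The helper names `uniform_cdf` and `interval_prob` are made up for illustration.

```python
def uniform_cdf(x: float) -> float:
    """CDF of the uniform distribution on [0, 1]."""
    return min(1.0, max(0.0, x))


def interval_prob(x: float, eps: float) -> float:
    """P(x - eps < X < x + eps) for X uniform on [0, 1]."""
    return uniform_cdf(x + eps) - uniform_cdf(x - eps)


# x = 0.5 is in the support: every neighbourhood has positive probability,
# even though the probability shrinks to 0 as the neighbourhood shrinks.
# x = 1.5 is not: small neighbourhoods around it have probability exactly 0.
for eps in (0.1, 0.01, 0.001):
    print(eps, interval_prob(0.5, eps), interval_prob(1.5, eps))
```

The point $x=1/2$ itself corresponds to `eps = 0`, which gives probability zero, so "belongs to the support" and "is taken with non-zero probability" are different conditions.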
It may be pertinent to frame the answer defining a random variable as a measurable function $X$ from the sample space of a probability space $(\Omega, \mathcal F, \mathbb P)$ to the space $(\mathbb R, \mathcal B)$ of the real numbers endowed with the Borel sigma algebra (i.e. the smallest sigma algebra which contains the open sets of $\mathbb R$), such that
$$X^{-1}(A)=\{\omega \in \Omega \mid X(\omega)\in A\}\in \mathcal F \;\; \forall A \in \mathcal B.$$
Constructing the sigma-field (or sigma-algebra) generated by two random variables will bring home the idea as follows:
The random experiment is tossing a fair die once, which has as possible outcomes or sample space $\Omega=\{1,2,3,4,5,6\}.$
It is up to the definition of the probability space to determine which events will have assigned probabilities. We could consider the events "even" and "odd" and contrast them with the events "prime" and "not prime". Thus we define two random variables: $X: \Omega \to \{0,1\}$ maps the outcome to $1$ if the die lands on an odd number, and otherwise to $0.$ On the other hand, the second random variable $Y: \Omega\to \{0,1\}$ maps to $1$ when the die lands on a prime number, and otherwise to $0.$
Now these two random variables will generate two different sigma algebras. In the first case, the pre-image of the random variable $X$ will include the set $V=\{1,3,5\}$ corresponding to the value of the random variable $1$ in the Borel set $A_1=\{1\}:$
$$V=X^{-1}(A_1)=\{\omega\in\Omega \mid X(\omega) \in A_1\}=\{1,3,5\}$$
This reads: The pre-image (not the inverse function!) of the set $A_1$ for the random variable $X$ is the set $\{\text{...}\}$ containing those outcomes ($\omega$) in the big set $\Omega$ such that (that is the little bar $\mid$) the random variable maps to elements in $A_1.$ And the fact that $\{1,3,5\}\in \mathcal F$ (see below) makes the function $X$ measurable.
Logically, there would be the set $W =\{2,4,6\}$ corresponding to the pre-image of the value $0$ in the Borel set $ A_2=\{0\},$ or
$$W=X^{-1}(A_2)=\{\omega\in\Omega \mid X(\omega) \in A_2\}=\{2,4,6\}$$
The sigma algebra will also contain complements, unions and intersections, ending up with the sigma algebra $\mathcal F=\{\{1,3,5\} , \{2,4,6\}, \emptyset,\Omega \},$ of which $\{1,3,5\}$ and $\{2,4,6\}$ are called atoms (an atom of $\mathcal F$ is a set $F \in \mathcal F$ such that the only subsets of $F$ which are also in $\mathcal F$ are the empty set $\emptyset$ and $F$ itself). And the probability space will determine (for a fair die) that $\frac12$ is assigned to each one of them by the PMF: $\mathbb P\left(\{1,3,5\}\right)= \sum_{\omega \in \{1,3,5\}} p(\omega)=3\cdot\frac 1 6=\frac12.$
Here is the construct:
All Borel sets of interest have pre-images corresponding to sets in the sigma algebra $\mathcal F.$
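The construction for $X$ can be checked mechanically. The following is a rough sketch for the finite die example only, where closing under complement and pairwise union is enough; the names `preimage`, `generated_sigma_algebra` and `prob` are illustrative helpers, not part of any library.

```python
from fractions import Fraction

OMEGA = frozenset(range(1, 7))  # outcomes of one die toss


def X(omega: int) -> int:
    """Indicator of an odd outcome."""
    return omega % 2


def preimage(f, values) -> frozenset:
    """Outcomes in OMEGA that f maps into the set `values`."""
    return frozenset(w for w in OMEGA if f(w) in values)


def generated_sigma_algebra(generators) -> set:
    """Close a family of subsets of OMEGA under complement and union."""
    sets = {frozenset(), OMEGA} | {frozenset(g) for g in generators}
    changed = True
    while changed:
        changed = False
        for a in list(sets):
            for b in list(sets):
                for new in (OMEGA - a, a | b):
                    if new not in sets:
                        sets.add(new)
                        changed = True
    return sets


def prob(event) -> Fraction:
    """Probability of an event under a fair die (uniform PMF)."""
    return Fraction(len(event), len(OMEGA))


V = preimage(X, {1})  # pre-image of the Borel set A_1 = {1}
W = preimage(X, {0})  # pre-image of the Borel set A_2 = {0}
F = generated_sigma_algebra([V, W])

print(sorted(map(sorted, F)))
print(prob(V), prob(W))
```

Intersections do not need a separate step here, since they follow from complements and unions by De Morgan's laws.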
If instead the events of interest were powers of 2 and powers of 3, the sigma algebra would include the sets $\small \left\{\emptyset, \Omega,\{1,2,4\},\{3,5,6\},\{1,3\},\{2,4,5,6\},\{1,2,3,4\},\{1,3,5,6\},\{1,2,4,5,6\},\{2,3,4,5,6\}\right\}$ (among others, since it must be closed under complements, unions and intersections). In that case the function simply mapping the face of the die to the numbers $1,\dots,6$ would not be a random variable, because the pre-image of the singleton Borel set $\small \{5\}$ is not in the sigma algebra (not measurable): $\small \text{pre-image}(\{5\})=\{5\}\notin \mathcal F$.
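As a sanity check on this example, here is a small sketch that assumes the sigma algebra meant is the smallest one containing the powers of 2, $\{1,2,4\}$, and the powers of 3, $\{1,3\}$. The helpers `generated_sigma_algebra` and `is_measurable` are hypothetical names; the measurability test relies on the space being finite, so it is enough to check the pre-image of each attained value.

```python
OMEGA = frozenset(range(1, 7))


def generated_sigma_algebra(generators) -> set:
    """Close a family of subsets of OMEGA under complement and union."""
    sets = {frozenset(), OMEGA} | {frozenset(g) for g in generators}
    changed = True
    while changed:
        changed = False
        for a in list(sets):
            for b in list(sets):
                for new in (OMEGA - a, a | b):
                    if new not in sets:
                        sets.add(new)
                        changed = True
    return sets


def is_measurable(f, sigma_algebra) -> bool:
    """Finite case: f is measurable iff each level set {f = v} is in sigma_algebra."""
    values = {f(w) for w in OMEGA}
    return all(
        frozenset(w for w in OMEGA if f(w) == v) in sigma_algebra
        for v in values
    )


F = generated_sigma_algebra([{1, 2, 4}, {1, 3}])


def face(w: int) -> int:
    """Maps the outcome to the number shown on the die."""
    return w


def is_power_of_two(w: int) -> int:
    """Indicator of the event {1, 2, 4}."""
    return int(w in {1, 2, 4})


print(len(F))
print(frozenset({5}) in F, is_measurable(face, F))
print(is_measurable(is_power_of_two, F))
```

On this reading the closure is built from the four atoms $\{1\}$, $\{3\}$, $\{2,4\}$ and $\{5,6\}$, so it has $2^4=16$ members, and $5$ can never be separated from $6$.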
Going back to the random variable $X,$ i.e. (odd, even), it will not contain information allowing separation of $1$ from $3$ for example, because they are both outcomes belonging to the same event ("odd").
The same exercise for $Y$ will result in two atomic sets $\{2,3,5\}$ and $\{1,4,6\},$ and a sigma algebra $\mathcal G=\{\{2,3,5\} , \{1,4,6\}, \emptyset,\Omega \}.$
This random variable $Y,$ in contradistinction to $X,$ contains information allowing us to separate the outcomes $1$ and $3.$
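The difference in information between $X$ and $Y$ can be seen by listing their atoms. This is only an illustrative sketch; `atoms` and `separates` are hypothetical helper names, and "separates" is taken to mean that the two outcomes land in different atoms.

```python
OMEGA = range(1, 7)


def X(w: int) -> int:
    """Indicator of an odd outcome."""
    return w % 2


def Y(w: int) -> int:
    """Indicator of a prime outcome."""
    return int(w in {2, 3, 5})


def atoms(f) -> list:
    """Level sets of f, i.e. the atoms of the sigma algebra generated by f."""
    levels = {}
    for w in OMEGA:
        levels.setdefault(f(w), set()).add(w)
    return list(levels.values())


def separates(f, a: int, b: int) -> bool:
    """True if outcomes a and b fall in different atoms of sigma(f)."""
    return f(a) != f(b)


print(atoms(X), separates(X, 1, 3))
print(atoms(Y), separates(Y, 1, 3))
```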
Best Answer
The measure-theoretic restriction is to avoid pathological situations where $X \ge c$ can't be assigned a probability because the set $\{\alpha: X(\alpha) \ge c\}$ is not measurable.
Although in principle the random variable $X$ is a function on the sample space $\Omega$, in practice probabilists rarely think of it that way; in fact, they often don't bother spelling out what the sample space is.
Your proof makes no sense because $X Y$ is not a composition; it is just the ordinary product of real functions: $(XY)(\alpha) = X(\alpha) Y(\alpha)$.
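A small sketch of the pointwise product may help. It borrows the die example purely for illustration, with $X$ the odd indicator and $Y$ the prime indicator; these choices and the helper `product` are assumptions, not part of the original question.

```python
OMEGA = range(1, 7)


def X(w: int) -> int:
    """Indicator of an odd outcome."""
    return w % 2


def Y(w: int) -> int:
    """Indicator of a prime outcome."""
    return int(w in {2, 3, 5})


def product(f, g):
    """Pointwise product: (fg)(w) = f(w) * g(w), both evaluated at the same outcome."""
    return lambda w: f(w) * g(w)


XY = product(X, Y)
print({w: XY(w) for w in OMEGA})
```

Both factors are evaluated at the same outcome, so here $XY$ is the indicator of "odd and prime"; nothing is fed from one function into the other.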