I've been trying to figure out nontrivial conditions for continuous functions of continuous random variables to themselves be continuous random variables without much success. Here's what I know so far:
- Continuous functions of continuous random variables are random variables, see this thread
- Continuous functions of continuous random variables needn't be continuous random variables in general, see this counterexample.
- Strictly monotonic functions of continuous random variables are continuous random variables
Are there more general conditions under which a continuous/smooth/analytic function of a continuous random variable is itself a continuous random variable?
Ultimately, what I am after is the following: if $\Omega$ is a continuous random variable with a bounded density function and $f$ is a continuous/smooth/analytic function, then what are some general conditions for the density function of $f(\Omega)$, if it exists, to be bounded?
Edit: As per @Malkin's comments, I want to clarify that by a continuous random variable, I mean a random variable that has a continuous cumulative distribution function (c.d.f.). I am also interested in the case when the c.d.f. is absolutely continuous, see the previous paragraph.
Best Answer
Claim 1:
If $f$ is any function that is constant on some interval $I$ then there exists a continuous random variable $X$ such that $f(X)$ is not a continuous random variable.
Proof:
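Suppose $f$ is constant on $I=[a,b]$ with $a < b$ and let $X \sim N(0,1)$, which is a continuous random variable. Put $$\varepsilon :=P(X \in I)>0$$ $$Y:=f(X)$$ $$F_Y(x)=P(Y \leq x)$$ $$x_0:=f(a)=f(b)$$ Here $\varepsilon>0$ because the standard normal density is strictly positive on $I$. Then $\forall \, \delta>0$ we have: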
$$ \begin{align} \vert F_Y(x_0)-F_Y(x_0-\delta) \vert &= P \left( Y \in (x_0-\delta, x_0]\right) \\ &= P \left( f(X) \in (x_0-\delta,x_0] \right) \\ &\geq P \left( f(X) =x_0 \right) \\ &\geq P \left( X \in I \right) \\ &= \varepsilon \end{align} $$
Hence $F_Y$ is not continuous at $x_0$ and $f(X)$ is not a continuous random variable.
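As a concrete check of Claim 1, here is a minimal sketch with one hypothetical choice of $f$ (not taken from the question): a continuous piecewise linear function that equals $0$ on $I=[0,1]$. Because this particular $f$ is nondecreasing, $F_Y$ can be written in closed form using the standard normal CDF, and the jump at $x_0=0$ can be compared with $\varepsilon = P(X \in I)$. The names `phi`, `f`, `cdf_Y` and `jump_at_zero` are illustrative only.

```python
import math


def phi(x):
    """Standard normal CDF, computed from the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def f(x):
    """Hypothetical example: continuous, nondecreasing, constant (= 0) on I = [0, 1]."""
    if x < 0.0:
        return x
    if x <= 1.0:
        return 0.0
    return x - 1.0


def cdf_Y(y):
    """CDF of Y = f(X) for X ~ N(0, 1).

    For this f, {f(X) <= y} equals {X <= y} when y < 0
    and {X <= y + 1} when y >= 0.
    """
    if y < 0.0:
        return phi(y)
    return phi(y + 1.0)


def jump_at_zero(delta):
    """F_Y(x0) - F_Y(x0 - delta) at x0 = 0."""
    return cdf_Y(0.0) - cdf_Y(-delta)


# epsilon = P(X in I) as in the proof above
epsilon = phi(1.0) - phi(0.0)

# The increment never drops below epsilon, however small delta is
jumps = [jump_at_zero(10.0 ** -k) for k in range(1, 8)]
for k, j in enumerate(jumps, start=1):
    print(f"delta = 1e-{k}: F_Y(0) - F_Y(-delta) = {j:.7f}  (epsilon = {epsilon:.7f})")
```

The printed increments stay at or above $\varepsilon$ as $\delta \to 0$, which is exactly the discontinuity of $F_Y$ at $x_0$ used in the proof.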
Claim 2:
If $f$ is any real analytic function that is not constant on any interval $I \subset \mathbb{R}$ then $f(X)$ is a continuous random variable for any continuous random variable $X$.
Proof:
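Let $X$ be a continuous random variable with CDF $F_X$ and let $U\subset \mathbb{R}$ be the range of $f$. Define $Y := f(X)$ and let $F_Y$ be the CDF of $Y$, which is defined on all of $\mathbb{R}$. We will show that $F_Y$ is continuous. A CDF is continuous at $x_0$ exactly when $P(Y=x_0)=0$, which holds trivially for $x_0 \notin U$, so it suffices to consider $x_0 \in U$.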
Let $\varepsilon>0$ and $x_0 \in U$.
By simple properties of random variables, $P(\vert X \vert > M) \rightarrow 0$ as $M \rightarrow \infty$. Pick $M$ such that $P(\vert X \vert > M) < \frac{\varepsilon}{2}$.
Now consider $S=f^{-1}(\{x_0\})$. Because $f$ is analytic and not constant on any interval, $f-x_0$ is not identically zero, so its zeros are isolated and $S$ consists of at most countably many points: $S=\{s_i\}_{i \in J}$ for some $J \subset \mathbb{N}$.
Define $S':=S \cap [-M,M]$. Suppose $S'$ contains infinitely many points. Then, since $S'$ is bounded, by the Bolzano-Weierstrass theorem there exists a strictly monotonic subsequence $(s_{i_n})_{n \in \mathbb{N}}$ such that $s_{i_n} \rightarrow c$ for some $c$, and $c \in S'$ because $S'$ is closed ($f$ is continuous). Since $f(s_{i_n})=x_0 \, \forall \, n$, by Rolle's Theorem we have a sequence $(r_n)_{n \in \mathbb{N}}$ with $r_n$ strictly between $s_{i_n}$ and $s_{i_{n+1}}$ and $f'(r_n)=0 \, \forall \, n$. Also $s_{i_n} \rightarrow c \implies r_n \rightarrow c$. But by this reasoning, such a sequence $(r_n)$ cannot exist for an analytic function $f$ that is not constant on any interval: the zeros of the analytic function $f'$ would accumulate at $c$, forcing $f' \equiv 0$ by the identity theorem and hence $f$ constant. And so $S'$ must only contain finitely many points. Re-label them $S'=\{s'_i\}_{i=1}^N$.
Since $F_X$ is continuous, for each $s'_i$ there exists $\delta_i>0$ such that $\vert F_X(x)-F_X(y) \vert < \frac{\varepsilon}{2N}$ for all $x,y \in (s'_i-\delta_i, s'_i+ \delta_i)$.
Consider $f'(s'_i)$. Suppose $f'(s'_i)=0$. Since $f$ is not constant on any interval and since $f'$ is differentiable, $\exists \, \gamma_i>0$ s.t. $f$ is monotonic on $(s'_i,s'_i+\gamma_i)$ and monotonic on $(s'_i-\gamma_i,s'_i)$. If instead $f'(s'_i) \neq 0$ then again $\exists \, \gamma_i>0$ s.t. $f$ is monotonic on $(s'_i,s'_i+\gamma_i)$ and monotonic on $(s'_i-\gamma_i,s'_i)$. (See the answer here for a justification.)
Define $k:=\frac{1}{2}\min\{\delta_i,\gamma_i \}_i$ and $t:=\frac{1}{2} \min\{\vert f(s'_i+k)-f(s'_i)\vert ,\vert f(s'_i-k)-f(s'_i)\vert \}_i$.
Constructing $k$ and $t$ in this way gives us that $(s'_i-k, s'_i+ k) \subset (s'_i-\delta_i, s'_i+ \delta_i) \, \forall \, i$ ; that $f$ is monotonic on $(s'_i-k, s'_i) \, \forall \, i$ and separately on $(s'_i, s'_i+ k) \, \forall \, i $ ; and then that $(x_0-t,x_0+t]=(f(s'_i)-t,f(s'_i)+t] \subset f((s'_i-k,s'_i+k])\, \forall \, i$. These facts will be used in the working below.
Let $x \in (x_0-t,x_0 + t)$. Then:
$$ \begin{align} \vert F_Y(x)-F_Y(x_0) \vert &\leq P \left( Y \in (x_0-t,x_0+t] \right) \\ &= P \left( f(X) \in (x_0-t,x_0+t] \right) \\ &= P \left( X \in f^{-1}((x_0-t,x_0+t]) \right) \\ &\leq P \left( X \in f^{-1}((x_0-t,x_0+t]) \cap [-M,M] \right) + P(\vert X \vert > M) \\ &< P \left( X \in f^{-1}((x_0-t,x_0+t]) \cap [-M,M] \right) + \frac{\varepsilon}{2} \\ &\leq P \left( X \in \bigcup_i (s'_i-k,s'_i+k] \right) + \frac{\varepsilon}{2} \\ &\leq \sum_i \vert F_X(s'_i+k)-F_X(s'_i-k) \vert + \frac{\varepsilon}{2} \\ &\leq \sum_i \frac{\varepsilon}{2N}+ \frac{\varepsilon}{2} \\ &= \frac{\varepsilon}{2} + \frac{\varepsilon}{2} \\ &= \varepsilon \end{align} $$
Hence $F_Y$ is continuous.
We may conclude that $f(X)$ is a continuous random variable.
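To see Claim 2 at work on a non-monotonic example, the sketch below takes the hypothetical choice $f(x)=x^2$ with $X \sim N(0,1)$ (neither is prescribed by the question). This $f$ is analytic and not constant on any interval, and $F_Y(y)=\Phi(\sqrt{y})-\Phi(-\sqrt{y})$ for $y>0$, so the mass $P(Y \in (x_0-t,x_0+t])$ can be evaluated exactly. It is checked at $x_0=0$, where $f'$ vanishes on the preimage, and at $x_0=1$, whose preimage is $\{-1,1\}$. The helper names are illustrative only.

```python
import math


def phi(x):
    """Standard normal CDF, computed from the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def cdf_Y(y):
    """CDF of Y = X**2 for X ~ N(0, 1): P(-sqrt(y) <= X <= sqrt(y))."""
    if y <= 0.0:
        return 0.0
    r = math.sqrt(y)
    return phi(r) - phi(-r)


def window_mass(x0, t):
    """P(Y in (x0 - t, x0 + t]), the quantity bounded in the proof above."""
    return cdf_Y(x0 + t) - cdf_Y(x0 - t)


ts = [10.0 ** -k for k in range(1, 7)]
mass_at_0 = [window_mass(0.0, t) for t in ts]  # x0 = 0: preimage {0}, f'(0) = 0
mass_at_1 = [window_mass(1.0, t) for t in ts]  # x0 = 1: preimage {-1, 1}

for t, m0, m1 in zip(ts, mass_at_0, mass_at_1):
    print(f"t = {t:.0e}: mass near 0 = {m0:.3e}, mass near 1 = {m1:.3e}")
```

In both cases the mass shrinks to zero with $t$, so $F_Y$ has no jump at either point, consistent with the claim.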