Asymptotic behavior of a uniform mixture distribution

probability-distributions, probability-limit-theorems, uniform-distribution

Let $X = \{x_1 = -\alpha, x_2, \ldots, x_n = \alpha\}$ be a set with $x_{i+1} = x_i + \beta$ for some $\alpha, \beta > 0$, i.e., an evenly spaced grid on $[-\alpha, \alpha]$.

Let $Y$ be a random variable drawn from the mixture distribution
$$Y \sim \sum_{i=1}^{n-1} p_i\, \mathbb{U}[x_i, x_{i+1}],$$

where $\mathbb{U}[x_i, x_{i+1}]$ denotes a uniform random variable on the interval $[x_i, x_{i+1}]$.
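
For concreteness, here is a minimal sampling sketch in Python (assuming NumPy; the helper name `sample_mixture` is mine, not from the question): pick an interval with probability $p_i$, then draw uniformly inside it.

```python
import numpy as np

def sample_mixture(grid, weights, size, rng=None):
    """Draw samples from sum_i p_i * U[x_i, x_{i+1}].

    grid    : points x_1 < ... < x_n (so there are n-1 intervals)
    weights : probabilities p_1, ..., p_{n-1}, summing to 1
    """
    grid = np.asarray(grid)
    rng = np.random.default_rng(rng)
    # Pick which interval each sample lands in, then draw uniformly inside it.
    idx = rng.choice(len(weights), size=size, p=weights)
    return rng.uniform(grid[idx], grid[idx + 1])
```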

Let's pick a distribution, e.g., the standard Gaussian, and let $CDF(x)$ denote the value of its cumulative distribution function at $x$.

My question is the following: give the weights as $p_i = CDF(x_{i+1}) - CDF(x_i)$, i.e., the weight on $\mathbb{U}[x_i, x_{i+1}]$ is the probability mass the Gaussian assigns to the interval $[x_i, x_{i+1}]$. (Strictly speaking, these weights sum to 1 only in the limit $\alpha \rightarrow \infty$; for finite $\alpha$ they can be renormalized.) Does the distribution of $Y$ then converge to the Gaussian distribution (more generally, to the distribution whose CDF was used) as $\alpha \rightarrow \infty$ and $\beta \rightarrow 0$?
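
Plugging Gaussian-CDF weights into the sampler sketched above and measuring the Kolmogorov-Smirnov distance to the standard Gaussian gives a quick numerical sense of the conjecture. A rough sketch assuming SciPy (the renormalization handles the tail mass, which only vanishes as $\alpha \rightarrow \infty$):

```python
import numpy as np
from scipy import stats

alpha, beta = 5.0, 0.01
grid = np.arange(-alpha, alpha + beta / 2, beta)  # x_1 = -alpha, ..., x_n = alpha
p = np.diff(stats.norm.cdf(grid))                 # p_i = CDF(x_{i+1}) - CDF(x_i)
p /= p.sum()                                      # renormalize away the tail mass

y = sample_mixture(grid, p, size=100_000, rng=0)  # sampler from the sketch above

# KS distance between the sample and the standard Gaussian cdf; for these
# settings it is dominated by sampling noise of order n^{-1/2}.
print(stats.kstest(y, stats.norm.cdf).statistic)
```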

My intuition says yes, but I cannot prove it.

Best Answer

This is true assuming you are free to choose $\alpha, \beta$ however you wish. Convergence in distribution of a sequence of real-valued random variables means their cdfs $F_n$ satisfy $\lim_{n\rightarrow\infty} F_n(x) = F(x)$ for each point $x \in \mathbb{R}$ at which $F$ is continuous. We can show that, for any $\varepsilon > 0$, there are $A$ and $B$ such that for all $\alpha > A$, $\beta < B$, $$\sup_{x \in \mathbb{R}} |F_{\alpha,\beta}(x) - F(x)| < \varepsilon.$$ This is enough to extract a sequence $\alpha_n \rightarrow \infty$, $\beta_n \rightarrow 0$ along which $F_{\alpha_n,\beta_n} \rightarrow F$ uniformly, hence in distribution.

This turned into quite a lengthy post, so let me just say the idea is simple: you approximate the density with piecewise constant functions, and all that matters is that the areas under the curves converge uniformly.


Let $\varepsilon > 0$ be given, and let $\Phi$ denote the cdf of a standard Gaussian. There is $A > 0$ large enough that $\Phi(-A) < \varepsilon/4$, which by symmetry also gives $\Phi(A) > 1 - \varepsilon/4$. Fix some $\alpha > A$. This cuts off the tails.
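
Numerically, such an $A$ is just a Gaussian quantile; a quick illustration with SciPy:

```python
from scipy.stats import norm

eps = 1e-3
A = -norm.ppf(eps / 4)  # chosen so that Phi(-A) = eps/4
print(A)                # about 3.48; any alpha > A cuts off the tails
print(norm.cdf(-A))     # equals eps/4 = 2.5e-4 up to floating point
```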

Re-index the grid as $x_i = -\alpha + i\beta$ for $i = 0, \ldots, n$ with $n = 2\alpha/\beta \in \mathbb{Z}$ (so $x_0 = -\alpha$ and $x_n = \alpha$); there are $n$ intervals $I_i = [x_i, x_{i+1})$ that cover $[-\alpha, \alpha)$. Assuming $p_i = \Phi(x_{i+1}) - \Phi(x_i)$, the total probability mass allocated is $1 - 2\Phi(-\alpha)$; the remaining mass can be assigned anywhere outside of $[-\alpha,\alpha)$; say it is assigned to $x > \alpha$. I'll ignore any technicalities with the right end-point (it has probability 0).
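
As a sanity check of the mass accounting (same NumPy/SciPy assumptions as above):

```python
import numpy as np
from scipy.stats import norm

alpha, beta = 4.0, 0.01
grid = np.arange(-alpha, alpha + beta / 2, beta)  # x_0 = -alpha, ..., x_n = alpha
p = np.diff(norm.cdf(grid))                       # p_i for the n intervals I_i

# Mass placed on [-alpha, alpha) versus the claimed 1 - 2*Phi(-alpha):
print(p.sum(), 1 - 2 * norm.cdf(-alpha))          # agree to floating-point error
```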

Define a "locator" map $\ell : [-\alpha, \alpha) \rightarrow \{0, \ldots, n-1\}$ which associates to any $x$ the unique index $i$ with $x \in I_i$ (so in particular $\ell(x_i) = i$). Remembering that the density of the $i^{th}$ uniform random variable is $(1/\beta)1_{I_i}$, the cdf $F_{\alpha, \beta}$ satisfies $$F_{\alpha, \beta}(x) = p_{\ell(x)}\frac{x - x_{\ell(x)}}{\beta} + F_{\alpha,\beta}(x_{\ell(x)}),$$ and note that the approximate cdf agrees with $\Phi$ at the discretization points $x_i$ up to a shift by $\Phi(-\alpha)$: $$F_{\alpha,\beta}(x_i) = \sum_{i'=0}^{i-1} p_{i'} = \sum_{i'=0}^{i-1} \big(\Phi(x_{i'+1}) - \Phi(x_{i'})\big) = \Phi(x_{i}) - \Phi(-\alpha).$$ Thus, for any $x \in [-\alpha, \alpha)$, \begin{align*} F_{\alpha,\beta}(x) - \Phi(x) &= p_{\ell(x)}(x - x_{\ell(x)})/\beta + F_{\alpha,\beta}(x_{\ell(x)}) - \Phi(x) \\ &= p_{\ell(x)}(x - x_{\ell(x)})/\beta + \Phi(x_{\ell(x)}) - \Phi(-\alpha) - \big[(\Phi(x) - \Phi(x_{\ell(x)})) + \Phi(x_{\ell(x)})\big]\\ &= \big[p_{\ell(x)}(x - x_{\ell(x)})/\beta - (\Phi(x) - \Phi(x_{\ell(x)}))\big] - \Phi(-\alpha).\tag{1} \end{align*} Writing $a = x_{\ell(x)}$ and $b = x_{\ell(x)+1} = a + \beta$, the bracketed term in the last equality above is $$\big(\Phi(b) - \Phi(a)\big)\frac{x - a}{\beta} - \big(\Phi(x) - \Phi(a)\big),$$ which, if you squint, is the fundamental theorem of calculus: $$\Phi'(a)(x-a) \approx \frac{\Phi(b) - \Phi(a)}{\beta}(x - a) \approx \Phi(x) - \Phi(a).$$ I leave it to the reader to justify, using compactness of $[-\alpha,\alpha]$ and the continuity of $\Phi'$ (hence its uniform continuity there), that one can find $B > 0$ such that any $\beta < B$ makes the bracketed term as small as desired, in particular less than $\varepsilon/2$.
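
The locator map and the piecewise-linear formula for $F_{\alpha,\beta}$ translate directly into code; the following sketch (the name `F_mix` and the NumPy/SciPy usage are mine) also confirms the agreement with $\Phi$ at the grid points up to the shift $\Phi(-\alpha)$:

```python
import numpy as np
from scipy.stats import norm

def F_mix(x, alpha, beta):
    """Exact cdf of the mixture on [-alpha, alpha), following the text:
    F(x) = p_l * (x - x_l)/beta + (Phi(x_l) - Phi(-alpha)), l = locator(x)."""
    grid = np.arange(-alpha, alpha + beta / 2, beta)
    x = np.asarray(x, dtype=float)
    # Locator map: index l of the interval I_l containing x.
    l = np.clip(np.searchsorted(grid, x, side="right") - 1, 0, len(grid) - 2)
    p_l = norm.cdf(grid[l + 1]) - norm.cdf(grid[l])
    return p_l * (x - grid[l]) / beta + (norm.cdf(grid[l]) - norm.cdf(-alpha))

alpha, beta = 4.0, 0.01
xs = np.arange(-alpha, alpha, beta)  # the discretization points x_i
# F agrees with Phi at the x_i up to the shift Phi(-alpha):
print(np.max(np.abs(F_mix(xs, alpha, beta) - (norm.cdf(xs) - norm.cdf(-alpha)))))
```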

Going back to $(1)$, we find that for $\alpha > A$, $\beta < B$, and $x \in [-\alpha, \alpha)$, $$|F_{\alpha,\beta}(x) - \Phi(x)| < \varepsilon/2 + \varepsilon/4.$$ For the remaining $x$, we've misplaced at most $2\Phi(-\alpha)$ of mass, which is bounded by $\varepsilon/2$. Thus, $$\sup_{x \in \mathbb{R}} |F_{\alpha,\beta}(x) - \Phi(x)| < \varepsilon,$$ which establishes the desired convergence.
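
The uniform bound can also be eyeballed numerically by reusing the hypothetical `F_mix` above: approximate the sup over $[-\alpha, \alpha)$ on a fine evaluation grid, and bound the error outside that interval by $2\Phi(-\alpha)$.

```python
# Approximate sup_x |F_{alpha,beta}(x) - Phi(x)| as alpha grows and beta shrinks.
for alpha, beta in [(2.0, 0.5), (4.0, 0.1), (6.0, 0.01)]:
    xs = np.linspace(-alpha, alpha - 1e-9, 200_001)
    d = np.abs(F_mix(xs, alpha, beta) - norm.cdf(xs)).max()
    d = max(d, 2 * norm.cdf(-alpha))  # error bound outside [-alpha, alpha)
    print(f"alpha={alpha}, beta={beta}: sup distance ~ {d:.2e}")  # decreasing
```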
