Central Limit Theorem for Bounded Difference Functions

pr.probability

Let $X_1, \ldots, X_n$ be independent and identically distributed random variables. Let $f:\mathbb{R}^n \to \mathbb{R}$ be a bounded difference function, i.e., for any $x,y \in \mathbb{R}^n$ that differ only in coordinate $i$, we have $|f(x) – f(y)| \leq c_i$. Under this setting, McDiarmid’s inequality provides a well-known concentration result.

Under the same setting, do we also have a central limit theorem for an appropriately scaled version of $f(X_1, \ldots, X_n) – E[f(X_1, \ldots, X_n)]$ ? I am ok with making additional assumptions on $(X_i)_i$ sequence.

Best Answer

In the general setting you propose, the answer is negative. Suppose that $X_i$ take the values $\pm 1$ with equal probability. Let $h: {\mathbb R} \to {\mathbb R}$ be a Lipschitz function with constant 1, e.g., $$ (*) \quad h(x)=\max\{x,0\} \, ,$$ and define $$f_n(x_1,\dots,x_n):=\sqrt{n} \cdot h \left(\frac{x_1+\dots +x_n}{\sqrt{n}}\right) \,.$$ (In the special case (*), one can skip dividing and multiplying by $\sqrt{n}$.) Then $f_n$ satisfy the bounded difference condition with constants $c_i=2$ on the support of the distribution.

(If you want the bounded difference condition as stated to hold everywhere, consider the functions $\widetilde{f}_n(x_1,\dots,x_n):=f_n({\rm sign}(x_1),\dots,{\rm sign}(x_n))$ instead.)

Let $Z$ be a standard normal variable. Then we have the convergence in law $$\frac{f(X_1, \ldots, X_n) - E[f(X_1, \ldots, X_n)]}{\sqrt{{\rm Var} [f(X_1, \ldots, X_n)]}} \Rightarrow \frac{h(Z)-E[h(Z)]}{\sqrt{{\rm Var}[h(Z)] }}\,. $$ The distribution of the right-hand side is not Gaussian for most functions $h$. In the special case (*), this distribution will have a density which is the right half of the Gaussian density, shifted and scaled to have mean zero and variance 1.

Postscript: Perhaps you can post another question with the particular bounded difference functions $f_n$ you have in mind. One proof of Mcdiarmid's inequality uses Martingales, so perhaps in your application a Martingale CLT will be useful. See https://en.wikipedia.org/wiki/Martingale_central_limit_theorem and the references therein.