[Math] Repeatedly rolling a die and the tails of the multinomial distribution.

distribution-tailsinequalityprobabilityprobability distributionsprobability theory

For $1\leq i\leq n$ let $X_i$ be independent random variables, and let each $X_i$ be the uniform distribution on the set ${0,1,2,\dots,m}$ so that $X_i$ is like an $m+1$ sided die. Let $$Y=\frac{1}{n}\sum_{i=1}^n \frac{1}{m} X_i,$$ so that $\mathbb{E}(Y)=\frac{1}{2}$. I am interested in the tails of this distribution, that is the size of $$\Pr\left( Y \geq k\right)$$ where $\frac{1}{2}< k\leq 1$ is a constant.

In the case where $m=1$, we are looking at the binomial distribution, and $$\Pr\left( Y \geq k\right)= \frac{1}{2^n}\sum_{i=0}^{(1-k) n} \binom{n}{i}$$ and we can bound this above by $(1-k)n \binom{n}{(1-k)n}$ and below by $\binom{n}{(1-k)n}$ which yields $$\Pr\left( Y \geq k\right)\approx \frac{1}{2^n} e^{n H(k)}$$ where $H(x)=-\left(x\log x+(1-x)\log (1-x)\right)$ is the entropy function. (I use approx liberally)

What kind of similar bounds do we have on the tails of this distribution when $m\geq 2$? I am looking to use the explicit multinomial properties to get something stronger than what you would get using Chernoff of Hoeffding.

Best Answer

Consider some i.i.d. random variables $\xi$ and $(\xi_n)_{n\geqslant1}$ with mean $\mathrm E(\xi)=0$ and finite exponential moment $\mathrm E(\mathrm e^{t|\xi|})$ for every $t$. Then, for every $x\gt0$, there exists $I(x)\gt0$ such that, when $n\to\infty$, $$ \mathrm P(\xi_1+\cdots+\xi_n\geqslant nx)=\mathrm e^{-nI(x)+o(n)}. $$ Furthermore, optimizing Chernoff exponential upper bound yields the exact value of the exponent $I(x)$. Namely, recall that, for every nonnegative $t$, $$ \mathrm P(\xi_1+\cdots+\xi_n\geqslant nx)\leqslant\left(\mathrm e^{-tx}\mathrm E(\mathrm e^{t\xi})\right)^n=\mathrm e^{-nI(t,x)}, $$ where $$ I(t,x)=tx-\log\mathrm E(\mathrm e^{t\xi}), $$ and it happens that $$ I(x)=\sup\limits_{t\geqslant0}I(t,x). $$ Note that $\mathrm E(\xi)=0$ hence $\mathrm E(\mathrm e^{t\xi})=1+o(t)$ when $t\to0$ and $I(t,x)=tx+o(t)$. In particular, $I(t,x)\gt0$ for $t\gt0$ small enough, hence $I(x)\gt0$ and the upper bound above is not trivial.

As stated above, this upper bound also provides the exact behaviour, in the exponential scale, of the probability of the large deviations event considered, that is, $$ \lim\limits_{n\to\infty}\frac1n\log\mathrm P(\xi_1+\cdots+\xi_n\geqslant nx)=-I(x). $$ In your setting, one considers $\xi=\frac1mX-\frac12$ and $x=k-\frac12$ hence $0\lt x\lt\frac12$. Note finally that, in general, $I(x)=I(t_x,x)$ where $t_x$ solves the equation $\partial_tI(t,x)=0$, that is, $$ x\mathrm E(\mathrm e^{t\xi})=\mathrm E(\xi\mathrm e^{t\xi}). $$

Related Question