[Math] statistical approach to multinomial distribution

pr.probability st.statistics

Suppose a dice with $q$ faces is rolled $N$ times, where $N$ is very big.

We define a multinomial variable $X=(X_1,\ldots,X_q)$ which counts how many times each face occurs ($X_i$ is the number of occurrences of the $i$-th face).

Suppose we don't know if the dice is fair or not, namely if the probability distribution of the outcome of the dice is uniform or not.

If we know the value of $X$, how can we use it to estimate the probability distribution of the dice?

In particular, let $\epsilon>0$ be fixed.
How can we use the value of $X$ to understand if there's a face $i$ such that $|P(\mbox{the dice's outcome is } i) -\frac{1}{q}|>\epsilon$?

Clearly, I expect to use a statistical method, hence my conclusion can be wrong, but I would like to estimate my error probability.

PS: in the case $q=2$, this can be done by defining the binomial $X_1$ which counts how many times one of the two faces occurs.
If $N$ is big, that binomial can be approximated by a Gaussian, and the Gaussian has mean $\frac{N}{2}$ if and only if the dice is fair.
Thus we can set a threshold $T>0$ and say that the dice is fair if $|X_1-\frac{N}{2}|\le T$ and unfair otherwise.
$T$ is chosen according to the smallest deviation $\epsilon$ we want to detect, and the error probability can be easily computed with the normal table.
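
To make the $q=2$ recipe concrete, here is a minimal sketch of the two error probabilities under the Gaussian approximation, assuming the test calls the dice fair when the count of one face stays within $T$ of $\frac{N}{2}$. The function names and the example values ($N=10000$, $T=98$, $\epsilon=0.02$) are illustrative assumptions, not part of the question.

```python
import math


def normal_cdf(z: float) -> float:
    """Standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def false_alarm_probability(n_rolls: int, threshold: float) -> float:
    """Approximate P(|count - N/2| > T) when the dice is fair.

    Assumption: the count is approximately Normal(N/2, N/4).
    """
    sd = math.sqrt(n_rolls) / 2.0
    return 2.0 * (1.0 - normal_cdf(threshold / sd))


def missed_detection_probability(n_rolls: int, threshold: float, epsilon: float) -> float:
    """Approximate P(|count - N/2| <= T) when the face has probability 1/2 + epsilon.

    Assumption: the count is approximately Normal(N p, N p (1 - p)).
    """
    p = 0.5 + epsilon
    mean = n_rolls * p
    sd = math.sqrt(n_rolls * p * (1.0 - p))
    upper = (n_rolls / 2.0 + threshold - mean) / sd
    lower = (n_rolls / 2.0 - threshold - mean) / sd
    return normal_cdf(upper) - normal_cdf(lower)


if __name__ == "__main__":
    # Illustrative values only.
    print(false_alarm_probability(10_000, 98))
    print(missed_detection_probability(10_000, 98, 0.02))
```

Raising $T$ lowers the first probability and raises the second, so $T$ has to be balanced against the smallest $\epsilon$ one cares about.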

Best Answer

Use concentration of measure. For two faces ($q=2$), Hoeffding's inequality is enough. For the multinomial case, see the appendix of the book on empirical processes by van der Vaart and Wellner.
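
One possible reading of this suggestion for general $q$, offered as a sketch rather than the result from the book: apply Hoeffding's inequality to each face, $P(|X_i/N - p_i| \ge t) \le 2e^{-2Nt^2}$, and take a union bound over the $q$ faces. The decision rule below (flag the dice as unfair when some empirical frequency is more than $\epsilon/2$ away from $\frac{1}{q}$) and all function names are assumptions made for illustration.

```python
import math
from typing import Sequence


def hoeffding_union_bound(n_rolls: int, n_faces: int, t: float) -> float:
    """Upper bound on P(some face has |X_i/N - p_i| >= t).

    Hoeffding per face plus a union bound over the faces, capped at 1.
    """
    return min(1.0, 2.0 * n_faces * math.exp(-2.0 * n_rolls * t * t))


def looks_unfair(counts: Sequence[int], epsilon: float) -> bool:
    """Hypothetical rule: flag if some frequency is more than epsilon/2 from 1/q."""
    n_rolls = sum(counts)
    n_faces = len(counts)
    max_deviation = max(abs(c / n_rolls - 1.0 / n_faces) for c in counts)
    return max_deviation > epsilon / 2.0


if __name__ == "__main__":
    counts = [1600, 880, 880, 880, 880, 880]  # made-up data for a 6-faced dice
    epsilon = 0.05
    print(looks_unfair(counts, epsilon))
    # Bound on wrongly flagging a fair dice with this rule (t = epsilon / 2).
    print(hoeffding_union_bound(sum(counts), len(counts), epsilon / 2.0))
```

With this rule, a fair dice is wrongly flagged with probability at most $2q\,e^{-N\epsilon^2/2}$, and a dice with some face more than $\epsilon$ away from $\frac{1}{q}$ is missed with probability at most $2e^{-N\epsilon^2/2}$. These bounds are not tight, but they hold for every $N$ without a Gaussian approximation.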
