[Math] The probability that dice are rigged

binomial-distribution · dice · hypothesis-testing

Scenario

We are given a single six-sided die. We roll it $n$ times, and every roll comes up six. We do not know whether the die is fair or not.

How can we calculate the probability that the die is loaded or rigged given $n$ rolls of six?

In other words, what is the probability $q$ that the null hypothesis is false, given a series of $n$ successful Bernoulli trials?

Common fallacy

I heard a non-mathematician say that the probability of one six on a fair die is $\frac{1}{6}$, and so the probability of rolling 4 sixes in a row on a fair die is $\frac{1}{6^4} = \frac{1}{1296}$. So far, so good.

But then, he said that the probability that the die is not loaded is $\frac{1}{1296}$, and so the probability that the die is loaded is $\frac{1295}{1296}$.

This does not add up for me. By the same logic, if I roll the die once and get a six, the probability that the die is loaded would be $\frac{5}{6}$, which cannot be true. You don't call a person a cheat for rolling a single six.
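I suspect the error is in confusing $P(4\text{ sixes} \mid \text{fair})$ with $P(\text{fair} \mid 4\text{ sixes})$. To make this concrete with made-up numbers: if, before rolling, I believed there was a $1\%$ chance the die was loaded, and that a loaded die always rolls six, then Bayes' theorem would give

$$P(\text{loaded} \mid 4\text{ sixes}) = \frac{1 \cdot 0.01}{1 \cdot 0.01 + \frac{1}{1296} \cdot 0.99} \approx 0.93,$$

which is nothing like $\frac{1295}{1296}$, and which changes completely if I change the $1\%$.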

My attempt to answer

I think that to answer this question, I have to use the binomial distribution somehow, since:

  • the probability of a six, fair or not, remains constant and equal to $p$
  • I am only interested in success/failure

At this point, I get lost. The problem is that I only know the probability under the null hypothesis, $p_0 = \frac{1}{6}$; I don't know what the actual value of $p$ is. I don't know where to go from here.
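The one quantity I can compute from $p_0$ alone is how unlikely my observation would be if the die were fair. A minimal sketch of that (the function name is my own):

```python
from scipy.stats import binom

def p_value_all_sixes(n: int, p0: float = 1/6) -> float:
    """Probability of seeing n sixes in n rolls of a fair die.

    This is the one-sided p-value P(X >= n) for X ~ Binomial(n, p0);
    since every roll must be a six, it reduces to p0 ** n.
    """
    return binom.sf(n - 1, n, p0)  # survival function: P(X > n - 1) = P(X >= n)

for n in (1, 2, 4, 8):
    print(n, p_value_all_sixes(n))
```

But this only tells me about the data given the null hypothesis, not about the null hypothesis given the data.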

Am I asking the wrong question? Must I set a significance level $\alpha$? If so, what should I choose: $\alpha = 0.05$? $\alpha = 0.01$? I apologize for any incorrect terminology. I am a computer programmer, not a statistician or mathematician.

Edit: It looks like I have to specify how badly the die must be loaded before I call it unfair. Suppose I say rolling a six has to be at least $r = 10\%$ more likely than on a fair die (i.e. $p \ge p_0\cdot\left(1 + r\right) = \frac{11}{60}$) before I call it rigged?
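With that threshold I could at least compare the two simple hypotheses $p_0 = \frac{1}{6}$ and $p_1 = \frac{11}{60}$ directly: after $n$ straight sixes, the likelihood ratio would be

$$\frac{P(n\text{ sixes} \mid p_1)}{P(n\text{ sixes} \mid p_0)} = \left(\frac{11/60}{1/6}\right)^{n} = 1.1^n,$$

though I am not sure this is the right tool either.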

Best Answer

In classical statistics, there is no way to measure "the probability that the die is loaded". The significance level of a statistical test bounds the probability of a false positive. That is, you're asking: if the die were actually fair, what would be the probability of getting a result at least as extreme as this one?
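For instance, here is a sketch of that calculation (the function name is mine): after $n$ straight sixes the p-value is $(1/6)^n$, so you can find the smallest $n$ that drops below a chosen significance level:

```python
import math

def smallest_significant_n(alpha: float, p0: float = 1/6) -> int:
    """Smallest number of consecutive sixes whose p-value, p0 ** n,
    falls below the significance level alpha."""
    return math.ceil(math.log(alpha) / math.log(p0))

print(smallest_significant_n(0.05))  # 2, since (1/6)^2 ≈ 0.028 < 0.05
print(smallest_significant_n(0.01))  # 3, since (1/6)^3 ≈ 0.0046 < 0.01
```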

In Bayesian statistics, you can ask for "the probability that the die is loaded, given the result of the experiment", but you need to start with a prior: a probabilistic model that gives you, before the experiment, probabilities for the die to be fair or loaded in some particular way (or, perhaps, a probability distribution for just how loaded the die is). The choice of a prior tends to be a rather subjective thing, so you still can't really point to one number as "the probability that the die is loaded'.