Sure there is.
Let's assume that we could see $k$ different types of dice with $Y=\{y_1, \dots, y_k\}$ sides. The "classical" D&D dice would yield $Y=\{4, 6, 8, 10, 12, 20, 100\}$, but anything else could be made to work as below. Let $|Y|=k$ denote the number of different possible die types.
Assume that the minimum observed total is $m$ and the maximum is $M$. Since $x$ dice with $y$ sides each always total between $x$ and $xy$, this gives us bounds on the possible combinations: we must have
$$x\leq m \quad\text{and}\quad xy\geq M.$$
For instance, if the smallest observed total is 3 and the largest is 18, then $x\leq 3$ and $xy\geq 18$.
Now, simulate a large number of rolls from each remaining candidate xdy (by the bounds above, there are at most $m\times |Y|$ candidate pairs). The number of simulations does not need to match the number of original observations. In each case, create a histogram of the simulated totals.
Finally, compare these density histograms to the density histogram of the original observations. (We work with densities rather than frequencies precisely so that the number of simulations can differ from the number of original observations.) The earth mover's distance gives you a notion of distance between histograms: the candidate whose simulated histogram has the lowest earth mover's distance to your observation histogram is the one most consistent with your observations.
Here is some R code. We simulate 1000 rolls of 3d6 and use the standard D&D dice as candidates $Y$. (For simplicity, the code just tries every $x$ from 1 up to $M$ rather than applying the bounds above.)
set.seed(1)
obs <- rowSums(matrix(ceiling(runif(3000,0,6)),nrow=1000))
candidates.y <- c(4,6,8,10,12,20,100) # standard D&D dice
breaks <- c(seq(.5,max(obs)+.5,by=1),max(obs)*max(candidates.y))
hist.obs <- hist(obs,breaks=breaks,freq=FALSE)
candidates.x <- 1:max(obs)
n.sims <- 1000
library(emdist)
emd.dist <- matrix(NA,nrow=length(candidates.x),ncol=length(candidates.y),
    dimnames=list(candidates.x,candidates.y))
for ( ii in seq_along(candidates.x) ) {
    for ( jj in seq_along(candidates.y) ) {
        set.seed(1) # for replicability
        obs.sim <- rowSums(matrix(ceiling(
            runif(candidates.x[ii]*n.sims,0,candidates.y[jj])),
            nrow=n.sims))
        hist.sim <- hist(obs.sim,breaks=breaks,plot=FALSE)
        emd.dist[ii,jj] <- emd2d(matrix(hist.obs$density,nrow=1),
            matrix(hist.sim$density,nrow=1))
    }
}
emd.dist
Output:
4 6 8 10 12 20 100
1 7.9029999 6.8909998 5.8920002 4.8950000 3.88700008 1.9228243 0.39887184
2 5.4299998 3.4460001 1.4499999 0.8147613 1.08018708 0.8099990 0.04417419
3 2.9579999 0.0000000 1.8063331 1.9216927 1.61559486 0.2684011 0.35928789
4 0.6180000 2.4049284 2.2750988 1.3528987 0.71173453 0.2177625 1.00000000
5 1.9768735 2.8317218 1.7545075 0.5033562 0.17459151 0.1226299 1.00000000
6 3.4169219 1.9913402 0.4674550 0.1918677 0.05264558 1.0000000 1.00000000
7 3.3924465 0.9393466 0.2968451 0.1572183 0.21881166 1.0000000 1.00000000
8 2.6134274 0.3687483 0.1226299 1.0000000 1.00000000 1.0000000 1.00000000
9 1.5483801 0.3894975 0.3592879 1.0000000 1.00000000 1.0000000 1.00000000
10 0.3270817 0.2188117 1.0000000 1.0000000 1.00000000 1.0000000 1.00000000
11 0.1052912 0.3592879 1.0000000 1.0000000 1.00000000 1.0000000 1.00000000
12 0.3592879 1.0000000 1.0000000 1.0000000 1.00000000 1.0000000 1.00000000
13 1.0000000 1.0000000 1.0000000 1.0000000 1.00000000 1.0000000 1.00000000
14 1.0000000 1.0000000 1.0000000 1.0000000 1.00000000 1.0000000 1.00000000
15 1.0000000 1.0000000 1.0000000 1.0000000 1.00000000 1.0000000 1.00000000
16 1.0000000 1.0000000 1.0000000 1.0000000 1.00000000 1.0000000 1.00000000
17 1.0000000 1.0000000 1.0000000 1.0000000 1.00000000 1.0000000 1.00000000
18 1.0000000 1.0000000 1.0000000 1.0000000 1.00000000 1.0000000 1.00000000
And look, the 3d6 in fact does have the simulated histogram with the lowest distance from your observations' histogram. Hurray!
A couple of comments:
- This works better if you have "many" observations. With few observations, you get a lot of noise, and many different simulated histograms are consistent with your noisy data.
- Of course, the larger your candidate set $Y$, the lower your chance will be of getting the true values.
- If the true $y$ is not in $Y$, you will at least get something "close to it".
- The formally-statistically-correct treatment would be a Bayesian one, putting equal prior probabilities on all possible $x$ and $y\in Y$, then calculating or simulating draws and deriving posterior probabilities on pairs $(x,y)$. Yes, this can be done rigorously; a rough sketch follows below. Anyone want to have a go?
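To make that last point concrete, here is a rough sketch of what such a Bayesian calculation could look like (my own illustration, using the same simulated 3d6 data as above, and not a rigorous treatment): put a flat prior on all candidate pairs $(x,y)$ satisfying the bounds, compute the exact distribution of the total of $x$ fair $y$-sided dice by convolution, and take the posterior to be proportional to the likelihood of the observed totals.

set.seed(1)
obs <- rowSums(matrix(ceiling(runif(3000,0,6)),nrow=1000)) # same 3d6 data as above
candidates.y <- c(4,6,8,10,12,20,100)
candidates.x <- 1:min(obs)  # x dice cannot total less than x

# exact pmf of the total of x fair y-sided dice, via repeated convolution;
# pmf.xdy(x,y)[k] = P(total = k + x - 1)
pmf.xdy <- function(x, y) {
    p <- rep(1/y, y)
    out <- p
    if (x > 1) for (i in 2:x) out <- convolve(out, rev(p), type="open")
    out
}

log.post <- matrix(-Inf, nrow=length(candidates.x), ncol=length(candidates.y),
    dimnames=list(candidates.x, candidates.y))
for (ii in seq_along(candidates.x)) {
    for (jj in seq_along(candidates.y)) {
        x <- candidates.x[ii]; y <- candidates.y[jj]
        if (x*y < max(obs)) next           # cannot reach the largest observed total
        pr <- pmf.xdy(x, y)[obs - x + 1]
        log.post[ii, jj] <- sum(log(pr))   # flat prior: posterior proportional to likelihood
    }
}
post <- exp(log.post - max(log.post))
post/sum(post)  # normalised posterior over (x, y); should concentrate on x=3, y=6 here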
TL;DR: if $p = 1/6$ and you want to know how large $n$ needs to be to be 98% sure the die is fair (to within 2%), you need $n \geq 766$.
Let $n$ be the number of rolls and $X$ the number of rolls that land on some specified side. Then $X$ follows a Binomial(n,p) distribution where $p$ is the probability of getting that specified side.
By the central limit theorem, we know that
$$\sqrt{n}\,(X/n - p) \to N(0,\,p(1-p))$$
in distribution, since $X/n$ is the sample mean of $n$ Bernoulli$(p)$ random variables. Hence, for large $n$, confidence intervals for $p$ can be constructed as
$$\frac{X}{n} \pm Z \sqrt{\frac{p(1-p)}{n}}$$
Since $p$ is unknown, we can replace it with the sample average $\hat{p} = X/n$, and by various convergence theorems, we know the resulting confidence interval will be asymptotically valid. So we get confidence intervals of the form
$$\hat{p} \pm Z \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$
with $\hat{p} = X/n$. I'm going to assume you know what $Z$-scores are. For example, if you want a 95% confidence interval, you take $Z=1.96$. So for a given confidence level $\alpha$ we have
$$\hat{p} \pm Z_\alpha \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$
Now let's say you want this confidence interval to have length less than $C_\alpha$, and you want to know how big a sample is needed to make this the case. This is equivalent to asking which $n_\alpha$ satisfies
$$Z_\alpha \sqrt{\frac{\hat{p}(1-\hat{p})}{n_\alpha}} \leq \frac{C_\alpha}{2}$$
which can be solved for $n_\alpha$ to obtain
$$n_\alpha \geq \left(\frac{2 Z_\alpha}{C_\alpha}\right)^2 \hat{p}(1-\hat{p})$$
So plug in your values for $Z_\alpha$, $C_\alpha$, and estimated $\hat{p}$ to obtain an estimate for $n_\alpha$. Note that since $p$ is unknown this is only an estimate, but asymptotically (as $n$ gets larger) it should be accurate.
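For example, plugging in some illustrative numbers (my own choices, not from the question: a 95% confidence interval of total length $C_\alpha = 0.05$, i.e. $\pm 2.5\%$, around $\hat{p} = 1/6$) gives, in R:

p.hat <- 1/6            # estimated probability of one face of a d6
Z     <- qnorm(0.975)   # two-sided 95% confidence, Z about 1.96
C     <- 0.05           # desired total length of the confidence interval
n.min <- (2*Z/C)^2 * p.hat * (1 - p.hat)
ceiling(n.min)          # about 854 rolls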
I would recommend analyzing this in the following way:
Count each roll in which George successfully predicts the outcome as a success, and every other one as a failure. Then you can easily calculate a probability of success for George, along with a 95% or 99% confidence interval. Does he claim he can predict the outcome "twice as well" as random guessing? Then:
$$H_0\colon p \geq 1/3 \qquad\text{vs.}\qquad H_1\colon p < 1/3$$
(assuming a 6-sided die).
From there, it's pretty straightforward to do the hypothesis test. You can also calculate the power a priori pretty easily (even in something like Excel). Pick a number of rolls (say 10), then make a table with the possible numbers of successes as rows (0-10). For each possible number of successes, calculate the probability that he gets exactly that many right if he were just guessing (which is what we're assuming he is really doing), and determine whether that outcome would lead to rejecting or retaining the null. Then, to find the power, simply add up the probabilities of all outcomes for which the null would be rejected.
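Here is a minimal sketch of that power calculation, in R rather than Excel (illustrative values only: $n = 10$ rolls, $\alpha = 0.05$, with power evaluated at $p = 1/6$, i.e. George just guessing on a d6):

n     <- 10
alpha <- 0.05
p0    <- 1/3   # null value: "twice as good as guessing"
p1    <- 1/6   # pure guessing on a 6-sided die

# Reject H0: p >= 1/3 for small success counts: find the largest cutoff c
# with P(X <= c | p = p0) <= alpha (with n = 10 this turns out to be c = 0)
c.crit <- max(which(pbinom(0:n, n, p0) <= alpha)) - 1

# Power against pure guessing = P(reject H0 | p = p1)
pbinom(c.crit, n, p1)  # about 0.16, so 10 rolls give very little power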