Sure there is.
Let's assume that we could see $k$ different types of dice with $Y=\{y_1, \dots, y_k\}$ pips. The "classical" D&D dice would yield $Y=\{4, 6, 8, 10, 12, 20, 100\}$, but anything else could be made to work as below. Let $|Y|=k$ denote the number of different possible dice.
Assume that the minimum observed total is $m$ and the maximum is $M$. Since $x$ dice show a total of at least $x$ (all ones) and at most $xy$ (all maximal), this bounds the candidates: we must have
$$x\leq m\quad\text{and}\quad xy\geq M.$$
Now, simulate a large number of rolls from each remaining candidate xdy. (The code below is not picky and simply tries every $x$ from $1$ to $M$, i.e., $M\times |Y|$ candidates.) This number of simulations can be larger than the number of original observations. In each case, create a histogram of the simulated totals.
Finally, compare these density histograms to the histogram of the original observations. (We work with the densities, not the frequencies, so we can use different numbers of simulations from the original number of observations.) The earth mover's distance gives you a notion of distance between histograms. The simulation-based histogram with the lowest earth mover's distance to your original observation histogram is most consistent with your observations.
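Incidentally, for one-dimensional histograms with unit bins, the earth mover's distance is simply the summed absolute difference between the two empirical CDFs, so it is easy to compute in any environment. A quick illustration in Python (NumPy; the dice and sample sizes here are made-up choices for the demonstration, not part of the recipe):

```python
import numpy as np

def emd_1d(a, b, bins):
    """EMD between two samples: |ECDF difference| summed over unit-width bins."""
    ha, _ = np.histogram(a, bins=bins, density=True)
    hb, _ = np.histogram(b, bins=bins, density=True)
    return np.abs(np.cumsum(ha) - np.cumsum(hb)).sum()

rng = np.random.default_rng(1)
obs = rng.integers(1, 7, size=(1000, 3)).sum(axis=1)      # 1000 rolls of 3d6
sim_3d6 = rng.integers(1, 7, size=(1000, 3)).sum(axis=1)  # candidate 3d6
sim_2d8 = rng.integers(1, 9, size=(1000, 2)).sum(axis=1)  # candidate 2d8

bins = np.arange(0.5, 19.5)  # unit bins covering all totals 1..18
# the true candidate (3d6) sits closer to the observations than 2d8
print(emd_1d(obs, sim_3d6, bins) < emd_1d(obs, sim_2d8, bins))  # True
```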
Here is some R code. We simulate 1000 rolls of 3d6 and use the standard D&D dice as candidates $Y$.
set.seed(1)
obs <- rowSums(matrix(ceiling(runif(3000, 0, 6)), nrow = 1000))
candidates.y <- c(4, 6, 8, 10, 12, 20, 100)   # standard D&D dice
breaks <- c(seq(0.5, max(obs) + 0.5, by = 1), max(obs) * max(candidates.y))
hist.obs <- hist(obs, breaks = breaks, freq = FALSE)
candidates.x <- 1:max(obs)
n.sims <- 1000
library(emdist)
emd.dist <- matrix(NA, nrow = length(candidates.x), ncol = length(candidates.y),
                   dimnames = list(candidates.x, candidates.y))
for (ii in seq_along(candidates.x)) {
    for (jj in seq_along(candidates.y)) {
        set.seed(1)   # for replicability
        obs.sim <- rowSums(matrix(ceiling(
            runif(candidates.x[ii] * n.sims, 0, candidates.y[jj])),
            nrow = n.sims))
        hist.sim <- hist(obs.sim, breaks = breaks, plot = FALSE)
        emd.dist[ii, jj] <- emd2d(matrix(hist.obs$density, nrow = 1),
                                  matrix(hist.sim$density, nrow = 1))
    }
}
emd.dist
Output:
4 6 8 10 12 20 100
1 7.9029999 6.8909998 5.8920002 4.8950000 3.88700008 1.9228243 0.39887184
2 5.4299998 3.4460001 1.4499999 0.8147613 1.08018708 0.8099990 0.04417419
3 2.9579999 0.0000000 1.8063331 1.9216927 1.61559486 0.2684011 0.35928789
4 0.6180000 2.4049284 2.2750988 1.3528987 0.71173453 0.2177625 1.00000000
5 1.9768735 2.8317218 1.7545075 0.5033562 0.17459151 0.1226299 1.00000000
6 3.4169219 1.9913402 0.4674550 0.1918677 0.05264558 1.0000000 1.00000000
7 3.3924465 0.9393466 0.2968451 0.1572183 0.21881166 1.0000000 1.00000000
8 2.6134274 0.3687483 0.1226299 1.0000000 1.00000000 1.0000000 1.00000000
9 1.5483801 0.3894975 0.3592879 1.0000000 1.00000000 1.0000000 1.00000000
10 0.3270817 0.2188117 1.0000000 1.0000000 1.00000000 1.0000000 1.00000000
11 0.1052912 0.3592879 1.0000000 1.0000000 1.00000000 1.0000000 1.00000000
12 0.3592879 1.0000000 1.0000000 1.0000000 1.00000000 1.0000000 1.00000000
13 1.0000000 1.0000000 1.0000000 1.0000000 1.00000000 1.0000000 1.00000000
14 1.0000000 1.0000000 1.0000000 1.0000000 1.00000000 1.0000000 1.00000000
15 1.0000000 1.0000000 1.0000000 1.0000000 1.00000000 1.0000000 1.00000000
16 1.0000000 1.0000000 1.0000000 1.0000000 1.00000000 1.0000000 1.00000000
17 1.0000000 1.0000000 1.0000000 1.0000000 1.00000000 1.0000000 1.00000000
18 1.0000000 1.0000000 1.0000000 1.0000000 1.00000000 1.0000000 1.00000000
And look, the 3d6 in fact does have the simulated histogram with the lowest distance from your observations' histogram. (The distance is exactly zero here because, with the shared seed, the 3d6 simulation reproduces the original observations draw for draw.) Hurray!
A few comments:
- This works better if you have "many" observations. With few observations, you get a lot of noise, and many different simulated histograms are consistent with your noisy data.
- Of course, the larger your candidate set $Y$, the lower your chance of recovering the true values.
- If the true $y$ is not in $Y$, you will at least get something "close to it".
- The formally-statistically-correct treatment would be a Bayesian one, putting equal prior probabilities on all possible $x$ and $y\in Y$, then calculating or simulating draws and deriving posterior probabilities on pairs $(x,y)$. Yes, this can be done rigorously. Anyone want to have a go?
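For anyone tempted to have a go: under a flat prior, the posterior is proportional to the likelihood, and the likelihood of each candidate $(x,y)$ can be computed exactly by convolving the single-die distribution $x$ times. A sketch in Python (NumPy; the candidate grid and the simulated data are illustrative choices):

```python
import numpy as np

def sum_pmf(x, y):
    """Exact pmf of the total of x fair y-sided dice; pmf[t] = P(total = t + x)."""
    die = np.ones(y) / y          # faces 1..y, equally likely
    pmf = np.array([1.0])         # total 0 with zero dice
    for _ in range(x):
        pmf = np.convolve(pmf, die)
    return pmf

rng = np.random.default_rng(1)
obs = rng.integers(1, 7, size=(500, 3)).sum(axis=1)  # 500 observed 3d6 totals

candidates = [(x, y) for x in range(1, 7) for y in (4, 6, 8, 10, 12, 20)]
log_lik = np.full(len(candidates), -np.inf)
for i, (x, y) in enumerate(candidates):
    if obs.min() >= x and obs.max() <= x * y:   # outside support: likelihood 0
        log_lik[i] = np.log(sum_pmf(x, y)[obs - x]).sum()

post = np.exp(log_lik - log_lik.max())          # flat prior: posterior ∝ likelihood
post /= post.sum()
print(candidates[int(post.argmax())])           # (3, 6) wins decisively
```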
I'll outline some things but avoid an explicit solution; the point is for you to try to work it out.
At each step, you will have some number of points, $M$. If you stand for the next roll, you will either win $k$ more points (giving you $M+k$) or lose your $M$ (leaving you with $0$).
The expected-value-maximizing strategy says that whenever a $5/6$ chance of winning $k$ is worth more than a $1/6$ chance of losing $M$ -- i.e., whenever the expected gain from playing another round is positive -- you benefit (on average) from continuing. A little mental arithmetic lets you see when to sit down for a variety of similar games of this type: as long as it's not a game that becomes much, much more profitable later, there's no need to sum a series; you can simply stay standing until the expected net gain on the next roll is no longer positive.
Imagine, for example, you played the first round and got $100$ points. Now in the second round you would get $200$ points if you win ($5/6$ chance) and lose your $100$ points if you roll a $5$ ($1/6$ chance). The expected winnings are $200\times 5/6$ and the expected loss is $100 \times 1/6$, for an expected net return of $900/6 = 150$; on average you stand to win much more than you lose.
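The arithmetic generalizes; here is a tiny sanity check in Python (the specific numbers come from this example, not from a general rule of the game):

```python
def expected_gain(M, k):
    """Expected net change from standing one more roll while holding M points,
    where a win (prob 5/6) adds k points and a loss (prob 1/6) forfeits M."""
    return (5 / 6) * k - (1 / 6) * M

print(expected_gain(100, 200))   # ≈ 150, matching the worked example
print(expected_gain(1200, 200))  # negative: time to sit down
```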
However, if you are trying to be the person with the most points of all at the end of many such games of Greed, maximizing expected value each turn is not the optimum strategy to be the eventual winner!
You may be better off taking a smaller chance of a larger return, or banking smaller, surer gains, depending on where you are placed relative to the other players.
Imagine you're coming second but the person coming first is a fair way ahead of you. If you sit down at the point that would maximize your expected return for that single round, you may simply make it certain you don't win. Similarly if you're far ahead in the last round, you will be better off sitting down much earlier than would maximize your expected return that round -- you may be unnecessarily risking a near-certain win if you continue.
If you were playing many more rounds, you would (approximately) seek to maximize expected value per turn, but as you get toward the last few rounds the winning strategy changes: if you're behind, you should take more risk; if you're ahead, less. When you're getting close to the end, it's not just the current round's points that matter if you want to maximize the chance of being ahead at the end.
You can work out the exact strategy mathematically, but my guess is that the teacher is trying to get you to undertake the simpler task to maximize your expected winnings (which works quite well in the early to mid stages of the iterated game).
You might like to look at articles on strategy for the dice game Pig (of which this is a variant). There's lots of information on strategy for various versions of this game online, but it's more fun to work it out in detail yourself. Here is one document that also makes the point that the expectation-maximizing strategy is not optimal. Your game is a little different (your payoff increases and everyone gets the same rolls), but the basic ideas are the same.
You can think about your problem as a Markov chain, i.e., a set of states with certain transition probabilities between states. You start in one state (all cards face up) and end up in an absorbing state (all cards face down). Your question is about the expected number of steps until you reach that absorbing state, either for a single chain, or for the expected minimum number of steps across $n$ independent Markov chains running simultaneously.
And there are actually two slightly different ways to look at this. The first one, as whuber comments, is to consider the six cards as six different bits $\{0,1\}$ and consider the state as a six-vector in $\{0,1\}^6$, i.e., the six-dimensional discrete hypercube. We start out at the vertex $(0,0,0,0,0,0)$, and the absorbing state is $(1,1,1,1,1,1)$. A step can take us to an adjacent vertex, in which exactly one bit is flipped with respect to the original state. That is, transitions take us from one vertex to any neighboring one with Hamming distance exactly one, and each such neighbor has an equal probability of being the next state.
There is some literature on random walks and Markov chains on discrete cubes with Hamming distances, but nothing I could locate at short notice. We have a very nice thread on Random walk on the edges of a cube, which might be interesting.
The second way to look at this is to use the fact that all cards are interchangeable (assuming a fair die). Then we can use just seven different states, corresponding to the number of cards face down. We start in the state $i=0$, and the absorbing state is $i=6$. The transition probabilities depend on the state we are in: from state $i$, the die roll matches one of the $i$ face-down cards with probability $\frac{i}{6}$, flipping it back up and taking us to state $i-1$; with probability $\frac{6-i}{6}$ it matches a face-up card, taking us to state $i+1$.
Overall, we get the following transition matrix:
$$ T=\begin{pmatrix} 0 & \frac{6}{6} & 0 & 0 & 0 & 0 & 0 \\ \frac{1}{6} & 0 & \frac{5}{6} & 0 & 0 & 0 & 0 \\ 0 & \frac{2}{6} & 0 & \frac{4}{6} & 0 & 0 & 0 \\ 0 & 0 & \frac{3}{6} & 0 & \frac{3}{6} & 0 & 0 \\ 0 & 0 & 0 & \frac{4}{6} & 0 & \frac{2}{6} & 0 \\ 0 & 0 & 0 & 0 & \frac{5}{6} & 0 & \frac{1}{6} \\ 0 & 0 & 0 & 0 & 0 & 0 & 1 \end{pmatrix} $$
We start with certainty in the state $i=0$. We can encode the probabilities for each state at a certain point with a vector $v\in[0,1]^7$, and our starting state corresponds to $v_0=(1,0,0,0,0,0,0)$.
Here is a fundamental fact about Markov chains (which is easy to see and to prove via induction): the probabilities for the state after $k$ transitions are given by $v_k=(T')^kv_0$. (Here $T'$ denotes $T$ transposed. You can also work with row vectors $v$, then you don't need to transpose, but "$v_0T^k$" takes a little getting used to.)
Thus, the probability that we have ended up in the absorbing state $i=6$ after $k$ steps is precisely the last entry in that vector, or $v_k[6]=((T')^kv_0)[6]$. Of course, we could already have been in the absorbing state after $k-1$ steps. So the probability that our Markov chain ends up in the absorbing state for the first time after $k$ steps is
$$ p_k := ((T')^kv_0)[6]-((T')^{k-1}v_0)[6]. $$
We can numerically calculate $p_k$ for a large enough number of $k\leq K$ such that $\sum_{k=0}^Kp_k\approx 1$, and there may even be a closed form solution. Then, given $p_k$, we can calculate the expectation as
$$ \sum_{k=0}^\infty kp_k \approx \sum_{k=0}^K kp_k. $$
Next, assume we have $n$ players, and we want to know after how many steps the game will end, i.e., when the first player has all their cards face down. We can easily calculate the probability $q_k^n$ that at least one player has all cards face down after $k$ or fewer steps by noting that
$$ \begin{align*} q_k^n &= P(\text{at least one player has all cards face down after $k$ or fewer steps}) \\ &= 1-P(\text{all $n$ players need at least $k+1$ steps}) \\ &= 1-P(\text{ONE player needs at least $k+1$ steps})^n \\ &= 1-\bigg(\sum_{j=k+1}^\infty p_j\bigg)^n \\ &= 1-\bigg(1-\sum_{j=0}^k p_j\bigg)^n. \end{align*} $$
From this, we can derive the probability $p^n_k$ that a game of $n$ players ends after exactly $k$ steps:
$$ p^n_k = q_k^n-q_{k-1}^n = \bigg(1-\sum_{j=0}^{k-1} p_j\bigg)^n-\bigg(1-\sum_{j=0}^k p_j\bigg)^n. $$
And from this, in turn, we can again calculate the expected length of a game with $n$ players:
$$ \sum_{k=0}^\infty kp^n_k \approx \sum_{k=0}^K kp^n_k. $$
As I wrote above, there may be a closed form solution for the $p_k$, but for now, we can numerically evaluate them using R. I'm using $K=10,000$, so that $\sum_{k=0}^K p_k=1$ up to machine accuracy.
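The computation is only a few lines in any matrix-capable language; here is a sketch in Python with NumPy (using the matrix $T$ and $K=10{,}000$ from above):

```python
import numpy as np

# Transition matrix T for states i = 0..6 (number of cards face down);
# state 6 is absorbing, exactly as in the matrix above.
T = np.zeros((7, 7))
for i in range(6):
    T[i, i + 1] = (6 - i) / 6   # die hits a face-up card: flip it down
    if i > 0:
        T[i, i - 1] = i / 6     # die hits a face-down card: flip it back up
T[6, 6] = 1.0

K = 10_000
v = np.zeros(7)
v[0] = 1.0                      # v_0: start with all cards face up
p = np.zeros(K + 1)             # p[k]: P(first absorption at step k)
prev = 0.0
for k in range(1, K + 1):
    v = T.T @ v                 # v_k = (T')^k v_0
    p[k], prev = v[6] - prev, v[6]

print(p.sum())                            # 1 up to machine accuracy
E1 = (np.arange(K + 1) * p).sum()
print(E1)                                 # ≈ 83.2 expected rolls for one player
```

(The value $83.2$ can also be confirmed analytically via the standard birth-death hitting-time recursion.)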
Next, the probabilities $p^4_k$ for $n=4$ players follow in the same way.
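The $n$-player step as a self-contained Python sketch (it first recomputes the single-player $p_k$ from the transition matrix, then applies the formula for $p^n_k$):

```python
import numpy as np

# Single-player first-absorption probabilities p_k, as before.
T = np.zeros((7, 7))
for i in range(6):
    T[i, i + 1] = (6 - i) / 6
    if i > 0:
        T[i, i - 1] = i / 6
T[6, 6] = 1.0

K = 10_000
v = np.zeros(7); v[0] = 1.0
p = np.zeros(K + 1)
prev = 0.0
for k in range(1, K + 1):
    v = T.T @ v
    p[k], prev = v[6] - prev, v[6]

n = 4
surv = 1.0 - np.cumsum(p)            # P(one player needs more than k steps)
pn = np.zeros(K + 1)                 # p^n_k = surv[k-1]^n - surv[k]^n
pn[1:] = surv[:-1] ** n - surv[1:] ** n
E4 = (np.arange(K + 1) * pn).sum()
print(E4)                            # expected length of a 4-player game
```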
Of course, four players finish more quickly than a single player, so for $n=4$ we get a correspondingly smaller expected value.
Finally, I like to confirm calculations like this using simulation.
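Such a simulation might look like this in Python ($10^4$ games here to keep the runtime modest; the figures quoted below are for $10^5$):

```python
import numpy as np

rng = np.random.default_rng(0)
n_games, n_players = 10_000, 4

def absorption_times(n_chains, rng):
    """Steps until each independent chain first reaches state 6."""
    state = np.zeros(n_chains, dtype=np.int64)
    time = np.zeros(n_chains, dtype=np.int64)
    active = np.ones(n_chains, dtype=bool)
    k = 0
    while active.any():
        k += 1
        s = state[active]
        rolls = rng.integers(1, 7, size=s.size)
        # a roll <= s hits a face-down card (flip up), otherwise flip down
        s = np.where(rolls <= s, s - 1, s + 1)
        state[active] = s
        done_now = np.zeros_like(active)
        done_now[active] = s == 6
        time[done_now] = k
        active &= ~done_now
    return time

t = absorption_times(n_games * n_players, rng).reshape(n_games, n_players)
game_len = t.min(axis=1)        # a game ends when the first player finishes
print(game_len.mean())          # should be close to the calculated expectation
```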
The distribution of the numbers of steps needed in our $10^5$ simulated games matches the calculated $p^4_k$ rather well.
And the mean of the steps needed in the simulated games also matches the calculated expectation quite well.