Probability of rolling 3 sided die X times

combinatoricsprobability

We have a fair $3$ sided die $(a,b,c)$, and we perform the following experiment:

Roll the die until we have seen $10$ of any of the sides, let $X$ be the number of times the die was rolled.

eg. ${bcbaaaaaaaaaa,bbbbbbbbbb,aaaaaaacacccccccccc}$

We are interested in the probability of $X$ for all possible values of $X$, ie. $X={10,11…28}$.

Best Answer

Did you also get these probabilities:

P(X=10) = 1/19683 = 0.00005080526342529086
P(X=11) = 20/59049 = 0.0003387017561686057
P(X=12) = 220/177147 = 0.0012419064392848876
P(X=13) = 1760/531441 = 0.0033117505047597006
P(X=14) = 11440/1594323 = 0.0071754594269793515
P(X=15) = 64064/4782969 = 0.013394190930361455
P(X=16) = 320320/14348907 = 0.022323651550602425
P(X=17) = 1464320/43046721 = 0.03401699283901322
P(X=18) = 6223360/129140163 = 0.04819073985526873
P(X=19) = 24893440/387420489 = 0.06425431980702496
P(X=20) = 94410316/1162261467 = 0.08122984257895904
P(X=21) = 339951040/3486784401 = 0.09749700609607609
P(X=22) = 1157496340/10460353203 = 0.11065556941882548
P(X=23) = 3698815120/31381059609 = 0.11786775736977315
P(X=24) = 10984667980/94143178827 = 0.11668044479553549
P(X=25) = 29915691520/282429536481 = 0.10592267328956415
P(X=26) = 73036356250/847288609443 = 0.08620009219528332
P(X=27) = 151915621000/2541865828329 = 0.05976539725539643
P(X=28) = 75957810500/2541865828329 = 0.029882698627698216

I did it with a Markov chain where the states are sorted 3-tuples of how many we have gotten of each value and there is an absorbing state "we have gotten at least 10 of some value". In another words the states at level $k$ (after $k$ dice throws) are the integer partitions of $k$ into at most $3$ parts. Note: we don't have to keep track of "which is which" in the counts tuple, always sort it when checking the transitions.

Here's a picture of the transition graph for $n=4$ (the direction is towards larger sum (and always towards the absorbing state)):

transition graph for n=4

There were $221$ states in total for the $n=10$ case. Note: the matrix is very sparse (only max $3$ transitions from each state), so this can quicken the calculations of its powers (which we now want to calculate all upto $28$ so iteratively it is :-)).

I noticed that for $m=10, 11, \dots, 19$ the formula

$$P(X=m) = \frac{ {{m-1}\choose{9}}2^{m-10} }{3^{m-1}}$$

works and for $m=20$ we subtract the central binomial coefficient:

$$P(X=20) = \frac{ {{m-1}\choose{9}}2^{m-10} - {{20}\choose{10}} }{3^{m-1}}$$

but after that I don't see how the formula should be modified. And I don't immeaditely see a reason for that formula either, I just found it by checking OEIS with the numerators of the probabilities.

EDIT Then, for $m=21$:

$$P(X=21) = \frac{ {{m-1}\choose{9}}2^{m-10} - {{m}\choose{10}}2 - {{m/2}\choose{7}}2^{14} }{3^{m-1}}$$

Notice, there is a generilized binomial coeffient ${{21/2}\choose{7}} = \frac{415701}{2048}$ in there.

My guess is it comes some how straigh-forwardly from the matrix multiplication (there are punch of $1$'s and $2$'s in there and of course the denominator $3$) upto some point, but when the matrix becomes "full enough", the pattern breaks.

EDIT The matrix is constructed by first finding all the possible states and initializing it to a zero matrix. Then, for each state (it is a triplet) find the transitions (total $3$ of them) by adding a $1$ to a particular place and sorting the outcome. Then find the index of that in the states list (or if it goes over $n$ take the last index, which represents the absorbing state) and increment the matrix by $1$ in the correct spot. So we keep the matrix as an integer matrix but the probabilities are obtained by dividing by $3$ (and by $3^t$ in the $t$th power).

Here's the Python code (It's kind of messy, since I changed the matrix multiplication calculations to be done by only storing the non-zero entries of the matrix to speed up the calculation. But the methods for making the states and the matrix are first):

from fractions import Fraction
import random

#list of all partitions of k to at most maxCount parts and each part <= maxVal
def getPartsRec(k, maxCount, maxVal):
    if k==0: return [[]]
    maxPossible = maxCount*maxVal
    if k>maxPossible: return []
    if k==maxPossible: return [[maxVal]*maxCount]
    ret = []
    for x in xrange(min(k,maxVal), -1, -1):
        ret += [[x] + sP for sP in getPartsRec(k-x, maxCount-1, x)]
    return ret

#list of all nonnegative triplets that sum to k and have maximum value k
def getParts3(k):
    return [p+[0]*(3-len(p)) for p in getPartsRec(k, 3, k)]


def getDiceStates(n):
    states = []
    for i in xrange(3*n-1):
        ps = getParts3(i)
        states += [p for p in ps if p[0]<n]
    return states

#the transition matrix with integers (divide by 3 to make them probs)
#the last state is the absorbing one (have gotten at least n of some value)
def diceMat(n):
    states = getDiceStates(n) + [[n, 0, 0]]
    statesN = len(states)
    a = [[0 for j in xrange(statesN)] for i in xrange(statesN)]
    for i in xrange(statesN):
        for h in xrange(3):
            toStateInd = None
            toState = states[i][:]
            toState[h] += 1
            toState = sorted(toState, reverse=True)
            if toState[0]>=n: toStateInd = statesN-1
            else: toStateInd = states.index(toState)
            a[i][toStateInd] += 1

    return a

def diceProbs(n):
    a = diceMat(n)
    mat = a[:][:]
    matLen = len(a)
    probs = [0]
    probsLen = 3*n-2
    denom = 3
    for i in xrange(probsLen):
        probs.append(Fraction(mat[0][matLen-1], denom))
        mat = matMul(mat, a)
        denom *= 3
    return [Fraction(0, 1)]+[probs[i]-probs[i-1] for i in xrange(1, len(probs))]


def matMul(a,b):
    zip_b = zip(*b)
    # uncomment next line if python 3 : 
    # zip_b = list(zip_b)
    return [[sum(ele_a*ele_b for ele_a, ele_b in zip(row_a, col_b)) 
             for col_b in zip_b] for row_a in a]

#Get the weight of a[i][j] in a matrix-object a
def getWeight(a, i, j):
    row = a[i]
    for k in xrange(len(row)):
        if row[k]['j'] == j: return row[k]['w']
    return 0

#A faster method to multiply two sparse square matrices
#A "matrice-object" is a list of {'j', 'w'}'s that give which indices j
# on that row are non-zero and what weights do they have
def multiplyMatObs(a, b):
    n = len(a)
    ret = []
    for i in xrange(n):
        retRow = []
        for j in xrange(n):
            s = 0
            for k in xrange(len(a[i])):
                s += a[i][k]['w']*getWeight(b, a[i][k]['j'], j)
            if s!=0:
                retRow.append({'j':j, 'w': s})
        ret.append(retRow)
    return ret

#Get a square matrix by multiplying a matOb a by a true square matrix b
def multiplyMatObAndMat(a, b):
    n = len(a)
    ret = []
    for i in xrange(n):
        retRow = []
        for j in xrange(n):
            s = 0
            for k in xrange(len(a[i])):
                el = a[i][k]
                s += el['w']*b[el['j']][j]
            retRow.append(s)
        ret.append(retRow)
    return ret


def diceMatOb(n):
    mat = diceMat(n)
    a = []
    for i in xrange(len(mat)):
        aRow = []
        for j in xrange(len(mat[i])):
            if mat[i][j]!=0:
                aRow.append({'j': j, 'w': mat[i][j]})
        a.append(aRow)
    return a


#with "matrix-objects"
def diceProbs2(n):
    a = diceMatOb(n)
    mat = diceMat(n)
    matLen = len(mat)
    probs = [0]
    probsLen = 3*n-2
    denom = 3
    for i in xrange(probsLen):
        probs.append(Fraction(mat[0][matLen-1], denom))
        mat = multiplyMatObAndMat(a, mat)
        denom *= 3
    return [Fraction(0, 1)]+[probs[i]-probs[i-1] for i in xrange(1, len(probs))]


def simuProbs(n, simuN):
    endCounts = [0]*(3*n-1)
    counts = [0,0,0]
    for i in xrange(simuN):
        counts = [0,0,0]
        step = 0
        while counts[0]<n and counts[1]<n and counts[2]<n:
            counts[random.randint(0, 2)] += 1
            step += 1
        endCounts[step] += 1
    return [float(x)/simuN for x in endCounts]


#print [len(getParts3(n)) for n in range(20)]
n = 4
#fracProbs = diceProbs(n)
#print [str(x) for x in fracProbs]
#print [float("{0:.4f}".format(round(float(x), 4))) for x in fracProbs]
#print simuProbs(n, 10000)


#print "\nmatOb method"

fracProbs2 = diceProbs2(n)
print [str(x) for x in fracProbs2]
print [float("{0:.4f}".format(round(float(x), 4))) for x in fracProbs2]
print simuProbs(n, 10000)
Related Question