Context
In tabletop roleplaying games (TTRPGs), the characters in the game usually have numeric scores tied to certain statistics. In general, these scores can be determined according to a bounded-value algorithm (e.g., one might ensure the sums of the scores for all characters are the same) or randomly according to some dice roll. It is of interest to analyze the probability involved with a dice-rolling method to compare its fairness with non-random methods.
Terminology
We obtain a single "score" by rolling four six-sided dice and discarding the lowest result. So, if $W$, $X$, $Y$, and $Z$ all follow the discrete uniform distribution on $[1,6]$, then the score is given by: $$W + X + Y + Z – \min\{W, X, Y, Z\}.$$
We obtain an "array" by generating six scores. For example, if we roll $$\{1,3,3,5\}, \{2,2,6,6\}, \{2,4,4,5\}, \{3,4,6,6\}, \{2,2,5,6\}, \{4,5,5,6\}$$ then our array would be $\{11, 14, 13, 16, 13, 16\}$. Arrays are unordered, so, $\{11, 14, 13, 16, 13, 16\}$ is equivalent to $\{11,13,13,14,16,16\}$, for example.
Problem
Given some array $A$, we seek to determine the probability that generating another array, $B$, results in an array that is "as good or better" than $A$. We give two definitions of this metric and wish to calculate probabilities for both.
- The sum of the scores in $B$ is greater than or equal to the sum of the scores in $A$.
- There exists an ordering of the scores in $A$ and $B$, $\{a_i\}$ and $\{b_i\}$, such that $\forall i, b_i \geq a_i$.
Attempts
Problem 1
For the first problem, we could disregard the structure of the array were it not for the fact that minimum rolls are dropped on a score-by-score basis: the distribution of sums for rolling $24$ dice and dropping the lowest $6$ is not a priori the same as the sum of scores in an array. So it would seem we need a way to model a single random variable $V \sim W + X + Y + Z – \min\{W, X, Y, Z\}$. If we can do that, then the random quantity of interest becomes a sum of six such variables, and the sum of random variables is something I am more familiar with. For example, the density function of the sum of two random variables is given by: $$f_{X_1 + X_2}(z) = \int_{-\infty}^\infty f_{X_1}(z – y)f_{X_2}(y)dy.$$ We could then use the combined density function to calculate the probability of getting a higher value. However, I get stuck at modeling the $\min\{W, X, Y, Z\}$ component of $V$, as I am unfamiliar with combining random variables in ways other than summation.
Problem 2
The second problem seems even more complicated, and many of the same issues apply as did for the first problem. One thought I had was to model the comparison of $A$ and $B$ with a binomial discrete distribution (which can be interpreted as the number of success among $n$ independent trials), but issues arise considering that the probability to "beat" each score in $A$ varies based on the value of the score and because the six scores are compared all at once, rather than in pairs.
Best Answer
This solution uses my Icepool Python library. You can try it in your browser using this JupyterLite notebook.
Problem 0: 4d6 drop lowest
This is small enough for brute force enumeration of all possibilities, though more efficient algorithms do exist. More on this later.
Denominator: 1296
Problem 1: Player A has total ability scores greater than or equal to player B
As you posted, the sum of dice can be computed using repeated convolution. From there one can simply find all of the joint outcomes, their probabilities, and choose the ones in which Player A has score greater or equal to Player B. These can be implemented as operators.
Denominator: 22452257707354557240087211123792674816
Problem 2: There exists a pairing of ability scores such that each of Player A's scores is greater than or equal to Player B's paired score
The trick is to express the problem not over the six pairs, but the values the scores can take. To wit: what is the chance, that for all values from 18 to 3, player A will have at least as many scores of at least that value as player B? (The equivalency is left as an exercise for the reader.)
From here, it turns out we can efficiently solve dice pool problems as long as we phrase the evaluation as a series of iterative state transitions over tuples of (outcome, how many dice in each pool rolled that outcome) and keep the number of states to a minimum. In this case the "dice" in the pool are entire ability scores rather than individual d6s. Here's an explanation of the underlying algorithm.
Denominator: 22452257707354557240087211123792674816
(This is why you don't roll ability scores if you want any semblance of fairness.)
In fact, we can also formulate the problem of "roll N dice, keep the M highest" in this way. This is what underlies the
keep_highest()
method at the top. While more efficient algorithms do exist for that specific problem (example), this is still fairly fast and reduces the amount of bespoke code.Variant: At least one score strictly greater than
This is the chance that A's array is strictly better than B's array (and vice versa).
We can either subtract off the chance that A and B will have exactly the same sorted array, or we can explicitly encode it into the evaluation function:
Denominator: 22452257707354557240087211123792674816
Miscellaneous notes
AnyDice is fine for Problem 1, but will have trouble with Problem 2 since, apart from a few built-in functions and operators, it appears to be based on exhaustive multiset enumeration.
You may also be interested in my interactive ability score calculator. While it doesn't solve this problem exactly it does give the distribution of raw and point-buy totals for a variety of ability score generation methods.