This is a long-delayed follow-up to the previous post about a variant on the dice game Pig. In this “single turn” version of the game, a player repeatedly rolls a single die, accumulating points according to the numbers rolled, until either:

- Deciding to stop, with a score equal to the total of all rolls; or
- Rolling a 1, resulting in a “pig-out” score of zero, i.e., loss of all previous rolls.

If two players take their turn simultaneously without knowledge of the other player’s outcome, where the highest score wins, what is the optimal strategy for this game?

Before describing the solution, compare this game with the original version analyzed by Neller and Presser (see Reference (1) below), where the players *alternate* turns *repeatedly*, and the first player to reach 100 points wins. This turns out to be significantly more complicated, for two reasons. First, game states are repeatable– e.g., player 1 rolls a 1, then player 2 also rolls a 1, and we’re right back where we started– so a straightforward, “work backward from the end game” dynamic programming approach won’t work.

The second reason is that a player’s optimal strategy on a given turn cannot always be expressed as simply as “roll until you reach a total of *h* or greater.” For example, if you currently have a (banked) score of 41 and your opponent has 49, then optimal strategy is to roll until reaching a turn score of 22 or greater… *unless* you happen to roll a 6 reaching a score of 27, in which case you should risk rolling just one more time!

Our simultaneous, single-turn version of the game eliminates these complications, while still providing some interesting results.

**Solution:**

The analysis approach consists of (1) parameterizing all possible strategies, (2) computing the probability distribution of outcomes for a given strategy, and (3) solving the linear programming problem whose solution yields an optimal strategy for the game.

First, we need only consider strategies of the form, “hold at *h,*” i.e., roll until reaching a total of *h* or greater (why?). Given such a strategy, what is the probability distribution of the resulting score?

Johnson (2) describes a recursive algorithm for computing this distribution. However, I think generating functions can help a lot with the housekeeping here. Define to be the generating function for the outcome of a single “successful” roll (i.e., excluding a roll of 1):

or in Python (using Numpy with VPython):

from visual import * r = array([0, 0, 1./6, 1./6, 1./6, 1./6, 1./6])

(In all that follows, we could experiment with other variants, such as rolling two dice instead of one, simply by using the corresponding polynomial in place of .)

Then we can compute the total probability of scoring exactly *n* in *any* number of rolls as the coefficient of in the generating function given by

or in Python:

def pig_pgf(r, n_max=100): """Given single-roll pgf r(x), return g(x) = 1/(1-r(x)), where [x^n]g(x) is the total probability of scoring exactly n <= n_max in any number of rolls without pigging out. """ p = array([1]) # p(x) <- 1 p.resize(n_max + 1) g = array(p) # g(x) <- 0 g[0] = 0 for n in range(n_max): # collect g(x) = 1 + r(x) + r(x)^2 + ... g = g + p # g(x) <- g(x) + p(x) p = convolve(p, r) # p(x) <- p(x) * r(x) p = p[:n_max + 1] return g

(Note that and are not, strictly speaking, *probability* generating functions, since their coefficients do not sum to 1. That’s okay; we are using them to collect the probability of all possible *successful* outcomes, after which we can subtract from 1 to find the probability of “pigging out,” or rolling 1 at some point.)

At this point, for a given “hold at *h*” strategy, we can find the probability of ending a turn with a score of exactly *n* as the coefficient of in the generating function

The idea is that captures the possible values *k* of the *last* successful roll, which must have a value greater than (since otherwise we would keep rolling), and at most . The following Python function computes this distribution as a dictionary mapping scores to probabilities:

def pig_dist(h, r, g): """Return distribution p[n] = probability of score n using "hold at h" strategy, for the game with single-roll and total pgf r(x) and g(x), resp. """ r_max = len(r) - 1 p = {0: 1} for n in range(h, h + r_max): p[n] = 0 for k in range(1, r_max + 1): if n - h < k <= n: p[n] = p[n] + r[k] * g[n - k] p[0] = p[0] - p[n] return p

For example, using the “hold at *h*=20″ strategy, which corresponds to rolling as long as the expected net increase in score is positive, yields the following distribution of outcomes:

- P(0) = 0.62454083420125672
- P(20) = 0.099712986645624821
- P(21) = 0.094990627487340995
- P(22) = 0.074188848985588224
- P(23) = 0.054196084766465126
- P(24) = 0.035198091574370427
- P(25) = 0.01717252633935375

Armed with this function, if player 1 uses a “hold at ” strategy, and player 2 uses a “hold at ” strategy, we can compute the corresponding distributions of scores, and thus the overall distribution of outcomes. If we roll up that distribution into an expected zero-sum payoff, then the final step is to compute the resulting optimal mixed strategy (which by symmetry should be the same for both players).

The following plot shows the result. Optimal strategy is to roll to 21 approximately one-third of the time… but with a random smear of lower hold totals as well, including the possibility of just a single roll.

It’s interesting just how important the randomness of this mixed strategy is: against *any* *pure* “always hold at a fixed *h*” strategy, an optimal player can gain at least an 8% advantage, and even against the reasonable-sounding “hold at 20” strategy, an optimal player has a nearly 15% advantage… just by rolling a single time!

**References:**