Pig puzzle solution

This is a long-delayed follow-up to the previous post about a variant on the dice game Pig.  In this “single turn” version of the game, a player repeatedly rolls a single die, accumulating points according to the numbers rolled, until either:

  • Deciding to stop, with a score equal to the total of all rolls; or
  • Rolling a 1, resulting in a “pig-out” score of zero, i.e., loss of all previous rolls.

If two players take their turn simultaneously without knowledge of the other player’s outcome, where the highest score wins, what is the optimal strategy for this game?

Before describing the solution, compare this game with the original version analyzed by Neller and Presser (see Reference (1) below), where the players alternate turns repeatedly, and the first player to reach 100 points wins.  This turns out to be significantly more complicated, for two reasons.  First, game states are repeatable– e.g., player 1 rolls a 1, then player 2 also rolls a 1, and we’re right back where we started– so a straightforward, “work backward from the end game” dynamic programming approach won’t work.

The second reason is that a player’s optimal strategy on a given turn cannot always be expressed as simply as “roll until you reach a total of h or greater.”  For example, if you currently have a (banked) score of 41 and your opponent has 49, then optimal strategy is to roll until reaching a turn score of 22 or greater… unless you happen to roll a 6 reaching a score of 27, in which case you should risk rolling just one more time!

Our simultaneous, single-turn version of the game eliminates these complications, while still providing some interesting results.

Solution:

The analysis approach consists of (1) parameterizing all possible strategies, (2) computing the probability distribution of outcomes for a given strategy, and (3) solving the linear programming problem whose solution yields an optimal strategy for the game.

First, we need only consider strategies of the form, “hold at h,” i.e., roll until reaching a total of h or greater (why?).  Given such a strategy, what is the probability distribution of the resulting score?

Johnson (2) describes a recursive algorithm for computing this distribution.  However, I think generating functions can help a lot with the housekeeping here.  Define r(x) to be the generating function for the outcome of a single “successful” roll (i.e., excluding a roll of 1):

r(x) = \frac{1}{6} x^2 + \frac{1}{6} x^3 + \frac{1}{6} x^4 + \frac{1}{6} x^5 + \frac{1}{6} x^6

or in Python (using Numpy with VPython):

from visual import *

r = array([0, 0, 1./6, 1./6, 1./6, 1./6, 1./6])

(In all that follows, we could experiment with other variants, such as rolling two dice instead of one, simply by using the corresponding polynomial r(x)^2 in place of r(x).)

Then we can compute the total probability of scoring exactly n in any number of rolls as the coefficient of x^n in the generating function g(x) given by

g(x) = \frac{1}{1 - r(x)} = 1 + r(x) + r(x)^2 + r(x)^3 + ...

or in Python:

def pig_pgf(r, n_max=100):
    """Given single-roll pgf r(x), return g(x) = 1/(1-r(x)), where
    [x^n]g(x) is the total probability of scoring exactly n <= n_max in
    any number of rolls without pigging out.
    """
    p = array([1])         # p(x) <- 1
    p.resize(n_max + 1)
    g = array(p)           # g(x) <- 0
    g[0] = 0
    for n in range(n_max): # collect g(x) = 1 + r(x) + r(x)^2 + ...
        g = g + p          # g(x) <- g(x) + p(x)
        p = convolve(p, r) # p(x) <- p(x) * r(x)
        p = p[:n_max + 1]
    return g

(Note that r(x) and g(x) are not, strictly speaking, probability generating functions, since their coefficients do not sum to 1.  That’s okay; we are using them to collect the probability of all possible successful outcomes, after which we can subtract from 1 to find the probability of “pigging out,” or rolling 1 at some point.)

At this point, for a given “hold at h” strategy, we can find the probability of ending a turn with a score of exactly n as the coefficient of x^n in the generating function

g(x)r_{h,n}(x) = \frac{\sum\limits_{n-h < k \leq n} ([x^k]r(x)) x^k}{1 - r(x)}

The idea is that r_{h,n}(x) captures the possible values k of the last successful roll, which must have a value greater than n-h (since otherwise we would keep rolling), and at most n.  The following Python function computes this distribution as a dictionary mapping scores to probabilities:

def pig_dist(h, r, g):
    """Return distribution p[n] = probability of score n using "hold at
    h" strategy, for the game with single-roll and total pgf r(x) and
    g(x), resp.
    """
    r_max = len(r) - 1
    p = {0: 1}
    for n in range(h, h + r_max):
        p[n] = 0
        for k in range(1, r_max + 1):
            if n - h < k <= n:
                p[n] = p[n] + r[k] * g[n - k]
        p[0] = p[0] - p[n]
    return p

For example, using the “hold at h=20″ strategy, which corresponds to rolling as long as the expected net increase in score is positive, yields the following distribution of outcomes:

  • P(0) = 0.62454083420125672
  • P(20) = 0.099712986645624821
  • P(21) = 0.094990627487340995
  • P(22) = 0.074188848985588224
  • P(23) = 0.054196084766465126
  • P(24) = 0.035198091574370427
  • P(25) = 0.01717252633935375

Armed with this function, if player 1 uses a “hold at h_1” strategy, and player 2 uses a “hold at h_2” strategy, we can compute the corresponding distributions of scores, and thus the overall distribution of outcomes.  If we roll up that distribution into an expected zero-sum payoff, then the final step is to compute the resulting optimal mixed strategy (which by symmetry should be the same for both players).

The following plot shows the result.  Optimal strategy is to roll to 21 approximately one-third of the time… but with a random smear of lower hold totals as well, including the possibility of just a single roll.

Optimal mixed strategy for single-turn Pig.

Optimal mixed strategy for single-turn Pig.

It’s interesting just how important the randomness of this mixed strategy is: against any pure “always hold at a fixed h” strategy, an optimal player can gain at least an 8% advantage, and even against the reasonable-sounding “hold at 20” strategy, an optimal player has a nearly 15% advantage… just by rolling a single time!

References:

  1. T. Neller and C. Presser, “Optimal Play of the Dice Game Pig,” The UMAP Journal, 25:1 (2004) 25–47 [PDF]
  2. R. Johnson, “A Simple ‘Pig’ Game,” Teaching Statistics, 30:1 (2008) 14-16 [HTML]
This entry was posted in Uncategorized. Bookmark the permalink.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s