A programming student was recently interested in simulating and analyzing the dice game Pig. While working out some of the relevant probabilities and expected values for the original game, I came up with the following twist, that I think is even simpler to describe, but still presents some interesting complexity in game play and analysis:

As in Pig, you and another player each take a turn rolling a single die, repeatedly as many times as you wish, until either:

- You decide to stop, at which point your score is the sum of all of your rolls in the turn; or
- You roll a 1, at which point your turn ends with a score of zero (i.e., you lose all of your rolls in the turn).

Rather than alternating turns with the winner being the first to reach 100 points, in this variant you each take just a *single* turn, *simultaneously* and separately, i.e., without knowledge of the other player’s outcome. The highest score wins.

What is your optimal strategy for playing this game? And for any given strategy, what is the probability distribution of outcomes (turn totals)?

### Like this:

Like Loading...

*Related*

Pig is fascinating. And somehow I never heard of it until now. That’s a whole lot of complexity for such simple rules!

Still regarding the original rules, there are less than 1 million possible game states. Ignoring that the game has already been solved, I wonder how useful this relatively low number would be in developing an AI strategy — through a neural network or genetic algorithm.

I think your variation greatly simplifies optimal play. As Neller and Presser’s paper says, “risking points is not the same as risking the probability of winning.” Without knowledge of the other player, the second part disappears. The scores (i, j) are no longer being considered and the game state comes down to a single value: the current total (k). I *think* that would make “hold at 20,” where the odds break even, the optimal strategy.

My simulation says if you’re holding at 20 or higher, you’ll win against people holding at 10 to 19 or 27 and higher, and lose against people holding at 1-9 or 21 to 26.

Why do you believe the odds break even at 20?

My first idea was that if you do 3.8 throws, you’ll have a 50% chance of getting a zero, and that makes the expected value be 3.8*3.5=13.3, so “hold at 14″ ought to be good, but that suffers from the same problem.

Hold at 15 can only be beat by a “hold at” strategy with 16 to 23, but of course if you play a couple of rounds, you can deduce the value your opponent is holding at and hold at slightly more than she does (or, if she’s holding at 19 or higher, only throw once and let her zero herself out), so a fixed “hold at” strategy is bound to lose in the long run.

It’s not as simple as that.

Hmm, since 1 is never counted, the expected value of a successful throw is not 3.5, but rather 4.0, so my “hold of thumb” would be at 15.2, which is actually not all that bad.

Pitting a “hold at” strategy against “k throws” actually looks very similar to “hold vs. hold” or even “k throws against l throws”, with using slightly higher values than the opponent being a good strategy (or adopting a 1 throw strategy if the opponent holds at over 16 or makes more than 4 throws).

I’m getting the break even at 20 from “Dice Games Properly Explained,” Reiner Knizia. (Hopefully the following blockquote works right.)

There is only one minor correction to your given simulation results– holding at 20 will lose against holding at 27… although it’s a very close call, with expected return approximately -0.114% (or exactly -1349917132954549/1184595334580404224).

Right– as you put it, “it’s not as simple as that” :). Your key observation is that any fixed “hold at” strategy can be beaten. And by symmetry of the rules (i.e., there is no first player advantage), the game *should* be fair, i.e. have expected return 0. So the interesting question is, what is the optimal strategy that realizes that expected return even against a similarly optimal opponent?

The mathematician in me particularly enjoys the problem of computing the distribution of roll totals for a given strategy (from which optimal strategy can be derived; I’ll follow up with details in another post). But the computer scientist in me also likes the idea of couching this problem as one of pitting AIs against each other in a tournament of Monte Carlo iterations of games.

@Christopher Wellons: I understand the explanation. The problem with it is that it maximizes expected point value, and should be a good strategy for the original game (except maybe for the endgame?), but in the PossiblyWrong version it matters not how many points you have in a round, but that you have more points than the opponent – we don’t care by what margin. So while a strategy of “roll only once” will lose against your strategy in “Pig” for lack of progress towards 100, it will win if you only play one round, since it goes to zero much less often.

@possiblywrong: I ran a simulation for n=100023, and it came to 0.501: apparently my n was too small.

“So the interesting question is, what is the optimal strategy that realizes that expected return even against a similarly optimal opponent?” — I ought to convert my table (with rows and columns standing for “hold at k”) from win/(win+lose) to (win-lose)/n to get expected return, like you do, multiply each row with a coefficient a_k with sum(a_i)=1, and then solve for column sums being equal to 0 – this is a simple system of linear equations, and will have a solution if I drop the last column (to insert the sum=1 equation). The result for 50 rows and columns should be close enough to the optimal strategy, since for k>50, you’ll just zero out most of the time anyway. The optimal strategy would then be “with probability a_k, hold at k”. I wouldn’t be surprised if the probability distribution peaks near 15, but I’m too lazy to do the computations right now. :-P

You’re right, the state space for the original “play to 100″ game is relatively small. The value iteration approach described by Neller and Presser is actually slightly more machinery than is needed here; you can simply use iterative substitution– i.e., start with “guess” values of P(i,j,k), and update the guesses by evaluating the recurrence relations. The C++ code here will do the trick. (The interesting part is characterizing the situations where this is sort of thing is safe to do, which it happens to be for Pig.)

In fact, we could even do better than this, if we restricted our strategies to those of the form “hold at k”, even if k is a function of (i,j). With a “nested” approach of simple dynamic programming on the outside, with minimax on the inside, you can compute optimal strategy and *exact* probabilities of winning.

The interesting challenge, though, is that optimal strategy for Pig is *not* actually of the form “hold at k” in general. That is, if you look at the 3D “boundary surface” plots in Neller and Presser’s paper, there are “hooks” in the surface, so that optimality of rolling with total k does not imply optimality of rolling with all totals *less* than k.

Pingback: Pig puzzle solution | Possibly Wrong

Pingback: The Price Is Right… sort of | Possibly Wrong