## Analysis of Chutes and Ladders

Introduction

A friend of mine has been playing the board game Chutes and Ladders recently with his son.  The game is pretty simple: starting off of the board (effectively on “square zero”), each player in turn rolls a die (actually, spins a spinner) and– if possible– moves forward the indicated number of squares in serpentine fashion on the board shown below:

Chutes and Ladders board layout. In some variants, the chute from square 48 to 26 starts at square 47 instead.

If a player lands on the tail of an arrow, he continues along the arrow to the indicated square, ending his turn there.  (“Chutes” are arrows leading backward, and “ladders” are arrows leading forward.)  The first player to land exactly on square 100 wins the game.

My friend emailed me with the question, “What is the expected number of turns before the game is over?”  I think this is a nice problem for students, since it yields to the usual two-pronged attack of (1) computer simulation and (2) “cocktail napkin” derivation of an exact solution.

This is certainly not a new problem.  See the references at the end of this post for just a few past analyses of the game as a Markov chain.  My goal here is to provide an intuitive derivation of the exact expected length of the game as a solution to a system of linear equations, while staying away from the more sophisticated machinery of Markov chains, fundamental matrices, etc., that younger students may not be familiar with.

Unbounded vs. infinite

Before getting to the exact solution, though, it is worth addressing a comment in the referenced DataGenetics blog post about Monte Carlo simulation of the game:

“Whilst the chances of a game running and running are essentially negligible (constantly landing on snakes [i.e., chutes] and going around in circles), there is a theoretical chance that the main game loop could be executing forever. This is bad, and will lock-up your code. Any smart developer implementing an algorithm like this would program a counter that is incremented on each dice roll. This counter would be checked, and if it exceeds a defined threshold, the loop should be exited.”

I strongly disagree with this.  The valid concern is that there is no fixed bound on how many turns a particular simulated game might take; that is, for any chosen positive integer $r$, there is a positive probability that the game will require at least $r$ turns, due to repeatedly falling backward down chutes.  However, the probability is zero that the game will execute forever; we can be certain that the game will always eventually terminate.  There is a difference between unbounded and infinite execution time.  In fact, such explicit “safety clamping” of the allowed number of moves implicitly changes the answer, modifying the resulting probability distribution (and thus the expected value) of the number of moves, albeit by an admittedly small amount.

There are situations where games can take an infinitely long time to finish– Chutes and Ladders just isn’t one of them.  For example, in this multi-round card game, everything is fine as long as $p > 1/2$.  And even when $p = 1/2$, the game is still guaranteed to finish, although the expected number of rounds is infinite.  But when $p < 1/2$, there is a positive probability that the game never finishes, and indeed continues “forever.”

Recursive expected values

To see how to analyze Chutes and Ladders, first consider the following simpler problem: how many times should you expect to flip a fair coin until it first comes up heads?  Let $x$ be this desired expected value.  Then considering the two possible outcomes of the first flip, either:

• It comes up heads, in which case we are done after a single flip; or
• It comes up tails… in which case we are right back where we started, and so should expect to need $x$ additional flips (plus the one we just “spent”).

That is,

$x = \frac{1}{2}(1) + \frac{1}{2}(1 + x)$

Solving yields $x=2$.  More generally, if the probability of success on each trial is $p$, the expected number of trials until the first success is $1/p$.
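As a quick sanity check, a short Monte Carlo simulation (a sketch; the function name is mine) agrees with the recursive derivation:

```python
import random

def flips_until_heads(rng):
    """Count fair-coin flips until the first heads."""
    count = 1
    while rng.random() >= 0.5:
        count += 1
    return count

rng = random.Random(17)
trials = 200000
estimate = sum(flips_until_heads(rng) for _ in range(trials)) / trials
```

With this many trials the estimate lands very close to the exact value of 2.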

Solution

We can apply this same idea to Chutes and Ladders.  Let $x_i$ be the expected number of turns needed to finish the game (i.e., reach square 100), starting from square $i$, where $0 \leq i \leq 100$ (so that $x_0$ is the value we really want).  Then

$x_i = 1 + \frac{1}{6}\sum_{j=1}^{6} x_{f(i,j)}, \quad 0 \leq i < 100$

$x_{100} = 0$

where $f(i,j)$ is the number of the square reached from square $i$ by rolling $j$– which is usually just equal to $i+j$, but this function also “encodes” the configuration of chutes and ladders on the board, as well as the requirement of landing exactly on square 100 to end the game.

So we have a system of 100 linear equations in 100 unknowns, which we can now talk the computer into solving for us (in Python):

```python
import numpy as np

n, m = 100, 6
chutes = {1: 38, 4: 14, 9: 31, 16: 6, 21: 42, 28: 84, 36: 44,
          48: 26, 49: 11, 51: 67, 56: 53, 62: 19, 64: 60, 71: 91,
          80: 100, 87: 24, 93: 73, 95: 75, 98: 78}

A = np.eye(n)
b = np.ones(n)
for i in range(n):
    for j in range(1, m + 1):
        k = i if i + j > n else chutes.get(i + j, i + j)
        if k < n:
            A[i][k] -= 1.0 / m
x = np.linalg.solve(A, b)
```


Results

The following figure shows all of the resulting values $x_i$.  For example, the expected number of turns to complete the game from the start is $x_0 = 39.5984$.

Expected number of (additional) turns to finish game starting from the given square.

Note that we can easily analyze the “house rule” used in the DataGenetics post– allowing a player to win even when “overshooting” square 100– simply by changing the assignment `k = i` (the overshoot case in the inner loop) to `k = n`.  The result is a modest reduction in game length, to about 36.1931 turns.
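For comparison with the exact solution, here is a simple Monte Carlo simulation of the game (a sketch; the helper name is mine).  Note that no "safety clamping" of the turn counter is needed– every simulated game terminates:

```python
import random

# Chutes and ladders: maps the square landed on to the square moved to.
chutes = {1: 38, 4: 14, 9: 31, 16: 6, 21: 42, 28: 84, 36: 44,
          48: 26, 49: 11, 51: 67, 56: 53, 62: 19, 64: 60, 71: 91,
          80: 100, 87: 24, 93: 73, 95: 75, 98: 78}

def play(rng):
    """Simulate one game, returning the number of turns to reach square 100."""
    square, turns = 0, 0
    while square < 100:
        roll = rng.randint(1, 6)
        if square + roll <= 100:   # must land exactly on 100
            square = chutes.get(square + roll, square + roll)
        turns += 1
    return turns

rng = random.Random(42)
games = 100000
mean_turns = sum(play(rng) for _ in range(games)) / games
```

The simulated average game length agrees closely with the exact value $x_0 \approx 39.6$.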

References:

1. Althoen, S. C., King, L., and Schilling, K., How Long Is a Game of Snakes and Ladders?  The Mathematical Gazette, 77(478), March 1993, pp. 71-76 [JSTOR]
2. DataGenetics blog, Mathematical Analysis of Chutes and Ladders [HTML]
3. Hochman, M., Chutes and Ladders [PDF]

## Searching for chocolates

The following problem occurred to me, with Valentine’s Day approaching next week:

A couple of years ago, I bought a box of chocolates for Valentine’s Day.  The box contained n chocolates, indistinguishable from the outside, but the inside of each containing one of n different fillings (e.g., cherry, caramel, etc.).  On the cover of the box was a map, with labels indicating the type of each chocolate in the corresponding position.  Using the map, it was easy for me to find my favorite type of chocolate: I just find it on the map, pick it out of the box, and eat that one single chocolate.

Last year, I bought another box of chocolates.  It was the same box, containing the same n different types of chocolate… but the map was no longer printed on the box cover.  So to find my single favorite type of chocolate, I was forced to eat chocolates one at a time until I found the one I was looking for.  With no information about which chocolate was which, I expected to have to eat (n+1)/2 chocolates on average to find my favorite, by simply picking one after the other at random.
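That $(n+1)/2$ average is easy to confirm with a quick simulation (a sketch; the helper name is mine): the position of the favorite chocolate in a random eating order is uniformly distributed on $1$ through $n$.

```python
import random

def chocolates_eaten(n, rng):
    """Eat chocolates in uniformly random order until the favorite is found."""
    box = list(range(n))     # favorite is type 0
    rng.shuffle(box)
    return box.index(0) + 1  # count includes the favorite itself

rng = random.Random(7)
n, trials = 20, 100000
average = sum(chocolates_eaten(n, rng) for _ in range(trials)) / trials
```

For $n=20$ the average should be very near $(20+1)/2 = 10.5$.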

This year, I have bought another box of chocolates.  Again, it is the same box, with the same n types of chocolate, and this time the map is printed on the box cover… but as a puzzle, my wife secretly opened the box and rearranged all of the chocolates, so that all of the labels on the map are incorrect.  Now what is the optimal strategy for finding my favorite type of chocolate, and what is the corresponding (minimized) expected number of chocolates that I have to eat?


## One-card draw poker

Introduction

Let’s play another simple card game: after a \$1 ante, you (the dealer) shuffle a standard 52-card poker deck, and deal one card face down to each of us.  After looking at our respective cards, we each have the option either to “stay” and keep our card, or to “switch,” discarding and drawing a new card from the top of the deck.  The player with the highest-ranked card (ace low through king high) wins the pot, where a tie splits the pot.  What is the optimal strategy and expected value of playing this game?

(Note that although all of the games discussed here are playable– and are even more interesting and complex– with three or more players, I am focusing on just two players here.)

One-card Guts

There are two interesting variations on these basic rules.  First, suppose that we declare our choice of whether to stay or switch simultaneously, as in the game Guts: each player takes a chip or bottle cap under the table, either places it in a closed fist (to stay) or not (to switch), then holds the closed fist over the table.  All players simultaneously open their hands palm-down to indicate their choice.

This is the simpler version of the game.  With two players, it can be shown that the optimal strategy for both players is to stay if they are dealt a nine or higher.  That the game is fair (i.e., the expected value is zero) is clear from symmetry; no one player is “preferred” or distinguished in any way by the rules of the game.
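The nine-or-higher strategy can be checked numerically under the infinite-deck simplification mentioned at the end of this post (each rank uniform on ace through king, regardless of cards already dealt).  This sketch, with my own helper names, computes the best response to an opponent playing a given threshold:

```python
from fractions import Fraction

RANKS = range(1, 14)   # ace low (1) through king (13)

def final_pmf(threshold):
    """Distribution of a player's final card under a "stay on threshold or
    higher" strategy, in the infinite-deck model (each rank uniform, 1/13)."""
    pmf = {k: Fraction(0) for k in RANKS}
    for c in RANKS:
        if c >= threshold:
            pmf[c] += Fraction(1, 13)        # stay and keep c
        else:
            for d in RANKS:                  # switch: draw a fresh card
                pmf[d] += Fraction(1, 13 * 13)
    return pmf

def best_response(threshold):
    """Smallest card on which staying beats switching, against an opponent
    using the given threshold (decisions are simultaneous)."""
    pmf = final_pmf(threshold)
    def ev_stay(c):
        # Win probability minus loss probability when holding card c.
        return (sum(v for k, v in pmf.items() if k < c)
                - sum(v for k, v in pmf.items() if k > c))
    ev_switch = sum(Fraction(1, 13) * ev_stay(c) for c in RANKS)
    return min(c for c in RANKS if ev_stay(c) > ev_switch)
```

The best response to staying on nine or higher is again to stay on nine or higher, consistent with the stated equilibrium.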

One-card draw poker

Now suppose instead that we play more like a typical draw poker game: after the initial deal, each player in turn decides whether to discard and draw a new card.  Remember that you are the dealer, so you get to make your decision after I have made mine.  Can you exploit this advantage?

This is the version of the game that motivated this post.  I came up with this while trying to find examples of games for a student, games that were “small” enough to be tractable to analyze and experiment with via simulation, but still with interesting strategic complexity.  It’s worth noting that the name “One-Card Poker” is also used to refer to a similar “stud” form of the game, where the card play is simpler (no discarding), but the betting rounds are as in normal poker, resulting in a game that is much more complex to analyze.

(One final note: use of an actual 52-card deck complicates the analysis slightly, in a mostly non-interesting way.  I find it convenient to present the game using an “infinite deck,” where the probability distribution of card ranks remaining in the deck does not change as cards are dealt.)

## Calories in, calories out revisited

“All models are wrong, but some are useful.” George E. P. Box

A couple of months ago, I wrote about my experience “counting calories,” particularly about the accuracy of a very simple model of daily weight changes as a recurrence relation, converting each day’s net calorie deficit (or excess) into a 3500 calorie-per-pound weight loss (or gain).  The resulting predictions agreed very closely with my actual weight loss… however, I raised some additional questions at the end of the post, and the post itself generated some interesting comments as well.  Since it’s New Year’s resolution time, I thought this would be an appropriate time to follow up on these.

The following figure shows my predicted weight loss (in blue) over a period of 136 days, compared with each day’s actual measured weight (in red).  The details of the predictive model are provided in the earlier post; briefly, the idea is simple: start with a “zero day” measured weight, then predict weight changes for all subsequent days using only daily calorie intake (from eating) and expenditure (based on weight and, in my case, running).
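The recurrence itself can be sketched in a few lines of Python.  The coefficient values here are illustrative assumptions of mine, not the original post's fitted values: maintenance expenditure of `alpha` calories per pound of body weight per day, running expenditure of `run_coeff` calories per pound per mile, and 3500 calories per pound of weight change.

```python
def predict_weights(w0, intakes, alpha=11.0, run_miles=None, run_coeff=0.72):
    """Predict daily weights from a starting weight and daily calorie intakes.

    Assumed model: each day's net calorie surplus or deficit (intake minus
    weight-dependent expenditure) converts to weight at 3500 calories/pound.
    """
    weights = [w0]
    for day, intake in enumerate(intakes):
        w = weights[-1]
        burned = alpha * w + (run_coeff * w * run_miles[day] if run_miles else 0)
        weights.append(w + (intake - burned) / 3500.0)
    return weights

# Example: 30 days at an initial 700-calorie daily deficit (no running)
# loses a little under 6 pounds, since the deficit shrinks as weight drops.
w = predict_weights(190.0, [190 * 11 - 700] * 30)
```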

Predicted and actual measured weight over 136 days.

For side-by-side comparison, the following figure shows the corresponding estimated daily calorie intake over the same time period.

Daily calorie intake over the same 136 days.

The first 75 days (the focus of the earlier post) show reasonably consistent behavior: relatively aggressive calorie deficits that yield almost 2.5 pounds lost each week.  But what if I were not so consistent?  How well does this simple model handle more radical changes in diet?

As you can see from both figures, days 76 through 79 get messy.  During that long weekend I had to travel to give a talk, and thus I didn’t have ready access to a scale, hence the missing weight measurements; and I also had less convenient control over the food that I ate.  There is a banquet buffet lunch in there, some restaurant food, etc., where my best-effort calorie estimates are obviously much less accurate.

But although I certainly expected to have gained weight after returning from my trip, I was surprised at how much I had gained, much more than the predictive model could possibly account for.  However, over the next several weeks, when I returned to a more well-behaved diet (more on this later), my weight seemed to “calm down,” returning to reasonably close agreement with the simple “calories in, calories out” model.  It’s not clear to me what causes wild swings like this.  For example, there are also a couple of 4 am wake-up calls for early flights, late nights, and the general stress of travel during those four days.  Perhaps those departures from my normally routine lifestyle might also contribute to the fluctuation in some way?

Also, note the planned incremental increases in calorie intake over the last couple of months, and the resulting slowdown in the rate of weight loss.  I didn’t stop losing weight, I just started losing less weight each week as I approached my goal.  This ability to eat more while still losing weight may be counter-intuitive, but the math makes sense: it’s really hard to lose weight… but it’s much easier to maintain weight once you’re where you want to be.  (On the other hand, it’s an exercise for the reader to verify using the model that it’s also dangerously easy to gain weight, and to do so much more quickly than you lost it.)

Finally, this subject generated quite a bit of discussion about the initial question of whether “it’s really as simple as calories in, calories out.”  In particular, several commenters insisted that no, it is not that simple, that the human body “is not a bomb calorimeter,” but a much more complex machine where weight is influenced by many other factors, including genetic variation, gut flora, etc.

I don’t disagree with this.  In fact, while we’re at it, let’s point out several other limitations as well: this model treats calorie expenditure as a linear function of total body weight, instead of lean body mass, which is arguably a better fit (but is not nearly as convenient to actually measure).  It also treats calorie expenditure from running as a function of weight and distance, but not speed.

Which brings me to the quotation at the top of this post.  No, this simple recurrence relation does not reflect the full complexity of the biological processes occurring in the human body that contribute to weight loss or gain… but so what?  Don’t use a complicated model when a simple one will do.  In this case, most of that additional complexity is “rolled up” into the single coefficient $\alpha$ reflecting the individual’s “burn rate” based on gender, genetic variation, flora in the gut, etc.  Granted, that coefficient may be unknown ahead of time, but at worst it can be estimated using a procedure similar to that described in the original post.

## Pick any two cards

Let’s play a game: thoroughly shuffle a standard pack of 52 playing cards.  Now name any two card ranks, e.g., seven and jack.  If a seven and a jack happen to appear next to each other in the deck, you pay me a dollar, otherwise I pay you a dollar.  Should you be willing to play this game?  How likely do you think it is that I will win?

This game is discussed in a Grey Matters blog post, which in turn refers to a “Scam School” video describing the game as a bar trick.  I think it’s interesting because it’s not intuitive– it doesn’t seem very likely at all that an arbitrary pre-selected pair of ranks, just 4 cards each, should appear adjacent in a random shuffle of the 52 total cards in the deck.

But it’s also interesting because it’s not as likely as the Scam School video seems to suggest.  It turns out to be slightly worse than a coin flip, where the probability that two given card ranks appear adjacent is 284622747/585307450, or about 0.486.  Not very useful as a bar bet, I guess.

So, to improve the odds a bit, consider the following slight variation: shuffle the deck, and name any single card rank that is not a face card (i.e., not a jack, queen, or king).  If a card of the named rank appears next to a face card somewhere in the deck, you pay me a dollar.  Now what is the probability that I win?  It turns out this is a much better bet, with a probability of about 0.886 of such an adjacency.

Finally, we don’t have to resort to simulation as in the Grey Matters post.  It’s a nice puzzle to compute the probability in general, where $n=a+b+c$ is the total number of cards in the deck, of which $a$ are of rank A and $b$ are of rank B.  (For example, in the original problem, $n=52$ and $a=b=4$; in the second variant, $a=4$ and $b=12$.)  Then the probability of at least one A/B adjacency is

$1-\frac{1}{\binom{n}{a,b,c}} \sum_{k=0}^{a-1} \binom{a-1}{k} \binom{n-2a+k}{b} \left(\binom{c-1}{a-k} + 2\binom{c-1}{a-k-1} + \binom{c-1}{a-k-2}\right)$

the derivation of which I think is a nice puzzle in itself.
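The quoted probabilities can also be checked by a direct dynamic-programming count over arrangements, sidestepping the formula entirely (a sketch; the function names are mine):

```python
from fractions import Fraction
from functools import lru_cache
from math import comb

def adjacency_probability(a, b, c):
    """Probability that, in a uniformly random arrangement of a cards of
    rank A, b of rank B, and c other cards, some A is adjacent to some B."""
    @lru_cache(maxsize=None)
    def good(a, b, c, last):
        # Count arrangements of the remaining cards with no A/B adjacency,
        # given the type ('A', 'B', or 'C') of the previously placed card.
        if a == b == c == 0:
            return 1
        total = 0
        if a and last != 'B':
            total += good(a - 1, b, c, 'A')
        if b and last != 'A':
            total += good(a, b - 1, c, 'B')
        if c:
            total += good(a, b, c - 1, 'C')
        return total
    n = a + b + c
    arrangements = comb(n, a) * comb(n - a, b)   # multinomial coefficient
    return 1 - Fraction(good(a, b, c, 'C'), arrangements)
```

With $a=b=4$ and $c=44$ this reproduces the exact probability quoted above, and $a=4$, $b=12$, $c=36$ gives the face-card variant.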

## Card shuffling algorithms, good and bad

Introduction

This is motivated by a recent interesting post on the DataGenetics blog about various algorithms for randomly shuffling a list of elements, e.g. a deck of playing cards.  There is a similar article from 2007 at Coding Horror, and I am sure there are others as well.  The gist is that (1) a naive first attempt at implementing a shuffling algorithm is likely to be biased, yielding a non-uniform distribution of possible arrangements; but (2) fortunately, there is an equally simple algorithm that solves the problem efficiently and exactly, known as the Fisher-Yates shuffle.

This is well-trodden ground; my intent here is to expand on that basic story a bit, by digging into some of the mathematics relating these ideas about computer shuffling algorithms to the way we humans shuffle cards, with the result that some of these “broken” naive shuffling algorithms are potentially fixable.

Repeated shuffling

First, how do humans shuffle cards?  There are many different methods, but I will focus here on the riffle shuffle, where the deck is split into two halves, which are then interleaved together.  The randomness arises from the imperfection in:

1. The cut of the deck into two not necessarily equal “halves.”  This can be modeled by cutting at a position selected randomly from a binomial distribution.
2. The interleaving of the two halves, where “clumps” of cards from a single half stay together.  This can be modeled by letting cards fall into a single pile alternately from the two halves, with the probability of the next card to fall being proportional to the current number of cards remaining in the two halves.
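The two steps above are exactly the Gilbert-Shannon-Reeds model of the riffle shuffle, which can be sketched in a few lines of Python (the function name is mine):

```python
import random

def riffle(deck, rng):
    """One Gilbert-Shannon-Reeds riffle shuffle: binomial cut, then
    interleave with drop probability proportional to remaining half size."""
    n = len(deck)
    cut = sum(rng.random() < 0.5 for _ in range(n))   # binomial cut position
    left, right = deck[:cut], deck[cut:]
    out = []
    while left or right:
        # Next card falls from a half with probability proportional to
        # the number of cards currently remaining in that half.
        if rng.random() < len(left) / (len(left) + len(right)):
            out.append(left.pop(0))
        else:
            out.append(right.pop(0))
    return out

rng = random.Random(1)
shuffled = list(range(52))
for _ in range(7):            # the customary seven riffle shuffles
    shuffled = riffle(shuffled, rng)
```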

Of course, it is not sufficient to perform this shuffle just once.  For one, there are only $2^{52}$ possible permutations resulting from a single riffle shuffle of a standard 52-card poker deck (why?).  This is much less than the $52!$ total possible arrangements.  So not all arrangements are even possible with a single shuffle, let alone equally likely.

However, we can fix this problem by simple repetition– that is, just shuffle multiple times.  With more and more shuffles, eventually all arrangements become possible, and with each additional shuffle, the probability distribution of arrangements more closely approximates a uniform distribution.  Granted, it is an interesting question how many such shuffles are enough.  But in practice it turns out that seven is usually “good enough,” and in principle we can keep shuffling to get as close as desired to a uniform distribution.

So why does this work?  We started with a limited set of $2^{52}$ possible permutations corresponding to a single shuffle, and by repeatedly applying a randomly chosen permutation from that limited set, we were able to arbitrarily closely approximate a uniform distribution over all $52!$ possible permutations.  Under what conditions can we do this?  My goal in this post is to answer this question.

Computer algorithms

On a computer, shuffling can be done more efficiently.  The following Python code implements the Fisher-Yates shuffle of a list of elements, in-place, using a linear number of swaps, with each permutation being exactly equally likely:

```python
import random

def shuffle(deck):
    """Fisher-Yates shuffle."""
    for i in range(len(deck) - 1, 0, -1):
        j = random.randrange(i + 1)
        deck[i], deck[j] = deck[j], deck[i]
```


(Note that I am ignoring the important issue of integrity of the underlying random number generator, state space size, etc.  For this discussion, let us assume that we can generate random integers uniformly distributed in any given range.)
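We can verify the exact uniformity claim for a tiny deck by enumerating every possible sequence of "random" choices the algorithm can make, each of which is equally likely (a sketch; the helper name is mine):

```python
from collections import Counter

def shuffle_with_choices(deck, choices):
    """Fisher-Yates, but with the random indices j supplied explicitly."""
    deck = list(deck)
    for i, j in zip(range(len(deck) - 1, 0, -1), choices):
        deck[i], deck[j] = deck[j], deck[i]
    return tuple(deck)

# For n = 3 the algorithm makes two choices: j in 0..2, then j in 0..1,
# giving 3 * 2 = 6 equally likely choice sequences.
counts = Counter(shuffle_with_choices([0, 1, 2], (j2, j1))
                 for j2 in range(3) for j1 in range(2))
```

Each of the $3! = 6$ permutations is produced by exactly one choice sequence, so the distribution is exactly uniform.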

Compare with the following “naive” algorithm, which is simple to describe– for each position in the deck, swap the card in that position with a randomly chosen card:

```python
import random

def shuffle(deck):
    """Naive (biased) shuffling algorithm."""
    for i in range(len(deck)):
        j = random.randrange(len(deck))
        deck[i], deck[j] = deck[j], deck[i]
```


The problem with this algorithm, as discussed in the earlier linked blog posts, is that it is biased.  That is, the resulting distribution of possible permutations is not uniform; some permutations end up being more likely than others, as shown in the following figure.

Probability distribution of 5!=120 possible permutations after naive shuffle of a 5-card deck.

Each bar represents one of the 5!=120 possible permutations of a 5-card deck, arranged lexicographically from the identity permutation on the left to the “reverse” permutation on the right.  The height of each bar is proportional to the corresponding probability.  The grid line indicates what the probability of each permutation should be (1/120) when the deck is properly shuffled.
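The bias is easy to see by exact enumeration for a small deck (a sketch; the helper name is mine): for $n = 3$ the naive algorithm produces $3^3 = 27$ equally likely choice sequences, which cannot map uniformly onto $3! = 6$ permutations, since 27 is not divisible by 6.

```python
from collections import Counter
from itertools import product

def naive_shuffle_with_choices(deck, choices):
    """The naive algorithm, with the random indices supplied explicitly."""
    deck = list(deck)
    for i, j in enumerate(choices):
        deck[i], deck[j] = deck[j], deck[i]
    return tuple(deck)

# All 27 equally likely choice sequences for a 3-card deck.
counts = Counter(naive_shuffle_with_choices([0, 1, 2], c)
                 for c in product(range(3), repeat=3))
```

Every permutation is reachable, but some collect more of the 27 outcomes than others.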

Condition 1: Reachability

Another nice visualization of this bias is a heat map, with each color in an n-by-n matrix indicating the probability that a card in position $i$ ends up in position $j$.  To properly shuffle the deck, this probability should be $1/n$ for all $(i,j)$.  The DataGenetics post (near the end, I can’t link directly to the image) as well as Mike Bostock both provide good examples of this.

However, although uniformity of card positions is necessary, it is not sufficient to properly shuffle the deck.  To see why, consider the “shuffle” consisting of simply cutting the deck (and completing the cut) at a uniformly random position.  Each card is equally likely to end up in any given position, so the corresponding heat map would not show any bias.  But there are only $n$ possible cyclic permutations that result, and repetition doesn’t help: we can’t “reach” any additional permutations by repeated cuts.

This “reachability” is the first of two important criteria for a proper shuffling method: the set of permutations from which we can choose for a single shuffle should generate all possible permutations.  That is, it must be possible by repeated shuffles to eventually realize every possible permutation of cards in the deck.

So just cutting the deck fails this condition.  But going back now to the biased naive algorithm above, it does generate all possible permutations… and so it turns out that we can “fix” the bias by repeated applications of shuffle(deck), yielding closer and closer approximations of a perfectly uniform distribution of possible deck arrangements, as shown below:

Probability distribution of permutations after successive naive shuffles.

Condition 2: Spanning cosets

We were able to “fix” the naive shuffling algorithm by applying it repeatedly… but we got slightly lucky in doing so, because this isn’t always possible.  To see why, consider the following even simpler shuffle, consisting of swapping a single randomly selected pair of cards:

```python
import random

def shuffle(deck):
    """Swap a random pair of cards."""
    i, j = random.sample(range(len(deck)), 2)
    deck[i], deck[j] = deck[j], deck[i]
```


This obviously requires repeated application, since a single “shuffle” only changes the positions of two cards.  But this simple shuffle does satisfy the reachability condition: every permutation may be represented as a product of transpositions, so with enough random swaps, we can eventually realize every possible permutation.  So at first glance, it certainly seems like this algorithm should work just fine… if rather slowly.

But there is a problem.  Let’s again consider the small, manageable case of just $n=5$ cards, and look at the probability distribution over the $n!=120$ possible permutations as it evolves with each additional call to shuffle(deck)– that is, with each additional random swap of a pair of cards:

Distribution of 5!=120 permutations after successive swaps of random pairs of cards.

At the start, with 0 swaps, all of the probability is concentrated at the identity permutation at the far left.  With more and more swaps, the probability “spreads out” among the rest of the possible permutations.

Can you see the problem?  To make it more vivid, let’s re-arrange the bars in a different order:

The same evolution of probability distributions, but with permutations grouped by parity.

With each successive random swap, the probability distribution does become more and more uniform… but it alternates back and forth between two halves of the set of possible arrangements.  The problem is that permutations have parity.  Although a given permutation can be represented as a product of transpositions in many different ways, the parity (even-ness or odd-ness) of the number of such transpositions is invariant.  Before we begin shuffling, all of the probability is concentrated at the identity permutation, which is even (since it involves zero transpositions).  After the first call to shuffle(deck), the randomly chosen transposition yields an odd permutation (in the right half of the figure); after the second transposition, the resulting product is even; after the third, it jumps back to odd, etc.

This periodic behavior happens because all of the permutations in the generating set– that is, all single transpositions– are odd.  We would have a similar problem if all of the permutations in the generating set were even, except that instead of jumping back and forth as shown above, the distribution would always be confined to the even half of the permutations.
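The parity alternation can be demonstrated directly by enumerating the set of permutations reachable after each successive swap (a sketch for $n = 4$; the helper names are mine):

```python
from itertools import combinations

def parity(perm):
    """0 for an even permutation, 1 for odd, via inversion count."""
    inv = sum(1 for i in range(len(perm)) for j in range(i + 1, len(perm))
              if perm[i] > perm[j])
    return inv % 2

n = 4
swaps = list(combinations(range(n), 2))
reachable = {tuple(range(n))}    # zero swaps: just the (even) identity
parities = []
for step in range(4):
    # Apply every possible transposition to every currently reachable perm.
    nxt = set()
    for perm in reachable:
        for i, j in swaps:
            q = list(perm)
            q[i], q[j] = q[j], q[i]
            nxt.add(tuple(q))
    reachable = nxt
    parities.append({parity(p) for p in reachable})
```

After each swap, every reachable permutation has the same parity, alternating odd, even, odd, even.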

More complex periodic behavior is possible as well.  In general, the second condition needed for a proper shuffle is that the permutations in the generating set of single shuffles must not all lie in a single coset of some proper subgroup of the symmetric group of all possible permutations.  In the random swap algorithm above, this condition is violated because the set of all single transpositions lies within a single coset (the odd permutations) of the alternating group.

To summarize, a biased or incomplete shuffle isn’t necessarily hopeless.  It is possible to arbitrarily closely approximate a uniform distribution of permutations of elements of a list by repeated shuffling, as long as (1) every permutation is “reachable,” and (2) the generating set of permutations from a single shuffle does not lie entirely within one coset of some proper subgroup.


## Probabilities in Knockout: solution

This is a very brief follow-up to the previous post about the basketball game Knockout, and the advantage of starting the game at a particular position in line.  Specifically, if we start the game with $n$ equally skilled players, each of whom makes their initial shot (usually from the free throw line) with some fixed probability $p$, and any follow-on rebound shots with probability $q$, then what is the probability of winning the game as a function of starting position?

Using the state transition diagram from last time, let $P_{n,k,s}$ be the probability that, with $n$ players remaining in the game currently in state $s \in \left\{1,2,3,4,5\right\}$, the player in position $k$ (starting from zero) wins.  Then we can translate the state diagram into a corresponding system of equations using the following Mathematica code:

```mathematica
numPlayers = 2;
eq = {
    P[1, 0, 1] == 1,
    Table[
      {
        P[n, k, 1] == p P[n, Mod[k - 1, n], 1] + (1 - p) P[n, k, 2],
        P[n, k, 2] == p If[k == 0, 0,
            P[n - 1, Mod[k - 2, n - 1], 1]] + (1 - p) P[n, k, 3],
        P[n, k, 3] == q P[n, Mod[k - 1, n], 4] + (1 - q) P[n, k, 5],
        P[n, k, 4] == q P[n, Mod[k - 1, n], 1] + (1 - q) P[n, k, 2],
        P[n, k, 5] == q If[k == 0, 0,
            P[n - 1, Mod[k - 2, n - 1], 1]] + (1 - q) P[n, k, 3]
      },
      {n, 2, numPlayers},
      {k, 0, n - 1}
    ]
  } // Flatten;
```


Then we can determine the probabilities of winning for each position in line by solving for $P_{n,k,1}$ for $k=\left\{0,1,...,n-1\right\}$.

```mathematica
Table[
  P[numPlayers, k, 1],
  {k, 0, numPlayers - 1}
] /. Solve[eq, Cases[eq, _P, {2}]] // First;
```


Interestingly, in the game with just $n=2$ players, the probability of the first player winning is $1/(3-p)$, independent of $q$, as shown in the following figure.  The second player’s advantage is minimized when the initial shot is easier.

Probability of first player (red) and second player (blue) winning in two-player Knockout, vs. probability of making the initial shot from the free throw line.
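The $1/(3-p)$ claim can be checked by translating the $n=2$ equations above into a numeric linear solve (a sketch; the index bookkeeping is my own):

```python
import numpy as np

def first_player_win_prob(p, q):
    """Solve the n = 2 system above numerically.  Unknowns are P[k, s] for
    position k in {0, 1} and state s in 1..5; P[1, 0, 1] = 1 is substituted
    directly as the constant `out` for the knockout transitions."""
    idx = {(k, s): 5 * k + (s - 1) for k in range(2) for s in range(1, 6)}
    A, b = np.eye(10), np.zeros(10)
    for k in range(2):
        out = 0.0 if k == 0 else 1.0   # P[n - 1, Mod[k - 2, n - 1], 1]
        A[idx[k, 1]][idx[1 - k, 1]] -= p          # state 1
        A[idx[k, 1]][idx[k, 2]] -= 1 - p
        b[idx[k, 2]] += p * out                   # state 2
        A[idx[k, 2]][idx[k, 3]] -= 1 - p
        A[idx[k, 3]][idx[1 - k, 4]] -= q          # state 3
        A[idx[k, 3]][idx[k, 5]] -= 1 - q
        A[idx[k, 4]][idx[1 - k, 1]] -= q          # state 4
        A[idx[k, 4]][idx[k, 2]] -= 1 - q
        b[idx[k, 5]] += q * out                   # state 5
        A[idx[k, 5]][idx[k, 3]] -= 1 - q
    x = np.linalg.solve(A, b)
    return x[idx[0, 1]]
```

The result matches $1/(3-p)$ for different values of $q$, confirming both the formula and its independence of $q$.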

With more than two players, the probabilities of winning depend on both $p$ and $q$ in a more complicated way.  If we make the reasonable assumption that rebounds and put-backs are about the same difficulty, say $q=0.8$, no matter from where the initial shot is taken, then we can show similar results for $n=3$ and $n=4$, etc.

Probabilities of winning with 3 players, vs. p (with q=0.8).

Probabilities of winning with 4 players, vs. p (with q=0.8).

As mentioned last time, my intuition suggested that it’s always better to start farther back in line, but the above figures show that’s not necessarily the case, at least if the initial shot is sufficiently difficult– for example, if players start by shooting from the three-point line instead of the free throw line.
