This is a follow-up to one of last month’s posts that contained some images of randomly-generated “lozenge tilings.” The focus here is not on the tilings themselves, but on the perfectly random sampling technique due to Propp and Wilson known as “coupling from the past” (see references below) that was used to generate them. As is often the case here, I don’t have any new mathematics to contribute; the objective of this post is just to go beyond the pseudo-code descriptions of the technique in most of the literature, and provide working Python code with a couple of different example applications.
The basic problem is this: suppose that we want to sample from some probability distribution
For example, consider shuffling a deck of cards using the following iterative procedure: select a random adjacent pair of cards, and flip a coin to decide whether to put them in ascending or descending order (assuming any convenient total ordering, e.g. first by rank, then by suit alphabetically). Repeat for some “large” number of iterations.
There are two problems with this approach:
- It’s approximate; the longer we iterate the chain, the closer we get to the steady state distribution… but (usually) no matter when we stop, the distribution of the resulting state is never exactly
- How many iterations are enough? For many Markov chains, the mixing time may be difficult to analyze or even estimate.
Coupling from the past
Coupling from the past can be used to address both of these problems. For a surprisingly large class of applications, it is possible to sample exactly from the stationary distribution of a chain, by iterating multiple realizations of the chain until they are “coupled,” without needing to know ahead of time when to stop.
First, suppose that we can express the random state transition behavior of the chain using a fixed, deterministic function
In other words, given a current state
Now consider a single infinite sequence of random draws
(Notice how the more random draws we generate, the farther back in time they are used. More on this shortly.)
A key observation is that if
But what if all initial states don’t end at the same final state? This is where the single source of randomness is key: we can simply look farther back in the past, say
So all we need to do to perfectly sample from the stationary distribution is
- Choose a sufficiently large
to look far enough into the past.
realizations of the chain, one for each initial state, all starting at time and running up to time zero.
- As long as the final state
is the same for all initial states, output as the sample. Otherwise, look farther back in the past ( ), generate additional random draws accordingly, and go back to step 2.
This doesn’t seem very helpful, since step 2 requires enumerating elements of the typically enormous state space. Fortunately, in many cases, including both lozenge tiling and card shuffling, it is possible to simplify the procedure by imposing a partial order
Then instead of running the chain for all possible initial states, we can just run two chains, starting with the minimum and maximum elements in the partial order. If those two chains are coupled by time zero, then they will also “squeeze” any other initial state into the same coupling.
Following is the resulting Python implementation:
import random def monotone_cftp(mc): """Return stationary distribution sample using monotone CFTP.""" updates = [mc.random_update()] while True: s0, s1 = mc.min_max() for u in updates: s0.update(u) s1.update(u) if s0 == s1: break updates = [mc.random_update() for k in range(len(updates) + 1)] + updates return s0
Sort of like the Fisher-Yates shuffle, this is one of those algorithms that is potentially easy to implement incorrectly. In particular, note that the list of “updates,” which is traversed in order during iteration of the two chains, is prepended with additional blocks of random draws as needed to start farther back in the past. The figure below shows the resulting behavior, mapping each output of the random number generator to the time at which it is used to update the chains.
Coming back once again to the card shuffling example, the code below uses coupling from the past with random adjacent transpositions to generate an exactly uniform random permutation
class Shuffle: def __init__(self, n, descending=False): self.p = list(range(n)) if descending: self.p.reverse() def min_max(self): n = len(self.p) return Shuffle(n, False), Shuffle(n, True) def __eq__(self, other): return self.p == other.p def random_update(self): return (random.randint(0, len(self.p) - 2), random.randint(0, 1)) def update(self, u): k, c = u if (c == 1) != (self.p[k] < self.p[k + 1]): self.p[k], self.p[k + 1] = self.p[k + 1], self.p[k] print(monotone_cftp(Shuffle(52)).p)
The minimum and maximum elements of the partial order are the identity and reversal permutations, as might be expected; but it’s an interesting puzzle to determine the actual partial order that these random adjacent transpositions preserve. Consider, for example, a similar process where we select a random adjacent pair of cards, then flip a coin to decide whether to swap them or not (vs. whether to put them in ascending or descending order). This process has the same uniform stationary distribution, but won’t work when applied to monotone coupling from the past.
Re-generating vs. storing random draws
One final note: although the above implementation is relatively easy to read, it may be prohibitively expensive to store all of the random draws as they are generated. The slightly more complex implementation below is identical in output behavior, but is more space-efficient by only storing “markers” in the random stream, re-generating the draws themselves as needed when looking farther back into the past.
import random def monotone_cftp(mc): """Return stationary distribution sample using monotone CFTP.""" updates = [(random.getstate(), 1)] while True: s0, s1 = mc.min_max() rng_next = None for rng, steps in updates: random.setstate(rng) for t in range(steps): u = mc.random_update() s0.update(u) s1.update(u) if rng_next is None: rng_next = random.getstate() if s0 == s1: break updates.insert(0, (rng_next, 2 ** len(updates))) random.setstate(rng_next) return s0
- Propp, J. and Wilson, D., Exact Sampling with Coupled Markov Chains and Applications to Statistical Mechanics, Random Structures and Algorithms, 9 1996, p. 223-252 [PDF]
- Wilson, D., Mixing times of lozenge tiling and card shuffling Markov chains, Annals of Applied Probability, 14(1) 2004, p. 274-325 [PDF]