A wrong solution to the Riddler sock matching problem

From last week’s FiveThirtyEight.com Riddler column:

From Anna Kómár comes a stumper about socks: In my laundry basket, I have 14 pairs of socks that I need to pair up. To do this, I use a chair that can fit nine socks, at most. I randomly draw one clean sock at a time from the basket. If its matching counterpart is not already on the chair, then I place it in one of the nine spots. But if its counterpart is already on the chair, then I remove it from the chair (making that spot once again unoccupied) and place the folded pair in my drawer. What is the probability I can fold all 14 pairs without ever running out of room on my chair?

This week’s column describes several solution approaches, including Monte Carlo simulation and recurrence relations. But here is another candidate approach that is potentially more efficient to compute:

Associate each possible sequence of sock drawings with a north-east lattice path, from (0,0) when the basket is full, to (14,14) when the basket is empty, with a north step corresponding to drawing an unmatched sock and placing it on the chair, and an east step corresponding to drawing a sock whose counterpart is already on the chair, removing, folding, and placing the pair in the drawer.

Valid drawing sequences are represented by such a lattice path that never touches the diagonal y=x-1. (That is, we can’t have a negative number of socks on the chair.) There are {28 \choose 14}-{28 \choose 13}=2674440 of these paths (more on this later); of these, the number of successful drawing sequences, never running out of room on the chair, by also never touching the diagonal y=x+10, is

{28 \choose 14}-{28 \choose 13}-{28 \choose 4}+2{28 \choose 3}-{28 \choose 2}=2660139

Thus, the desired probability is 2660139/2674440, or about 0.994653.

This is effectively the argument made in Gilliand [1] and Pantic [2]. However, as pointed out by Paulsen [3], this incorrectly assumes that each distinct lattice path is equally likely, when in fact we are more likely to encounter paths requiring more room on the chair.

For this argument to work, we need a different problem: instead of 14 distinct (colors of) pairs of socks, suppose that we are folding 14 pairs of white athletic socks, essentially identical except for an “L” stitched on each left sock and “R” stitched on each right sock. Now, any left sock may be matched with any right sock and vice versa.

In this modified problem, valid drawing sequences are now all possible north-east lattice paths, and successful drawing sequences are paths strictly between the diagonals y=x+10 and y=x-10. (That is, multiple socks on the chair must be all left or all right.) This does work, since each equally likely drawing sequence corresponds to a distinct lattice path. (The Riddler column does note that modeling the original problem using lattice paths is “tricky, since the probabilities of transitioning from one state to the next depended on how many paired and unpaired socks remained in the basket.” This is true… but that’s not really why the lattice path calculation doesn’t work in that case– even in this modified problem, state transition probabilities also depend on the composition of socks remaining in the basket.)

This version of the problem was discussed previously here; the motivation for this post is really just to collect my notes on the general solution to these counting problems (incorrectly applied in the former, correctly in the latter).

Theoerem: Given positive integers such that a-d<b<a+u, the number of north-east lattice paths from (0,0) to (a,b) that do not intersect the diagonals y=x+u nor y=x-d is

\sum\limits_{k=\lceil-\frac{a}{u+d}\rceil}^{\lfloor\frac{b}{u+d}\rfloor}{a+b \choose a+k(u+d)} - \sum\limits_{k=\lceil-\frac{b-d}{u+d}\rceil}^{\lfloor\frac{a+d}{u+d}\rfloor}{a+b \choose a+u+k(u+d)}

Proof: First, reviewing the standard ballot-counting reflection argument: given a path to (a,b) that does first intersect diagonal y=x+u at (k,k+u), reflect the remainder of this path about that diagonal, so that the endpoint (a,b) is reflected to (b-u,a+u). This establishes a bijection between the complementary set of diagonal-intersecting paths to (a,b) and the set of all paths to (b-u,a+u).

We use this same technique here, but with repeated reflections: starting with all (unconstrained) paths to (a,b), we subtract those invalid paths that first intersect diagonal y=x+u… but we must add back paths that subsequently also intersect diagonal y=x-d, via a second reflection, etc. Similarly, we must also subtract invalid paths that first intersect diagonal y=x-d, then add back paths that subsequently intersect diagonal y=x+u, etc.

Grouping inclusion-exclusion terms by the number j of initial diagonal intersections yields

{a+b \choose a} + \sum\limits_{j \geq 1}(-1)^j \left({a+b \choose a+\lceil\frac{j}{2}\rceil u + \lfloor\frac{j}{2}\rfloor d} + {a+b \choose a-\lfloor\frac{j}{2}\rfloor u - \lceil\frac{j}{2}\rceil d}\right)

the terms of which may be rearranged to yield the difference of sums above.

References:

  1. Gilliand, S., Johnson, C., Rush, S., and Wood, D., The sock matching problem, Involve7(5) 2014, p. 691-697. [PDF]
  2. Pantic, B. and Bodroza-Pantic, O., A brief overview of the sock matching problem. [arXiv]
  3. Paulsen, W., The sock problem revisited, The College Mathematics Journal, 52(3) 2021, p. 193-203. [CMJ]