Generating Mini-Crosswords


The New York Times publishes “mini-crosswords,” which are crossword puzzles on a relatively small grid (usually 5×5), without any black squares, so that every row and column of the grid must spell a word. The figure below shows an example of a solution to such a puzzle.

How hard is it to create these puzzles? This post is motivated by a recent College Mathematics Journal article (see Reference (1) below) that considers this question, and describes an approach using the Metropolis-Hastings algorithm to randomly sample instances of puzzles.

But instead of just randomly sampling one puzzle at a time, can we actually enumerate all possible puzzles? In particular, my idea was to reduce the problem of finding a crossword puzzle solution to that of finding a (generalized) exact cover with appropriately crafted constraints. This would be handy, because we already have code for solving exact cover problems, using Knuth’s Dancing Links (DLX) algorithm (see here and here for similar past exercises).

Crossword as an exact cover

To state the problem more precisely: given a positive integer n indicating the size of the grid (n=5 in the above example), and a dictionary W of m words each of length n over alphabet \Sigma, we must construct an exact cover problem whose solution corresponds to a placement of letters in all n^2 grid positions such that each row and column spells a word in W.

To do this, we construct a zero-one matrix with 2mn rows, each corresponding to placing one of the m dictionary words either “across” (in one of the n rows of the puzzle) or “down” (in one of the n columns). A solution will consist of a subset of 2n rows of the matrix: n “across” words, one in each row of the puzzle, and n “down” words, one in each column, each pair of which intersect in the appropriate common letter.

To represent the constraints, we initially need 2n^2\cdot|\Sigma| columns in our matrix (where |\Sigma|=26), each indexed by a tuple (puzzle row i, puzzle column j, alphabet letter c, across or down). For a given matrix row– representing placement of a word with letters (w_1, w_2, ..., w_n) in a particular location and orientation in the puzzle grid– we set to one those n\cdot|\Sigma| columns corresponding to the n different (i,j) grid locations where the word will be placed in the puzzle… where for each alphabet letter c, we set the “across” column to one if and only if either

  • the word is “across” and c matches the letter of the word in this location (i.e., c=w_j), or
  • the word is “down” and c does not match the letter of the word in this location (i.e., c \neq w_i).

If neither of these conditions is satisfied, we instead set the “down” column to one. The following Python code produces the number and list of (row, column) pairs of the resulting sparse matrix.

letters = 'abcdefghijklmnopqrstuvwxyz'
m = len(words)
n = len(words[0])
print(m * n * 2 * (n * 26), file=file)
b = [True, False]
for row, ((w, word), k, across) in enumerate(
            product(enumerate(words), range(n), b)):
    for col, (i, j, letter, horiz) in enumerate(
            product(range(n), range(n), letters, b)):
        if ((i if across else j) == k and
            (word[j if across else i] == letter) == (across == horiz)):
            print(row, col, file=file)

However, we’re not quite finished. Although the desired end result is a crossword with distinct words in each row and column, such as the 6×6 solution shown in (a) below, as Howard observes in the CMJ article, there are many valid solutions that use the same word more than once, including the extreme cases of symmetricword squares” such as the one shown in (b) below.

(a) 6×6 mini-crossword, where the word in each row and column is unique. (b) 6×6 “word square,” a symmetric grid with each row and corresonding column containing the same word.

We can eliminate this duplication by adding m “optional” columns to the zero-one matrix, one for each word in the dictionary, and solve the resulting generalized exact cover problem, so that each word may be used at most once in a solution.


All of the source code is available here, as well as on GitHub. My initial test used the 4×4 case discussed in Howard’s paper, with his dictionary of 1826 words. He describes a process for estimating the total number of possible puzzles by repeated sampling using the Markov chain Monte Carlo approach: “We estimate that there are approximately
73,000–74,000 distinct puzzles each with no repeated words.” This is pretty accurate; it turns out that there are exactly 74,339 (each contributing two symmetric pairs of solutions to the generalized exact cover problem, for a total of 148,678 solutions).


  1. Howard, C. Douglas, It’s Puzzling, College Mathematics Journal, 49(4) September 2018, p. 242-249 [DOI]
  2. Knuth, D., Dancing Links, Millenial Perspectives in Computer Science, 2000, p. 187-214 (arXiv)


Posted in Uncategorized | Leave a comment

Disabling subnormals in MATLAB

Suppose that we want to compute the following expression, somewhat contrived in complexity for the purpose of example:

y = \frac{1}{s}\sum_{i=1}^{10^7} \frac{s}{2^{50}}

The following MATLAB code implements this formula, and measures the time required to evaluate it:

scale = 1;
y = 0;
for i = 1:10000000
    x = scale;
    for j = 1:50
        x = x * 0.5;
    y = y + x;
y = y / scale;

Now suppose that we execute this same code again, but this time changing the “scale” factor s to a much smaller value: scale = realmin, corresponding to s=2^{-1022}, the smallest positive normalized floating-point number. Inspection of the formula above suggests that the value of y should not depend on the changed value of s (as long as it is non-zero); we may spend most of our time working with much smaller numbers, but the end result should be the same.

And indeed, execution of the modified code confirms that we get exactly the same result… but it takes nearly 20 times longer to do so on my laptop than the original version with s=1. And there are more complex– and less contrived– calculations where the difference in performance is even greater.

The problem is that these smaller numbers are subnormal, small enough in magnitude to require a slightly different encoding than “normal” numbers which make up most of the range of representable floating-point numbers. Arithmetic operations can be significantly more expensive when required to recognize and manipulate this “special case” encoding.

Depending on your application, there may be several approaches to handling this problem:

  1. Rewrite your code to prevent encountering subnormals in the first place. In the above contrived example, this is easy to do: just shift the “scale,” or magnitude, of all values involved in the computation away from the subnormal range (and possibly shifting back only at the end if necessary). This can not only result in faster code, but more accurate results, since subnormal numbers have fewer “significant” mantissa bits in their representation.
  2. Disable subnormal numbers altogether, so that for any floating-point operation, input arguments or output results that would otherwise be treated as subnormal are instead “flushed” to zero.

We have seen above how to manage (1). For (2), the following MEX function does nothing but set the appropriate processor flags to disable subnormals. I have only tested this on Windows 7 with an Intel laptop, compiling in MATLAB R2017b with both Microsoft Visual Studio 2015 as well as the MinGW-w64 MATLAB Add-On (edit: a reader has also tried this on Linux Mint 19 with MATLAB R2018b and GCC 7.3.0):

#include <xmmintrin.h>
#include <pmmintrin.h>
#include "mex.h"

void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
    _mm_setcsr((_mm_getcsr() & ~0x8000) | (0x8000));
    _mm_setcsr((_mm_getcsr() & ~0x0040) | (0x0040));

After running this MEX function in a MATLAB session, re-running the modified calculation above gets all of the speed back… but now at a different cost: instead of the correct value y=10^7/2^{50}, every term in the sum has been flushed to zero, resulting in a final incorrect value of y=0.

If there is any moral to this story, it’s that you’re a test pilot. First, this was a very simple test setup; it’s an interesting question whether any MATLAB built-in functions might reset these flags back to the slower subnormal support, and whether it is feasible in your application to reset them back again, possibly repeatedly as needed. And second, even after any algorithm refactoring to minimize the introduction of subnormals, can your application afford the loss of accuracy resulting from flushing them to zero? MathWorks’ Cleve Moler seems to think the answer is always yes. I think the right answer is, it depends.


Posted in Uncategorized | Leave a comment

A lattice path puzzle

This past week’s Riddler puzzle on FiveThirtyEight asks for the number of different paths of minimum length from a starting intersection of city streets to a destination m blocks east and n blocks north. Put another way, moving on the 2D integer lattice graph, how many paths are there from the origin (0,0) to vertex (m,n) that are of minimum length?

Constraining the paths to minimum length greatly simplifies the problem. So let’s generalize, and instead ask for the number of paths from (0,0) to (m,n) of length k— so that the original problem asks for the particular case k=|m|+|n|, but what if we allow longer paths where we sometimes move in the “wrong” direction away from the destination?

I think this is a nice problem, with an elegant solution only slightly more complex than the original posed in the Riddler column. As a hint, the animation below visualizes the result, where the path length k increases with each frame, showing the probability distribution of the endpoint of a 2D random walk.

Probability distribution of endpoint of 2D lattice random walk, vs. number of steps.

Perhaps as another hint, note the checkerboard pattern to the distribution; only “half” of the vertices are reachable for a particular path length k, and which half is reachable alternates as k increases.

Posted in Uncategorized | Leave a comment

Arbitrary-precision rational arithmetic in C++


This is a follow-up to a post from several years ago describing a C++ implementation of arbitrary-precision unsigned integer arithmetic. This weekend I extended this to also support arbitrary-precision signed integers and rational numbers. Although this started as an educational tool, it now feels a bit more complete, and actually usable for the combinatorics and probability applications of the sort that are frequently discussed here.

I tried to stick to the original objectives of relatively simple and hopefully readable code, with stand-alone, header-only implementation, as freely available in the public domain as legally possible.

The code is available here, as well as on GitHub, in three header files:

  • #include "math_Unsigned.h" defines a math::Unsigned type representing the natural numbers with all of the sensible arithmetic, bitwise, and relational operators, essentially everything except bitwise one’s complement… although more on this shortly.
  • #include "math_Integer.h" defines an Integer type with a sign-and-magnitude implementation in terms of Unsigned, with all corresponding operators, including bitwise operators having two’s complement semantics assuming “infinite sign extension.”
  • #include "math_Rational.h" defines a Rational type implemented in terms of Integer numerator and denominator.

This was a fun exercise; there were interesting challenges in developing each of the three classes. As discussed previously, the unsigned type handles the actual arbitrary-precision representation (implemented as a vector<uint32_t> of digits in base 2^{32}), where division is by far the most complex operation to implement efficiently.

The implementation of the signed integer type is relatively straightforward… except for the bitwise operators. Assuming a sign-and-magnitude representation (using an Unsigned under the hood), it is an interesting exercise to work out how to implement bitwise and, or, xor, and not, so that they have two’s complement semantics even for negative operands. In the process, I had to add an “AND NOT” operator to the original underlying unsigned type (there is actually a built-in operator &^ for this in Go).

With this machinery in place, the rational type is the simplest to implement. The only wrinkle here is that a few additional constructors are needed, since user-defined conversions from the more primitive integral types (e.g., Rational from Integer, Integer from int32_t, etc.) are only implicitly applied “one level deep.”

Example application: Are seven riffle shuffles enough?

To test and demonstrate use of these classes, consider riffle shuffling a standard poker deck of 52 playing cards. How many shuffles are sufficient to “fully randomize” the deck? A popular rule of thumb, attributed to Bayer and Diaconis, is that seven riffle shuffles are recommended. (See a longer list of references here, along with some simpler counting arguments that at least six shuffles are certainly necessary.)

This recommendation is based on analysis of the Gilbert-Shannon-Reeds model of a single riffle shuffle, and of the total variation distance between probability distributions Q^m and U, where Q^m is the distribution of arrangements of the deck after m GSR riffle shuffles, and U is the desired uniform distribution where every arrangement is equally likely. We can compute this total variation distance exactly as a function of the number m of shuffles, as demonstrated in the following example code:

#include "math_Rational.h"
#include <iostream>
using namespace math;

Integer factorial(int n)
    Integer f = 1;
    for (int k = 1; k <= n; ++k)
        f *= k;
    return f;

Integer binomial(int n, int k)
    if (0 <= k && k <= n)
        return factorial(n) / factorial(k) / factorial(n - k);
        return 0;

Integer power(int base, int exp)
    Integer n = 1;
    for (int k = 0; k < exp; ++k)
        n *= base;
    return n;

Integer eulerian(int n, int k)
    Integer r = 0;
    for (int j = 0; j < k + 2; ++j)
        r += (power(-1, j) * binomial(n + 1, j) * power(k + 1 - j, n));
    return r;

Rational total_variation_distance(int cards, int shuffles)
    Rational q = 0;
    for (int r = 1; r <= cards; ++r)
        Rational a = Rational(
            binomial((1 << shuffles) + cards - r, cards),
            Integer(1) << (cards * shuffles)) - Rational(1, factorial(cards));
        q += (eulerian(cards, r - 1) * (a < 0 ? -a : a));
    return q / 2;

int main()
    int cards = 52;
    for (int shuffles = 0; shuffles <= 15; ++shuffles)
        std::cout << shuffles << " " <<
            total_variation_distance(cards, shuffles).to_double() << std::endl;

The following figure shows the results. Total variation distance ranges from a maximum of one (between discrete distributions with disjoint support) to a minimum of zero, in this case corresponding to an exactly uniform distribution of arrangements of the deck.

Total variation distance vs. number of GSR riffle shuffles of a standard 52-card deck.

We can see the sharp threshold behavior, where total variation distance transitions from near one to near zero over just a few shuffles, first dropping below 1/2 at seven shuffles.

Posted in Uncategorized | 3 Comments

Uncertainty in trading passengers for fuel

I had an interesting experience recently while preparing for a flight from Los Angeles to Baltimore. It was a completely full flight– initially, at least– with myself and 174 other passengers who had already boarded the Southwest 737-800, seemingly ready to push back and get on our way.

However, after a delay of several minutes, a flight attendant came on the PA and asked for two– specifically two– volunteers to give up their seat, in exchange for a flight later that afternoon. Two people immediately jumped up, left the airplane, and then we were ready to go… now with two empty seats.

The problem was weight: due to a changing forecast of bad weather, both in Baltimore and en route, we had taken on additional fuel at the last minute (e.g., to allow for diverting to a possibly now-more-distant alternate airport), resulting in the airplane exceeding its maximum takeoff weight. Something had to go, and apparently two passengers and their carry-on bags were a sufficient reduction in weight to allow us to take off.

What I found interesting about this episode was the relative precision of the change– 175 (or even 174) passengers bad, 173 passengers good– compared with the uncertainty in the total weight of the passengers, personal items, and carry-on bags remaining on board. That is, how does the airline know how much we weigh? Since Southwest does not ask individual passengers for their weight, let alone ask them to step on an actual scale prior to boarding, some method of estimation is required.

The FAA provides guidance on how to do this (see reference below): for large-cabin aircraft, the assumed average weight of an adult passenger, his or her clothing, personal items, and a carry-on bag is 190 pounds, with a standard deviation of 47 pounds. The figure below shows the resulting probability distribution of the total weight of all 175 passengers on the initially completely full flight:

Distribution of total weight of 175 passengers on a Southwest Boeing 737-800.

It’s worth noting that the referenced Advisory Circular does provide a more detailed breakdown of assumed average passenger weight, to account for season of travel (5 more pounds of clothing in the winter), gender, children vs. adults, and “nonstandard weight groups” such as sports teams, etc. But for this summer flight, with a relatively even split of male and female passengers, the only simplifying assumption in the above figure is no kids.

The point is that this seems like a significant amount of uncertainty in the actual total weight of the airplane, for less than 400 pounds to be the difference between “Nope, we’re overweight” and “Okay, we’re safe to take off.”


  • Federal Aviation Administration Advisory Circular AC-120-27E, “Aircraft Weight and Balance Control,” 10 June 2005 [PDF]
Posted in Uncategorized | 4 Comments

An urn puzzle

It has been a while again since I last posted a puzzle, so…

You have once again been captured by mathematically-inclined pirates and threatened with walking the plank, unless you can win the following game: some number of black balls and white balls will be placed in an urn. The captain will randomly select and discard a ball from the urn, noting its color, then repeatedly draw and discard additional balls as long as they are the same color. The first drawn ball of a different color will be returned to the urn, and the whole process will be repeated. And repeated, and repeated, until the urn is empty. If the last ball drawn from the urn is black, you must walk the plank; if it is white, you will be set free.

You can choose any positive numbers b and w of black and white balls, respectively, to be placed in the urn at the start. How many of each should you start with to maximize your probability of survival?

Posted in Uncategorized | 2 Comments

On average, we die a decade earlier than expected

“Doctors say he’s got a 50/50 chance of living… though there’s only a 10% chance of that.”

I’ve lately had occasion to contemplate my own mortality. How long should I expect to live? The most recent life table published by the Centers for Disease Control (see the reference at the end of this post) indicates an expected lifespan of 76.5 years for a male. This is based on a model of age at death as a random variable X with the probability density shown in the following figure.

Probability distributions of age at death based on the United States period life table for 2014.

The expected lifespan of 76.5 years is E[X] (using the red curve for males). In other words, if we observed a large number of hypothetical (male) infants born in the reference period 2014– and they continued to experience 2014 mortality rates throughout their lifetimes– then their ages at death would follow the above distribution, with an average of 76.5 years.

However, I have more information now: I have already survived roughly four decades of life. So it makes sense to ask, what is my conditional expected age at death, given that I have already survived to, say, age 40? In other words, what is E[X | X \geq 40]?

This value is 78.8 years; I can expect to live to a greater age now than I thought I would when I was first born. The following figure shows this conditional expected age at death E[X | X \geq x], as well as the corresponding expected additional lifespan E[X-x | X \geq x], as a function of current age x.

Conditional expected age at death and expected additional lifespan, vs. current age.

For another example, suppose that I survive to age 70. Instead of expecting just another 6.5 years, my expected additional lifespan has jumped to 14.5 years.

Which brings us to the interesting observation motivating this post: suppose instead that I die at age 70. I will have missed out on an additional 14.5 years of life on average, compared to the rest of the septuagenarians around me. Put another way, at the moment of my death, I perceive that I am dying 14.5 years earlier than expected.

But this perceived “loss” always occurs, no matter when we die! (In terms of the above figure, the expected value E[X-x | X \geq x] is always positive.) We can average this effect over the entire population, and find that on average males die 12.2 years earlier than expected, and females die 10.8 years earlier than expected.


  1. Arias, E., United States Life Tables 2014, National Vital Statistics Reports, 66(4) August 2017 [PDF]

Following are the probabilities P(\left\lfloor{X}\right\rfloor = x) for the United States 2014 period life table used in this post, derived from the NVSR data in the above reference, extended to maximum age 120 using the methodology described in the technical notes.

Age   P(all)              P(male)             P(female)
  0 0.005831            0.006325            0.005313
  1 0.000367843         0.000391508         0.000343167
  2 0.000246463         0.000276133         0.000216767
  3 0.000182814         0.000206546         0.000157072
  4 0.000156953         0.000183668         0.000129216
  5 0.000141037         0.000160804         0.000120255
  6 0.000125127         0.000142914         0.000106328
  7 0.000112203         0.000128008         0.0000963806
  8 0.000100276         0.000112117         0.0000884231
  9 0.0000913317        0.0000992073        0.0000834481
 10 0.0000883454        0.0000932456        0.0000824477
 11 0.0000952854        0.000103156         0.0000874072
 12 0.000119095         0.000137857         0.000101304
 13 0.000164729         0.000203286         0.000124134
 14 0.000227209         0.000294457         0.000155893
 15 0.000293617         0.000391501         0.000190617
 16 0.000362946         0.000491412         0.000227306
 17 0.000442117         0.000609009         0.000265957
 18 0.000529115         0.000743227         0.000302594
 19 0.000616971         0.000881116         0.000338207
 20 0.00070566          0.00101964          0.000373784
 21 0.000786255         0.00114394          0.000408332
 22 0.000847887         0.0012343           0.000438875
 23 0.000886654         0.00128494          0.000465417
 24 0.000909534         0.00130883          0.000490933
 25 0.000929392         0.0013228           0.0005174
 26 0.000952147         0.00134161          0.000545804
 27 0.000977786         0.00136426          0.00057515
 28 0.0010063           0.00139366          0.000605432
 29 0.00103864          0.00142878          0.00063566
 30 0.00107285          0.0014657           0.000668788
 31 0.00110792          0.00150147          0.000705793
 32 0.00114483          0.0015361           0.000745674
 33 0.00118454          0.00156959          0.000792356
 34 0.00122898          0.00160582          0.000845811
 35 0.00128495          0.00165443          0.000908956
 36 0.00135141          0.00171632          0.000981746
 37 0.00142538          0.00178654          0.00106021
 38 0.00150387          0.0018631           0.00114136
 39 0.00158781          0.00194786          0.00122516
 40 0.00168489          0.00204935          0.00131843
 41 0.00179886          0.00217314          0.00142304
 42 0.00192761          0.00231898          0.00153401
 43 0.00207581          0.00249327          0.00165615
 44 0.00224899          0.00270038          0.00179419
 45 0.00243725          0.00292846          0.00194116
 46 0.00265083          0.0031884           0.00210855
 47 0.0029103           0.00350311          0.00231349
 48 0.00321588          0.00387243          0.0025555
 49 0.0035468           0.00427362          0.00281771
 50 0.00387592          0.00467212          0.00307957
 51 0.00420287          0.00507013          0.00333606
 52 0.00454693          0.00549737          0.00359937
 53 0.00492128          0.00597165          0.0038758
 54 0.00532664          0.00649003          0.00417147
 55 0.00575619          0.00703581          0.00448667
 56 0.00619215          0.00758275          0.00481225
 57 0.00662626          0.00813029          0.00513737
 58 0.00705499          0.00866988          0.00545972
 59 0.00748745          0.00921066          0.00578812
 60 0.00794918          0.00978879          0.0061402
 61 0.0084469           0.010399            0.00653116
 62 0.0089597           0.010994            0.00696556
 63 0.00947691          0.0115436           0.00744933
 64 0.0100035           0.0120603           0.00797895
 65 0.0105466           0.0125691           0.00854801
 66 0.0111425           0.0131347           0.00916566
 67 0.0118165           0.0137895           0.00985298
 68 0.0126025           0.0145881           0.010625
 69 0.0135386           0.0155721           0.0115183
 70 0.014622            0.016711            0.0125556
 71 0.0157853           0.0179169           0.0136863
 72 0.0169733           0.0191484           0.0148415
 73 0.0181664           0.0203416           0.0160437
 74 0.0193544           0.0214907           0.0172763
 75 0.0205581           0.0226235           0.0185559
 76 0.0219039           0.023887            0.0199909
 77 0.0233782           0.0252875           0.0215559
 78 0.0249405           0.0266573           0.0233399
 79 0.0266659           0.0281283           0.0253501
 80 0.0283006           0.0295587           0.0272207
 81 0.0298041           0.0307938           0.0290306
 82 0.0311707           0.0318902           0.0307088
 83 0.0326118           0.0329808           0.0325375
 84 0.0338734           0.0336728           0.0344093
 85 0.0348103           0.0342521           0.0357896
 86 0.0356915           0.0345144           0.0373244
 87 0.036144            0.0342741           0.0384714
 88 0.0361034           0.0334914           0.0391388
 89 0.0355212           0.0321521           0.0392438
 90 0.0343716           0.0302738           0.0387212
 91 0.0326583           0.0279093           0.0375332
 92 0.0304192           0.0251463           0.0356786
 93 0.0277276           0.0221028           0.0331989
 94 0.02469             0.0189177           0.030181
 95 0.0214386           0.0157381           0.0267542
 96 0.0181203           0.0127037           0.0230805
 97 0.0148823           0.00993297          0.0193396
 98 0.011857            0.00751144          0.0157101
 99 0.00914934          0.00548606          0.0123497
100 0.00682791          0.0038652           0.00937893
101 0.0049216           0.00262443          0.0068711
102 0.00342273          0.00171608          0.00484973
103 0.00229461          0.00108015          0.00329442
104 0.00148199          0.000654343         0.00215221
105 0.000921789         0.000381551         0.00135156
106 0.000552114         0.000214241         0.000815772
107 0.00031851          0.000115918         0.00047333
108 0.000177059         0.0000604924        0.000264139
109 0.0000949139        0.0000304833        0.000141878
110 0.0000491106        0.0000148533        0.0000734294
111 0.0000245565        0.00000700891       0.0000366663
112 0.0000118822        0.00000320819       0.0000176914
113 0.00000557214       0.00000142697       0.00000826206
114 0.00000253659       0.000000617876      0.00000374136
115 0.00000112287       0.000000260924      0.00000164594
116 0.0000004842        0.00000010766       0.00000070483
117 0.000000203758      0.0000000434809     0.000000294373
118 0.0000000838243     0.0000000172192     0.000000120143
119 0.0000000337717     0.0000000066978     0.0000000480077
120 0.0000000216956     0.00000000409582    0.0000000304297
Posted in Uncategorized | 1 Comment