Disabling subnormals in MATLAB

Suppose that we want to compute the following expression, somewhat contrived in complexity for the purpose of example:

y = \frac{1}{s}\sum_{i=1}^{10^7} \frac{s}{2^{50}}

The following MATLAB code implements this formula, and measures the time required to evaluate it:

scale = 1;
y = 0;
for i = 1:10000000
    x = scale;
    for j = 1:50
        x = x * 0.5;
    y = y + x;
y = y / scale;

Now suppose that we execute this same code again, but this time changing the “scale” factor s to a much smaller value: scale = realmin, corresponding to s=2^{-1022}, the smallest positive normalized floating-point number. Inspection of the formula above suggests that the value of y should not depend on the changed value of s (as long as it is non-zero); we may spend most of our time working with much smaller numbers, but the end result should be the same.

And indeed, execution of the modified code confirms that we get exactly the same result… but it takes nearly 20 times longer to do so on my laptop than the original version with s=1. And there are more complex– and less contrived– calculations where the difference in performance is even greater.

The problem is that these smaller numbers are subnormal, small enough in magnitude to require a slightly different encoding than “normal” numbers which make up most of the range of representable floating-point numbers. Arithmetic operations can be significantly more expensive when required to recognize and manipulate this “special case” encoding.

Depending on your application, there may be several approaches to handling this problem:

  1. Rewrite your code to prevent encountering subnormals in the first place. In the above contrived example, this is easy to do: just shift the “scale,” or magnitude, of all values involved in the computation away from the subnormal range (and possibly shifting back only at the end if necessary). This can not only result in faster code, but more accurate results, since subnormal numbers have fewer “significant” mantissa bits in their representation.
  2. Disable subnormal numbers altogether, so that for any floating-point operation, input arguments or output results that would otherwise be treated as subnormal are instead “flushed” to zero.

We have seen above how to manage (1). For (2), the following MEX function does nothing but set the appropriate processor flags to disable subnormals. I have only tested this on Windows 7 with an Intel laptop, compiling in MATLAB R2017b with both Microsoft Visual Studio 2015 as well as the MinGW-w64 MATLAB Add-On (edit: a reader has also tried this on Linux Mint 19 with MATLAB R2018b and GCC 7.3.0):

#include <xmmintrin.h>
#include <pmmintrin.h>
#include "mex.h"

void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
    _mm_setcsr((_mm_getcsr() & ~0x8000) | (0x8000));
    _mm_setcsr((_mm_getcsr() & ~0x0040) | (0x0040));

After running this MEX function in a MATLAB session, re-running the modified calculation above gets all of the speed back… but now at a different cost: instead of the correct value y=10^7/2^{50}, every term in the sum has been flushed to zero, resulting in a final incorrect value of y=0.

If there is any moral to this story, it’s that you’re a test pilot. First, this was a very simple test setup; it’s an interesting question whether any MATLAB built-in functions might reset these flags back to the slower subnormal support, and whether it is feasible in your application to reset them back again, possibly repeatedly as needed. And second, even after any algorithm refactoring to minimize the introduction of subnormals, can your application afford the loss of accuracy resulting from flushing them to zero? MathWorks’ Cleve Moler seems to think the answer is always yes. I think the right answer is, it depends.


Posted in Uncategorized | Leave a comment

A lattice path puzzle

This past week’s Riddler puzzle on FiveThirtyEight asks for the number of different paths of minimum length from a starting intersection of city streets to a destination m blocks east and n blocks north. Put another way, moving on the 2D integer lattice graph, how many paths are there from the origin (0,0) to vertex (m,n) that are of minimum length?

Constraining the paths to minimum length greatly simplifies the problem. So let’s generalize, and instead ask for the number of paths from (0,0) to (m,n) of length k— so that the original problem asks for the particular case k=|m|+|n|, but what if we allow longer paths where we sometimes move in the “wrong” direction away from the destination?

I think this is a nice problem, with an elegant solution only slightly more complex than the original posed in the Riddler column. As a hint, the animation below visualizes the result, where the path length k increases with each frame, showing the probability distribution of the endpoint of a 2D random walk.

Probability distribution of endpoint of 2D lattice random walk, vs. number of steps.

Perhaps as another hint, note the checkerboard pattern to the distribution; only “half” of the vertices are reachable for a particular path length k, and which half is reachable alternates as k increases.

Posted in Uncategorized | Leave a comment

Arbitrary-precision rational arithmetic in C++


This is a follow-up to a post from several years ago describing a C++ implementation of arbitrary-precision unsigned integer arithmetic. This weekend I extended this to also support arbitrary-precision signed integers and rational numbers. Although this started as an educational tool, it now feels a bit more complete, and actually usable for the combinatorics and probability applications of the sort that are frequently discussed here.

I tried to stick to the original objectives of relatively simple and hopefully readable code, with stand-alone, header-only implementation, as freely available in the public domain as legally possible.

The code is available here, as well as on GitHub, in three header files:

  • #include "math_Unsigned.h" defines a math::Unsigned type representing the natural numbers with all of the sensible arithmetic, bitwise, and relational operators, essentially everything except bitwise one’s complement… although more on this shortly.
  • #include "math_Integer.h" defines an Integer type with a sign-and-magnitude implementation in terms of Unsigned, with all corresponding operators, including bitwise operators having two’s complement semantics assuming “infinite sign extension.”
  • #include "math_Rational.h" defines a Rational type implemented in terms of Integer numerator and denominator.

This was a fun exercise; there were interesting challenges in developing each of the three classes. As discussed previously, the unsigned type handles the actual arbitrary-precision representation (implemented as a vector<uint32_t> of digits in base 2^{32}), where division is by far the most complex operation to implement efficiently.

The implementation of the signed integer type is relatively straightforward… except for the bitwise operators. Assuming a sign-and-magnitude representation (using an Unsigned under the hood), it is an interesting exercise to work out how to implement bitwise and, or, xor, and not, so that they have two’s complement semantics even for negative operands. In the process, I had to add an “AND NOT” operator to the original underlying unsigned type (there is actually a built-in operator &^ for this in Go).

With this machinery in place, the rational type is the simplest to implement. The only wrinkle here is that a few additional constructors are needed, since user-defined conversions from the more primitive integral types (e.g., Rational from Integer, Integer from int32_t, etc.) are only implicitly applied “one level deep.”

Example application: Are seven riffle shuffles enough?

To test and demonstrate use of these classes, consider riffle shuffling a standard poker deck of 52 playing cards. How many shuffles are sufficient to “fully randomize” the deck? A popular rule of thumb, attributed to Bayer and Diaconis, is that seven riffle shuffles are recommended. (See a longer list of references here, along with some simpler counting arguments that at least six shuffles are certainly necessary.)

This recommendation is based on analysis of the Gilbert-Shannon-Reeds model of a single riffle shuffle, and of the total variation distance between probability distributions Q^m and U, where Q^m is the distribution of arrangements of the deck after m GSR riffle shuffles, and U is the desired uniform distribution where every arrangement is equally likely. We can compute this total variation distance exactly as a function of the number m of shuffles, as demonstrated in the following example code:

#include "math_Rational.h"
#include <iostream>
using namespace math;

Integer factorial(int n)
    Integer f = 1;
    for (int k = 1; k <= n; ++k)
        f *= k;
    return f;

Integer binomial(int n, int k)
    if (0 <= k && k <= n)
        return factorial(n) / factorial(k) / factorial(n - k);
        return 0;

Integer power(int base, int exp)
    Integer n = 1;
    for (int k = 0; k < exp; ++k)
        n *= base;
    return n;

Integer eulerian(int n, int k)
    Integer r = 0;
    for (int j = 0; j < k + 2; ++j)
        r += (power(-1, j) * binomial(n + 1, j) * power(k + 1 - j, n));
    return r;

Rational total_variation_distance(int cards, int shuffles)
    Rational q = 0;
    for (int r = 1; r <= cards; ++r)
        Rational a = Rational(
            binomial((1 << shuffles) + cards - r, cards),
            Integer(1) << (cards * shuffles)) - Rational(1, factorial(cards));
        q += (eulerian(cards, r - 1) * (a < 0 ? -a : a));
    return q / 2;

int main()
    int cards = 52;
    for (int shuffles = 0; shuffles <= 15; ++shuffles)
        std::cout << shuffles << " " <<
            total_variation_distance(cards, shuffles).to_double() << std::endl;

The following figure shows the results. Total variation distance ranges from a maximum of one (between discrete distributions with disjoint support) to a minimum of zero, in this case corresponding to an exactly uniform distribution of arrangements of the deck.

Total variation distance vs. number of GSR riffle shuffles of a standard 52-card deck.

We can see the sharp threshold behavior, where total variation distance transitions from near one to near zero over just a few shuffles, first dropping below 1/2 at seven shuffles.

Posted in Uncategorized | 3 Comments

Uncertainty in trading passengers for fuel

I had an interesting experience recently while preparing for a flight from Los Angeles to Baltimore. It was a completely full flight– initially, at least– with myself and 174 other passengers who had already boarded the Southwest 737-800, seemingly ready to push back and get on our way.

However, after a delay of several minutes, a flight attendant came on the PA and asked for two– specifically two– volunteers to give up their seat, in exchange for a flight later that afternoon. Two people immediately jumped up, left the airplane, and then we were ready to go… now with two empty seats.

The problem was weight: due to a changing forecast of bad weather, both in Baltimore and en route, we had taken on additional fuel at the last minute (e.g., to allow for diverting to a possibly now-more-distant alternate airport), resulting in the airplane exceeding its maximum takeoff weight. Something had to go, and apparently two passengers and their carry-on bags were a sufficient reduction in weight to allow us to take off.

What I found interesting about this episode was the relative precision of the change– 175 (or even 174) passengers bad, 173 passengers good– compared with the uncertainty in the total weight of the passengers, personal items, and carry-on bags remaining on board. That is, how does the airline know how much we weigh? Since Southwest does not ask individual passengers for their weight, let alone ask them to step on an actual scale prior to boarding, some method of estimation is required.

The FAA provides guidance on how to do this (see reference below): for large-cabin aircraft, the assumed average weight of an adult passenger, his or her clothing, personal items, and a carry-on bag is 190 pounds, with a standard deviation of 47 pounds. The figure below shows the resulting probability distribution of the total weight of all 175 passengers on the initially completely full flight:

Distribution of total weight of 175 passengers on a Southwest Boeing 737-800.

It’s worth noting that the referenced Advisory Circular does provide a more detailed breakdown of assumed average passenger weight, to account for season of travel (5 more pounds of clothing in the winter), gender, children vs. adults, and “nonstandard weight groups” such as sports teams, etc. But for this summer flight, with a relatively even split of male and female passengers, the only simplifying assumption in the above figure is no kids.

The point is that this seems like a significant amount of uncertainty in the actual total weight of the airplane, for less than 400 pounds to be the difference between “Nope, we’re overweight” and “Okay, we’re safe to take off.”


  • Federal Aviation Administration Advisory Circular AC-120-27E, “Aircraft Weight and Balance Control,” 10 June 2005 [PDF]
Posted in Uncategorized | 4 Comments

An urn puzzle

It has been a while again since I last posted a puzzle, so…

You have once again been captured by mathematically-inclined pirates and threatened with walking the plank, unless you can win the following game: some number of black balls and white balls will be placed in an urn. The captain will randomly select and discard a ball from the urn, noting its color, then repeatedly draw and discard additional balls as long as they are the same color. The first drawn ball of a different color will be returned to the urn, and the whole process will be repeated. And repeated, and repeated, until the urn is empty. If the last ball drawn from the urn is black, you must walk the plank; if it is white, you will be set free.

You can choose any positive numbers b and w of black and white balls, respectively, to be placed in the urn at the start. How many of each should you start with to maximize your probability of survival?

Posted in Uncategorized | 2 Comments

On average, we die a decade earlier than expected

“Doctors say he’s got a 50/50 chance of living… though there’s only a 10% chance of that.”

I’ve lately had occasion to contemplate my own mortality. How long should I expect to live? The most recent life table published by the Centers for Disease Control (see the reference at the end of this post) indicates an expected lifespan of 76.5 years for a male. This is based on a model of age at death as a random variable X with the probability density shown in the following figure.

Probability distributions of age at death based on the United States period life table for 2014.

The expected lifespan of 76.5 years is E[X] (using the red curve for males). In other words, if we observed a large number of hypothetical (male) infants born in the reference period 2014– and they continued to experience 2014 mortality rates throughout their lifetimes– then their ages at death would follow the above distribution, with an average of 76.5 years.

However, I have more information now: I have already survived roughly four decades of life. So it makes sense to ask, what is my conditional expected age at death, given that I have already survived to, say, age 40? In other words, what is E[X | X \geq 40]?

This value is 78.8 years; I can expect to live to a greater age now than I thought I would when I was first born. The following figure shows this conditional expected age at death E[X | X \geq x], as well as the corresponding expected additional lifespan E[X-x | X \geq x], as a function of current age x.

Conditional expected age at death and expected additional lifespan, vs. current age.

For another example, suppose that I survive to age 70. Instead of expecting just another 6.5 years, my expected additional lifespan has jumped to 14.5 years.

Which brings us to the interesting observation motivating this post: suppose instead that I die at age 70. I will have missed out on an additional 14.5 years of life on average, compared to the rest of the septuagenarians around me. Put another way, at the moment of my death, I perceive that I am dying 14.5 years earlier than expected.

But this perceived “loss” always occurs, no matter when we die! (In terms of the above figure, the expected value E[X-x | X \geq x] is always positive.) We can average this effect over the entire population, and find that on average males die 12.2 years earlier than expected, and females die 10.8 years earlier than expected.


  1. Arias, E., United States Life Tables 2014, National Vital Statistics Reports, 66(4) August 2017 [PDF]

Following are the probabilities P(\left\lfloor{X}\right\rfloor = x) for the United States 2014 period life table used in this post, derived from the NVSR data in the above reference, extended to maximum age 120 using the methodology described in the technical notes.

Age   P(all)              P(male)             P(female)
  0 0.005831            0.006325            0.005313
  1 0.000367843         0.000391508         0.000343167
  2 0.000246463         0.000276133         0.000216767
  3 0.000182814         0.000206546         0.000157072
  4 0.000156953         0.000183668         0.000129216
  5 0.000141037         0.000160804         0.000120255
  6 0.000125127         0.000142914         0.000106328
  7 0.000112203         0.000128008         0.0000963806
  8 0.000100276         0.000112117         0.0000884231
  9 0.0000913317        0.0000992073        0.0000834481
 10 0.0000883454        0.0000932456        0.0000824477
 11 0.0000952854        0.000103156         0.0000874072
 12 0.000119095         0.000137857         0.000101304
 13 0.000164729         0.000203286         0.000124134
 14 0.000227209         0.000294457         0.000155893
 15 0.000293617         0.000391501         0.000190617
 16 0.000362946         0.000491412         0.000227306
 17 0.000442117         0.000609009         0.000265957
 18 0.000529115         0.000743227         0.000302594
 19 0.000616971         0.000881116         0.000338207
 20 0.00070566          0.00101964          0.000373784
 21 0.000786255         0.00114394          0.000408332
 22 0.000847887         0.0012343           0.000438875
 23 0.000886654         0.00128494          0.000465417
 24 0.000909534         0.00130883          0.000490933
 25 0.000929392         0.0013228           0.0005174
 26 0.000952147         0.00134161          0.000545804
 27 0.000977786         0.00136426          0.00057515
 28 0.0010063           0.00139366          0.000605432
 29 0.00103864          0.00142878          0.00063566
 30 0.00107285          0.0014657           0.000668788
 31 0.00110792          0.00150147          0.000705793
 32 0.00114483          0.0015361           0.000745674
 33 0.00118454          0.00156959          0.000792356
 34 0.00122898          0.00160582          0.000845811
 35 0.00128495          0.00165443          0.000908956
 36 0.00135141          0.00171632          0.000981746
 37 0.00142538          0.00178654          0.00106021
 38 0.00150387          0.0018631           0.00114136
 39 0.00158781          0.00194786          0.00122516
 40 0.00168489          0.00204935          0.00131843
 41 0.00179886          0.00217314          0.00142304
 42 0.00192761          0.00231898          0.00153401
 43 0.00207581          0.00249327          0.00165615
 44 0.00224899          0.00270038          0.00179419
 45 0.00243725          0.00292846          0.00194116
 46 0.00265083          0.0031884           0.00210855
 47 0.0029103           0.00350311          0.00231349
 48 0.00321588          0.00387243          0.0025555
 49 0.0035468           0.00427362          0.00281771
 50 0.00387592          0.00467212          0.00307957
 51 0.00420287          0.00507013          0.00333606
 52 0.00454693          0.00549737          0.00359937
 53 0.00492128          0.00597165          0.0038758
 54 0.00532664          0.00649003          0.00417147
 55 0.00575619          0.00703581          0.00448667
 56 0.00619215          0.00758275          0.00481225
 57 0.00662626          0.00813029          0.00513737
 58 0.00705499          0.00866988          0.00545972
 59 0.00748745          0.00921066          0.00578812
 60 0.00794918          0.00978879          0.0061402
 61 0.0084469           0.010399            0.00653116
 62 0.0089597           0.010994            0.00696556
 63 0.00947691          0.0115436           0.00744933
 64 0.0100035           0.0120603           0.00797895
 65 0.0105466           0.0125691           0.00854801
 66 0.0111425           0.0131347           0.00916566
 67 0.0118165           0.0137895           0.00985298
 68 0.0126025           0.0145881           0.010625
 69 0.0135386           0.0155721           0.0115183
 70 0.014622            0.016711            0.0125556
 71 0.0157853           0.0179169           0.0136863
 72 0.0169733           0.0191484           0.0148415
 73 0.0181664           0.0203416           0.0160437
 74 0.0193544           0.0214907           0.0172763
 75 0.0205581           0.0226235           0.0185559
 76 0.0219039           0.023887            0.0199909
 77 0.0233782           0.0252875           0.0215559
 78 0.0249405           0.0266573           0.0233399
 79 0.0266659           0.0281283           0.0253501
 80 0.0283006           0.0295587           0.0272207
 81 0.0298041           0.0307938           0.0290306
 82 0.0311707           0.0318902           0.0307088
 83 0.0326118           0.0329808           0.0325375
 84 0.0338734           0.0336728           0.0344093
 85 0.0348103           0.0342521           0.0357896
 86 0.0356915           0.0345144           0.0373244
 87 0.036144            0.0342741           0.0384714
 88 0.0361034           0.0334914           0.0391388
 89 0.0355212           0.0321521           0.0392438
 90 0.0343716           0.0302738           0.0387212
 91 0.0326583           0.0279093           0.0375332
 92 0.0304192           0.0251463           0.0356786
 93 0.0277276           0.0221028           0.0331989
 94 0.02469             0.0189177           0.030181
 95 0.0214386           0.0157381           0.0267542
 96 0.0181203           0.0127037           0.0230805
 97 0.0148823           0.00993297          0.0193396
 98 0.011857            0.00751144          0.0157101
 99 0.00914934          0.00548606          0.0123497
100 0.00682791          0.0038652           0.00937893
101 0.0049216           0.00262443          0.0068711
102 0.00342273          0.00171608          0.00484973
103 0.00229461          0.00108015          0.00329442
104 0.00148199          0.000654343         0.00215221
105 0.000921789         0.000381551         0.00135156
106 0.000552114         0.000214241         0.000815772
107 0.00031851          0.000115918         0.00047333
108 0.000177059         0.0000604924        0.000264139
109 0.0000949139        0.0000304833        0.000141878
110 0.0000491106        0.0000148533        0.0000734294
111 0.0000245565        0.00000700891       0.0000366663
112 0.0000118822        0.00000320819       0.0000176914
113 0.00000557214       0.00000142697       0.00000826206
114 0.00000253659       0.000000617876      0.00000374136
115 0.00000112287       0.000000260924      0.00000164594
116 0.0000004842        0.00000010766       0.00000070483
117 0.000000203758      0.0000000434809     0.000000294373
118 0.0000000838243     0.0000000172192     0.000000120143
119 0.0000000337717     0.0000000066978     0.0000000480077
120 0.0000000216956     0.00000000409582    0.0000000304297
Posted in Uncategorized | 1 Comment

How many melodies are there?


The title question came up recently, which I think makes for an interesting combinatorics exercise. The idea is that, given the extent of “borrowing” of past musical ideas by later artists, are we in danger of running out of new music?

We can turn this into a combinatorics problem by focusing solely on pitch and rhythm: in a single bar of music, how many possible melodies are there, consisting of a sequence of notes and rests of varying pitch and duration? The point is that this number is finite; it may be astronomical, but how astronomical?

This is not a new question, and there are plenty of answers out there which make various simplifying assumptions. For example, Vsauce has a video, “Will We Ever Run Out of New Music?,” which in turn refers to a write-up, “How many melodies are there in the universe?,” describing a calculation based on a recurrence relation that effectively requires cutting segments of a bar exactly in halves– undercounting in a way that I suspect was not intentional, implicitly prohibiting even relatively simple melodies like “I’ll Be Home for Christmas.”

But the most common simplifying assumption seems to be a lack of treatment of rests— that is, only counting melodies consisting of a sequence of notes. Rests are an interesting wrinkle that complicates the counting problem: for example, a half note is different from two consecutive quarter notes of the same pitch, but a half rest sounds the same as two consecutive quarter rests. The objective of this post is to add this “expressive power” to the calculation of possible melodies.


Consider a single bar in 4/4 time, consisting of a sequence of whole, half, quarter, eighth, and sixteenth notes and/or rests, with notes chosen from 13 possible pitches, allowing melodies within an octave of the 12-pitch chromatic scale, but also allowing an octave jump (e.g., “Take Me Out to the Ball Game,” “Over the Rainbow,” etc.).

Twelve notes of the chromatic scale, and a thirteenth allowing melodies with an octave jump.

We can encode the choice of a single note of n possible pitches with the following generating function, weighted by duration:

g_n(x) = n(x+x^2+x^4+x^8+x^{16})

and all possible rests with

h(x) = \frac{x}{1-x}

Then the generating function for the number of possible melodies is

f_n(x) = \frac{1+h(x)}{1-g_n(x)(1+h(x))}

Intuitively, a melody consists of zero or one rest, followed by a sequence of zero or more sub-sequences, each consisting of a note followed by zero or one rest. The coefficient [x^{16}]f_{13}(x) is the number of one-bar melodies given the above constraints… but this counts two melodies as distinct even if they only differ in relative pitch. The number of possible melodies consisting of sequences of intervals confined to at most an octave jump is

[x^{16}]f_{13}(x) - [x^{16}]f_{12}(x) + 1

where the +1 accounts for the single “silent melody” of a whole rest. The result is 3,674,912,999,046,911,152, or about 3.7 billion billion possible melodies.


The machinery described above may be easily extended to consider different sets of assumptions: different time signatures, longer or shorter lists of possible notes to choose from, dotted notes and rests, triplets (e.g., the Star Wars theme), etc. The figure below shows the number of possible one-bar melodies for a variety of such assumptions.

As might be expected, dotted notes and/or rests do not affect the “space” of possible melodies nearly as much as note duration: halve the shortest allowable note value, and you very roughly double the number of “bits” in the representation of a melody. If we extend our expressive power to allow 32nd (possibly dotted) notes and rests, then there are 6,150,996,564,625,709,162,647,180,518,925,064,281,006 possible melodies.

Of course, these calculations only address the question of how many melodies are possible— not how many of such melodies are actually appealing to our human ears.

Posted in Uncategorized | Leave a comment