NCAA tournament brackets revisited

This is becoming an annual exercise.  Two years ago, I wrote about the probability of picking a “perfect” NCAA tournament bracket.  Last year, the topic was the impact of various systems for scoring brackets in office pools.

This year I just want to provide up-to-date historical data for anyone who might want to play with it, including all 32 seasons of the tournament in its current 64-team format, from 1985 to 2016.

(Before continuing, note that the 4 “play-in” games of the so-called “first” round are an abomination, and so I do not consider them here, focusing on the 63 games among the 64-team field.)

First, the data: in the following 16×16 matrix, the entry in row i and column j is the number of regional games (i.e., prior to the Final Four) in which seed i beat seed j.  Note that the round in which each game was played is implied by the seed match-up (e.g., seeds 1 and 16 can only meet in the first round).

   0  21  13  34  32   7   4  52  59   4   3  19   4   0   0 128
  23   0  25   2   0  23  54   2   0  27  12   1   0   0 120   0
   8  14   0   2   2  38   7   1   1   9  27   0   0 107   1   0
  15   4   3   0  36   2   2   3   2   2   0  23 102   0   0   0
   7   3   1  31   0   1   0   0   1   1   0  82  12   0   0   0
   2   6  28   1   0   0   4   0   0   4  82   0   0  14   0   0
   0  21   5   2   0   3   0   0   0  78   0   0   0   1   2   0
  12   3   0   5   2   1   1   0  64   0   0   0   1   0   0   0
   5   1   0   0   1   0   0  64   0   0   0   0   1   0   0   0
   1  18   4   0   0   2  50   0   0   0   1   0   0   1   5   0
   3   1  14   0   0  46   3   0   0   2   0   0   0   5   0   0
   0   0   0  12  46   0   0   1   0   0   0   0   8   0   0   0
   0   0   0  26   3   0   0   0   0   0   0   3   0   0   0   0
   0   0  21   0   0   2   0   0   0   0   0   0   0   0   0   0
   0   8   0   0   0   0   1   0   0   0   0   0   0   0   0   0
   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
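As a quick sanity check on the table above, here is a minimal Python sketch. The names REGIONAL and first_round_record are mine, not from any library, and the code assumes that rows are winners, columns are losers, and seed s meets seed 17 - s in the first round.

```python
# Regional-game matrix copied from the table above:
# REGIONAL[i][j] = number of games in which seed i+1 beat seed j+1.
REGIONAL = [
    [0, 21, 13, 34, 32, 7, 4, 52, 59, 4, 3, 19, 4, 0, 0, 128],
    [23, 0, 25, 2, 0, 23, 54, 2, 0, 27, 12, 1, 0, 0, 120, 0],
    [8, 14, 0, 2, 2, 38, 7, 1, 1, 9, 27, 0, 0, 107, 1, 0],
    [15, 4, 3, 0, 36, 2, 2, 3, 2, 2, 0, 23, 102, 0, 0, 0],
    [7, 3, 1, 31, 0, 1, 0, 0, 1, 1, 0, 82, 12, 0, 0, 0],
    [2, 6, 28, 1, 0, 0, 4, 0, 0, 4, 82, 0, 0, 14, 0, 0],
    [0, 21, 5, 2, 0, 3, 0, 0, 0, 78, 0, 0, 0, 1, 2, 0],
    [12, 3, 0, 5, 2, 1, 1, 0, 64, 0, 0, 0, 1, 0, 0, 0],
    [5, 1, 0, 0, 1, 0, 0, 64, 0, 0, 0, 0, 1, 0, 0, 0],
    [1, 18, 4, 0, 0, 2, 50, 0, 0, 0, 1, 0, 0, 1, 5, 0],
    [3, 1, 14, 0, 0, 46, 3, 0, 0, 2, 0, 0, 0, 5, 0, 0],
    [0, 0, 0, 12, 46, 0, 0, 1, 0, 0, 0, 0, 8, 0, 0, 0],
    [0, 0, 0, 26, 3, 0, 0, 0, 0, 0, 0, 3, 0, 0, 0, 0],
    [0, 0, 21, 0, 0, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 8, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
]


def first_round_record(wins, seed):
    """Return (wins, losses) for `seed` against its first-round opponent."""
    opponent = 17 - seed  # assumed pairing: 1 vs 16, 2 vs 15, ..., 8 vs 9
    return wins[seed - 1][opponent - 1], wins[opponent - 1][seed - 1]


total_games = sum(sum(row) for row in REGIONAL)
print("total regional games:", total_games)

for seed in range(1, 9):
    won, lost = first_round_record(REGIONAL, seed)
    print(f"seed {seed:2d} vs seed {17 - seed:2d}: "
          f"{won}-{lost} ({won / (won + lost):.3f})")
```

With 32 seasons, 4 regions per season, and 15 games per region, the total should come to 1,920 games, and each first-round pairing should account for 128 of them.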

The following matrix, in the same format, is for the Final Four games:

  12   6   2   5   1   0   1   1   1   1   0   0   0   0   0   0
   4   3   3   1   0   1   0   0   0   0   1   0   0   0   0   0
   4   2   0   2   0   0   0   0   0   0   1   0   0   0   0   0
   1   0   0   1   1   0   0   0   0   0   0   0   0   0   0   0
   0   1   0   0   1   0   0   1   0   0   0   0   0   0   0   0
   0   1   0   1   0   0   0   0   0   0   0   0   0   0   0   0
   1   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
   0   2   0   0   0   0   0   0   0   0   1   0   0   0   0   0
   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0

Finally, the following matrix is for the championship games:

   6   6   1   2   3   1   0   0   0   0   0   0   0   0   0   0
   2   0   3   0   0   0   0   0   0   0   0   0   0   0   0   0
   0   2   1   0   0   0   0   1   0   0   0   0   0   0   0   0
   1   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
   1   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
   0   0   0   0   0   0   0   1   0   0   0   0   0   0   0   0
   1   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0

We can update some of the past analysis using this new data as well.  For example, what is the probability of picking a “perfect” bracket, predicting all 63 games correctly?  As before, Schwertman (see reference below) suggests a couple of simple-but-reasonable models of the probability of seed i beating seed j given by

p_{i,j} = 1 - p_{j,i} = \frac{1}{2} + k(s_i - s_j)

where s_i is a measure of the “strength” of seed i, and k is a scaling factor controlling the range of resulting probabilities, in this case chosen so that p_{1,16}=129/130.  This value is the mean of the beta distribution obtained by starting from a uniform prior and updating on the 1 seeds' 128 wins and 0 losses against 16 seeds.
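The arithmetic behind 129/130 can be sketched as follows. This assumes a uniform Beta(1, 1) prior, which is my reading of "the corresponding beta distribution" rather than something stated explicitly; the function name beta_posterior_mean is likewise just illustrative. The value of k shown is for the linear strength s_i = -i discussed next, where s_1 - s_16 = 15.

```python
from fractions import Fraction


def beta_posterior_mean(wins, losses, prior_a=1, prior_b=1):
    """Mean of Beta(prior_a + wins, prior_b + losses).

    The default prior_a = prior_b = 1 is a uniform prior (an assumption).
    """
    return Fraction(wins + prior_a, wins + losses + prior_a + prior_b)


# 1 seeds are 128-0 against 16 seeds in the regional matrix.
p_1_16 = beta_posterior_mean(128, 0)

# Solve 1/2 + k * (s_1 - s_16) = p_1_16 for the linear strength s_i = -i.
k_linear = (p_1_16 - Fraction(1, 2)) / 15

print(p_1_16, k_linear, float(k_linear))
```

Exact fractions are used only to make the result easy to check by hand; floats would do just as well.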

One simple strength function is s_i=-i, which yields an overall probability of a perfect chalk bracket of about 1 in 188 billion.  A slightly better historical fit is

s_i = \Phi^{-1}(1 - \frac{4i}{n}) 

where \Phi^{-1} is the quantile function of the standard normal distribution, and n=351 is the number of teams in Division I.  In this case, the estimated probability of a perfect bracket is about 1 in 91 billion.  In either case, a perfect bracket is far more likely (roughly 50 to 100 million times more likely) than the usually-quoted 1 in 9.2 quintillion figure that assumes all 2^{63} outcomes are equally likely.
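Here is a rough Python sketch of how these figures can be reproduced. It rests on my own assumptions about the calculation: the bracket in question is all chalk (every favorite wins), so each region contributes the same 15 match-ups, and the three remaining games are all 1 seed versus 1 seed, each a coin flip under the model. The helper names are hypothetical.

```python
from statistics import NormalDist

P_1_16 = 129 / 130  # calibration point for the scaling factor k

# (favorite, underdog) pairs in one region when every favorite wins.
CHALK_REGION = [
    (1, 16), (2, 15), (3, 14), (4, 13), (5, 12), (6, 11), (7, 10), (8, 9),
    (1, 8), (2, 7), (3, 6), (4, 5),
    (1, 4), (2, 3),
    (1, 2),
]


def linear_strength(seed):
    return -seed


def normal_strength(seed, n_teams=351):
    return NormalDist().inv_cdf(1 - 4 * seed / n_teams)


def win_prob(strength, i, j):
    """p_{i,j} = 1/2 + k (s_i - s_j), with k fixed by p_{1,16}."""
    k = (P_1_16 - 0.5) / (strength(1) - strength(16))
    return 0.5 + k * (strength(i) - strength(j))


def chalk_bracket_prob(strength):
    region = 1.0
    for favorite, underdog in CHALK_REGION:
        region *= win_prob(strength, favorite, underdog)
    # Four regions, then three 1-vs-1 games at probability 1/2 each.
    return region ** 4 * 0.5 ** 3


for name, strength in (("linear", linear_strength),
                       ("normal", normal_strength)):
    p = chalk_bracket_prob(strength)
    print(f"{name}: 1 in {1 / p:,.0f} "
          f"({p * 2 ** 63:,.0f} times the coin-flip probability)")
```

If those assumptions match the original calculation, the two models should land near 1 in 188 billion and 1 in 91 billion respectively.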

References:

    1. Schwertman, N., McCready, T., and Howard, L., Probability Models for the NCAA Regional Basketball Tournaments, The American Statistician, 45(1), February 1991, p. 35-38