## NCAA tournament brackets revisited

This is becoming an annual exercise.  Two years ago, I wrote about the probability of picking a “perfect” NCAA tournament bracket.  Last year, the topic was the impact of various systems for scoring brackets in office pools.

This year I just want to provide up-to-date historical data for anyone who might want to play with it, including all 32 seasons of the tournament in its current 64-team format, from 1985 to 2016.

(Before continuing, note that the 4 “play-in” games of the so-called “first” round are an abomination, and so I do not consider them here, focusing on the 63 games among the 64-team field.)

First, the data: the following 16×16 matrix indicates the number of regional games (i.e., prior to the Final Four) in which seed i beat seed j, with the winning seed i indexing the row and the losing seed j indexing the column.  Note that the round in which each game was played is implied by the seed match-up (e.g., seeds 1 and 16 can only meet in the first round).

   0  21  13  34  32   7   4  52  59   4   3  19   4   0   0 128
  23   0  25   2   0  23  54   2   0  27  12   1   0   0 120   0
   8  14   0   2   2  38   7   1   1   9  27   0   0 107   1   0
  15   4   3   0  36   2   2   3   2   2   0  23 102   0   0   0
   7   3   1  31   0   1   0   0   1   1   0  82  12   0   0   0
   2   6  28   1   0   0   4   0   0   4  82   0   0  14   0   0
   0  21   5   2   0   3   0   0   0  78   0   0   0   1   2   0
  12   3   0   5   2   1   1   0  64   0   0   0   1   0   0   0
   5   1   0   0   1   0   0  64   0   0   0   0   1   0   0   0
   1  18   4   0   0   2  50   0   0   0   1   0   0   1   5   0
   3   1  14   0   0  46   3   0   0   2   0   0   0   5   0   0
   0   0   0  12  46   0   0   1   0   0   0   0   8   0   0   0
   0   0   0  26   3   0   0   0   0   0   0   3   0   0   0   0
   0   0  21   0   0   2   0   0   0   0   0   0   0   0   0   0
   0   8   0   0   0   0   1   0   0   0   0   0   0   0   0   0
   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
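
As a quick illustration of how to read this matrix, here is a minimal Python sketch that parses it and tallies the first-round record of each of the top eight seeds against its fixed opening opponent (seed i always opens against seed 17-i).  The names `parse_matrix` and `first_round_records` are my own, chosen just for this example.

```python
REGIONAL = """
 0 21 13 34 32  7  4 52 59  4  3 19   4   0   0 128
23  0 25  2  0 23 54  2  0 27 12  1   0   0 120   0
 8 14  0  2  2 38  7  1  1  9 27  0   0 107   1   0
15  4  3  0 36  2  2  3  2  2  0 23 102   0   0   0
 7  3  1 31  0  1  0  0  1  1  0 82  12   0   0   0
 2  6 28  1  0  0  4  0  0  4 82  0   0  14   0   0
 0 21  5  2  0  3  0  0  0 78  0  0   0   1   2   0
12  3  0  5  2  1  1  0 64  0  0  0   1   0   0   0
 5  1  0  0  1  0  0 64  0  0  0  0   1   0   0   0
 1 18  4  0  0  2 50  0  0  0  1  0   0   1   5   0
 3  1 14  0  0 46  3  0  0  2  0  0   0   5   0   0
 0  0  0 12 46  0  0  1  0  0  0  0   8   0   0   0
 0  0  0 26  3  0  0  0  0  0  0  3   0   0   0   0
 0  0 21  0  0  2  0  0  0  0  0  0   0   0   0   0
 0  8  0  0  0  0  1  0  0  0  0  0   0   0   0   0
 0  0  0  0  0  0  0  0  0  0  0  0   0   0   0   0
"""


def parse_matrix(text):
    """Parse whitespace-separated rows into a 16x16 list of ints."""
    rows = [[int(x) for x in line.split()] for line in text.strip().splitlines()]
    assert len(rows) == 16 and all(len(row) == 16 for row in rows)
    return rows


def first_round_records(wins):
    """Map each seed 1..8 to its (wins, losses) against seed 17 - seed."""
    records = {}
    for seed in range(1, 9):
        opponent = 17 - seed
        records[seed] = (wins[seed - 1][opponent - 1], wins[opponent - 1][seed - 1])
    return records


wins = parse_matrix(REGIONAL)  # wins[i-1][j-1] = games in which seed i beat seed j
records = first_round_records(wins)
for seed, (won, lost) in records.items():
    print(f"{seed:2d} vs {17 - seed:2d}: {won:3d}-{lost:<3d} ({won / (won + lost):.3f})")
```

Each pairing should total 128 games (4 regions in each of 32 seasons), which makes a handy check against transcription errors.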


The following matrix, in the same format, is for the Final Four games:

  12   6   2   5   1   0   1   1   1   1   0   0   0   0   0   0
   4   3   3   1   0   1   0   0   0   0   1   0   0   0   0   0
   4   2   0   2   0   0   0   0   0   0   1   0   0   0   0   0
   1   0   0   1   1   0   0   0   0   0   0   0   0   0   0   0
   0   1   0   0   1   0   0   1   0   0   0   0   0   0   0   0
   0   1   0   1   0   0   0   0   0   0   0   0   0   0   0   0
   1   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
   0   2   0   0   0   0   0   0   0   0   1   0   0   0   0   0
   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0


Finally, the following matrix is for the championship games:

   6   6   1   2   3   1   0   0   0   0   0   0   0   0   0   0
   2   0   3   0   0   0   0   0   0   0   0   0   0   0   0   0
   0   2   1   0   0   0   0   1   0   0   0   0   0   0   0   0
   1   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
   1   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
   0   0   0   0   0   0   0   1   0   0   0   0   0   0   0   0
   1   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
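
Row sums of this last matrix give the number of titles won by each seed, and column sums give the number of championship-game losses.  A short Python sketch, with variable names of my own choosing:

```python
CHAMPIONSHIP = """
6 6 1 2 3 1 0 0 0 0 0 0 0 0 0 0
2 0 3 0 0 0 0 0 0 0 0 0 0 0 0 0
0 2 1 0 0 0 0 1 0 0 0 0 0 0 0 0
1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0
1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
"""

# finals[i-1][j-1] = championship games in which seed i beat seed j
finals = [[int(x) for x in line.split()] for line in CHAMPIONSHIP.strip().splitlines()]

# Row sums: titles won by each seed (seeds with no titles are omitted).
titles = {seed: sum(row) for seed, row in enumerate(finals, start=1) if sum(row)}

# Column sums: championship-game losses by each seed.
runner_ups = {seed: sum(col) for seed, col in enumerate(zip(*finals), start=1) if sum(col)}

print("titles by seed:    ", titles)
print("runner-ups by seed:", runner_ups)
```

Both tallies should add up to 32, one champion and one runner-up per season; 1 seeds account for 19 of the 32 titles.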


We can update some of the past analysis using this new data as well.  For example, what is the probability of picking a “perfect” bracket, predicting all 63 games correctly?  As before, Schwertman (see reference below) suggests a couple of simple-but-reasonable models of the probability of seed i beating seed j given by

$p_{i,j} = 1 - p_{j,i} = \frac{1}{2} + k(s_i - s_j)$

where $s_i$ is a measure of the “strength” of seed i, and k is a scaling factor controlling the range of resulting probabilities, in this case chosen so that $p_{1,16}=129/130$, the expected value of the corresponding beta distribution.
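
To spell out where that value comes from (assuming a uniform Beta(1,1) prior, which is my reading of the figure rather than something stated above): 1 seeds have won all 128 of their first-round games against 16 seeds, so the posterior distribution is Beta(129, 1), with mean

$E[p_{1,16}] = \frac{128 + 1}{128 + 0 + 2} = \frac{129}{130}$

Substituting into the model then fixes the scaling factor for any choice of strength function:

$k = \frac{129/130 - 1/2}{s_1 - s_{16}} = \frac{32}{65(s_1 - s_{16})}$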

One simple strength function is $s_i=-i$, which yields an overall probability of a perfect “chalk” bracket (i.e., always picking the better seed) of about 1 in 188 billion.  A slightly better historical fit is

$s_i = \Phi^{-1}(1 - \frac{4i}{n})$

where $\Phi^{-1}$ is the quantile function of the standard normal distribution, and $n=351$ is the number of teams in Division I.  In this case, the estimated probability of a perfect bracket is about 1 in 91 billion.  In either case, a perfect bracket is far more likely (roughly 50 to 100 million times more likely) than the usually-quoted 1 in 9.2 quintillion figure, which assumes all $2^{63}$ outcomes are equally likely.
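
For anyone who wants to reproduce these estimates, here is a minimal Python sketch of both models.  It assumes that the bracket in question is the chalk bracket, that each region follows the standard seeding layout, and that the three Final Four and championship games between 1 seeds are coin flips (which is what the formula gives when $s_i = s_j$).  The function names are mine, not Schwertman's.

```python
from statistics import NormalDist

P_1_16 = 129 / 130  # calibration point quoted in the text
N_TEAMS = 351       # Division I teams, as in the text

# Seeds of one region in bracket order, so adjacent pairs meet in round one.
REGION = [1, 16, 8, 9, 5, 12, 4, 13, 6, 11, 3, 14, 7, 10, 2, 15]


def linear_strength(seed):
    return -seed


def normal_strength(seed):
    return NormalDist().inv_cdf(1 - 4 * seed / N_TEAMS)


def chalk_probability(strength):
    """Probability that the better seed wins every game of the 63-game bracket."""
    # Scale k so that p(1, 16) matches the calibration point.
    k = (P_1_16 - 0.5) / (strength(1) - strength(16))
    region, seeds = 1.0, REGION
    while len(seeds) > 1:
        winners = []
        for a, b in zip(seeds[::2], seeds[1::2]):
            favorite, underdog = min(a, b), max(a, b)
            region *= 0.5 + k * (strength(favorite) - strength(underdog))
            winners.append(favorite)
        seeds = winners
    # Four independent regions, then three 1-vs-1 games at probability 1/2.
    return region ** 4 * 0.5 ** 3


for name, strength in [("linear", linear_strength), ("normal", normal_strength)]:
    print(f"{name}: about 1 in {1 / chalk_probability(strength):.3g}")
print(f"uniform: 1 in {2 ** 63:.3g}")
```

The two printed odds should land close to the 188 billion and 91 billion figures above; swapping in a different strength function only requires passing another callable to `chalk_probability`.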

References:

1. Schwertman, N., McCready, T., and Howard, L., Probability Models for the NCAA Regional Basketball Tournaments, The American Statistician, 45(1), February 1991, pp. 35-38