## Random number generation in Simulink

Last week a co-worker observed confusing behavior from the uniform random number generator in Simulink.  The output of the generator “did not seem random.”  This is what she saw:

Output of Simulink random number generator, one draw for each initial seed 1 to 100.

This is certainly uniform, but I would agree that this does not appear to be “random.”  (Note: there is a problem with the setup here, though, which we will get to shortly.  Can you see it?)  I wanted to understand this behavior better, and in the process I found some interesting things about the algorithm used in the generator.

I don’t use Simulink.  But I do use MATLAB, which supports– and documents– several different algorithms for generating random numbers.  This is why I was surprised to find that the corresponding documentation on Simulink’s uniform random number generator provides no detail, let alone configurable options, regarding the algorithm used.

Fortunately, we can reverse-engineer the algorithm without too much detective work.  First, the regularity observed in the figure above strongly suggests a linear congruential generator, and even more fortunately, MATLAB provides such a generator that, after some experimenting, exactly reproduces the behavior in Simulink.  That generator is the ‘mcg16807′ or ‘v4′ legacy generator that was the default through MATLAB version 4.0.  In other words, the latest version of Simulink today provides only a single generator… that MATLAB left behind over 15 years ago.  Huh.

The ‘mcg16807′ generator is essentially the MINSTD generator, in which the state of the generator $x_n$ traverses the elements of the (cyclic) multiplicative group of integers modulo a prime:

$x_{n+1} = a x_n \mod m$

where in this case $a=7^5=16807$ and $m=2^{31}-1$.  Integer values of the state in the range $[1, m-2]$ are transformed to random real values $u_n$ output from the generator in the unit open interval $(0, 1)$ according to

$u_n = \frac{x_n}{m}$

So far, none of this is new or interesting.  The interesting wrinkle is in how the generator is seeded.  In all implementations I have seen, the seed $s$ is simply mapped identically to the initial state $x_0$ of the generator.  But in the MATLAB/Simulink implementation, seeding appears to be slightly more complicated.

How can we tell?  Because if we were to simply set the generator’s initial state equal to the seed, as is usually the case, then the figure above would have looked even worse:

Single random draws from the MINSTD generator for each state/seed from 1 to 100.

These 100 “random” values simply increase linearly in really small steps of $1/m$, or about 0.0000000005.  This vividly highlights the potential danger, alluded to earlier, of using this approach to seeding to control random number generation in, say, multiple iterations of a Monte Carlo simulation.  (And yet, I frequently see analysis that involves exactly this practice, just with different generators.  Food for thought: how do we get away with it?  Or are we getting away with it?)

Getting back to the problem of determining the more complex seeding algorithm in the Simulink generator: we have a “gray box” random number generator that, when seeded with a value $s$, yields a first random output $u$, where

$a f(s) \equiv round(u m) (\mod m)$

By solving this congruence for a few appropriately selected seeds, we get a collection of $(s, f(s))$ pairs; the problem is to describe the function $f$.  After some trial and error, it turns out that this function is reasonably simple… but with a strange twist.  The basic idea is that a given 32-bit integer seed $s$ is transformed into a 31-bit state $f(s)$ by exchanging the high and low-order 16-bit “halves” of the seed… with a few necessary special cases to ensure that the resulting state is in the valid range $[1, m-2]$.

Following is MATLAB code that reproduces the behavior of the Simulink generator:

classdef MCG16807 < handle
properties
state;
end
methods
function self = MCG16807(s)
self.seed(s);
end
function [] = seed(self, s)
switch s
case 0
self.state = 1144108930;
case 2^32 - 1
self.state = 2^31 - 2;
otherwise
self.state = min(rem(s * 2^16 + floor(s / 2^16) + ...
bitand(s, 2^15), 2^31), 2^31 - 2);
end
end
function r = rand(self)
self.state = rem(7^5 * self.state, 2^31 - 1);
r = self.state / (2^31 - 1);
end
end
end


I suppose swapping word halves like this is understandable, since it is a simple way to mitigate somewhat the undesirable effect above, where often-used “nearby” seeds yield very “nearby” sub-sequences of random draws.  But in an attempt to be helpful, I think it ends up merely hiding the still-present potential danger inherent in this mis-use of a sub-standard generator.

The “twist,” though, is that weird extra addition of bitand(s, 2^15).  That is, to compute the state from the seed, we do not just swap word halves, we also add (arithmetically, not bit-wise) the masked 15th bit of the seed.  But why?

The reasoning behind this oddly-specific-yet-seemingly-arbitrary extra step is not clear to me.  My guess, though– which is admittedly cynically biased– is that it has the feel of a rather half-hearted attempt to further “mix” the transformation from seed to state… in a way that rarely has a useful effect.

## Young Earth Creationism

“Someone is wrong on the internet.” xkcd

I suffer from this disorder, although I try hard to manage the symptoms.

This post is a follow-up to an exchange with my good friend nomasir over at Freedom at Bethsaida.  The subject matter there tends to focus on “the nature of Christian interaction and participation in government and public policy.”  So what does this have to do with science?  In this case, the discussion involved homeschooling and its advantages over the public school system, which led– via my prompting– to the teaching of “Young Earth Creationism,” a belief that our planet is only thousands of years old, as opposed to billions.

As usual, before continuing, I recommend reading the original post first, before I attempt to cloud your judgment.

I have been dismissive of religiously motivated science education here before.  However, in this case there are a couple of ideas presented that I think deserve more careful inspection.  Okay, maybe not so much the ideas of creationism itself, but some interesting views on the practice and philosophy of science in general.  In what follows, I will try to relate my comments back to the relevant sections of the original post.

Science != Democracy

Reference is made to a Gallup poll from last year, suggesting that “creationism has broad-based support.”  Of those surveyed:

• 46% said “God created humans in present form.”
• 32% said “Humans evolved, with God guiding.”
• 15% said “Humans evolved, but God had no part in the process.”

(Note that this poll does not actually address the original issue of age of the earth.  No matter; the ideas discussed here still apply.)

It is not clear whether presentation of these results was intended to be considered evidence for any one of these views being correct.  I hope not– and in fact I don’t think it was.  But as long as we are here, let’s investigate further, because I think it is misleading to characterize this as “broad-based support,” for two reasons.

First, this poll was of Americans, who (1) make up less than 5% of the world’s population, and (2) are rather disappointingly unique in their desire to cling to creationism.  Check out these wider surveys of other countries from 2006 and 2009, the latter of which also makes the important point of attempting to relate belief with knowledge of the theory in question.

Which leads to the second reason, namely that these numbers change dramatically if, rather than sampling from the entire U.S. population, we instead focus on scientists in a relevant field.  Or even more loosely, if we simply focus on people with, say, a graduate-level education, in any field.  Wikipedia provides an interesting mathematical exercise leading up to the conclusion that “the roughly 150 biologist Darwin Dissenters represent about 0.0157% of the US biologists that existed in 1999.”

My point here is merely to emphasize that we do not vote on science.  Science is, in my opinion, not democratic, but Bayesian, where the theory that prevails is not the one most commonly held, but the one best supported by the weight of evidence.

Evidences of Young Earth

I tried to track down information on each of the 15 specific “observable processes” mentioned in the original post.  It is possible that I missed the interesting one, because I admit that I gave up after the fourth, “the receding moon,” one of several ideas proposed by one Thomas G. Barnes as evidence for a young earth.  It is here, in the actual technical “meat” of the arguments, that I find myself most fascinated by creationism.  Because these are not, in fact, subtle, difficult, or new arguments.  If they were, I could perhaps understand the adherence to them, since there would be no clear consensus.

At any rate, it is understandable that a political/religious forum like Freedom is not the place to climb into an even moderately dense mathematical or otherwise technical discussion.  But here, on the other hand, it is more than welcome.  So if there is any interest in discussing any of these arguments in more detail, I would be happy to participate.

It’s All Religion

From the original post: “It is a matter of faith for [scientists] too.  We reject out of hand the notion that Christians blindly hold to a Biblical explanation with zero evidence, and “scientists” heroically and unbiasedly progress toward understanding truth.”

This is the idea that I found most interesting.  The problem of bias in science has been discussed here before.  In one sense, I can see nomasir’s point: particularly here in the U.S., there can be significant pressure in the academic community to publish… well, just to publish.  For example, there is a dearth of negative results that are seen as less interesting or less “sensational;” there is financial pressure to get or keep grant funding; and the government source of that funding can possibly lead to political pressure to skew results that impact policy decisions.

But I don’t think these are the kinds of personal bias that nomasir is talking about.  The suggestion seems to be that scientists have just as personal a stake in, say, the true age of the earth, as do creationists.  Again, from the original post: “If one can definitively prove the young-earth model of creation, then the evolutionists are stuck with the plain reality that the only explanation for our existence is a Creator.  And such an explanation dramatically changes their outlook on life.”

I see several problems with this statement.  First, it is a false dilemma.  Second, it also suggests a misleading view of how science works.  Science is not in the business of proving anything.  Science is in the business of disproving theories.  What I have called “current scientific consensus” is nothing more than those theories that have so far most successfully survived all attempts at discrediting them.

But even aside from this nit-picking, I disagree with the contention that scientists somehow “need” the earth to be old, that they “need” humans to have evolved.  Or at least, I disagree with the contention that I “need” any particular theory to be true.  Indeed, I must confess to understanding the attraction of at least some tenets of Christianity.  How comforting it would be for me to know that loved ones that I have lost are not gone, that they have not simply ceased to be.  How relaxing it would be if it were no longer useful to strive to understand how the world works, because it is “not for me to know.”

I hear repeatedly from my creationist friends the complaint that scientists are hypocritical in their rejection of creationism as a possible theory.  Here again, the confusion seems to be with the “Bayesian” nature of science.  That is, I think this complaint perverts the basically correct idea that “every theory is possibly wrong,” by trying to turn it into the absolutely incorrect idea that “every theory is equally likely to be true.”

So let me be as clear as possible: it is possible that the earth is less than one million years old.  (I use this cutoff as a neutral middle ground where presumably no one cares much about the exact length of time.)  But given the weight of all available observable evidence, the estimate of the probability that this is the case is so vanishingly small that to hold to it as a working hypothesis would limit or even prohibit useful progress in day-to-day scientific endeavor.

Posted in Uncategorized | 2 Comments

## Analysis of Farkle

Introduction

Games involving dice seem to be a theme so far this year.  Last weekend, my nephew introduced me to Farkle, an easy-to-learn game requiring just six dice and paper-and-pencil for scoring.  As usual, rather than simply enjoying playing the game, I wondered about optimal playing strategy and how to evaluate it.

Rules of the Game

The basic rules are pretty simple, although there are many different scoring variants (more on this later).  Two or more players take turns accumulating points by rolling the dice as follows.  A player starts by rolling all 6 dice; he then sets aside one or more of the dice worth some positive number of points, and either:

1. Ends his turn, adding all points accumulated so far to his running total; or
2. Re-rolls the remaining dice and sets aside one or more of them, at which point he may again either end his turn or re-roll the new, smaller remainder.

If at any point a roll contains no scoring dice, then the turn ends and the player loses all points accumulated during that turn (a “Farkle”).  Play continues until some player reaches 10,000 or more points, at which point each other player gets one more turn to try to beat his score.  The highest score wins the game.

How are the dice scored?  I will focus on the version of the game by Patch Products that we played (see here for a PDF rule sheet):

• Two triplets (e.g., 1-1-1-2-2-2) are worth 2500 points.
• Four of a kind and a pair (e.g., 1-1-1-1-2-2) are worth 1500 points.
• A straight (1-2-3-4-5-6) is worth 1500 points.
• Three pairs (e.g., 1-1-2-2-3-3) are worth 1500 points.
• Six of a kind is worth 3000 points.
• Five of a kind is worth 2000 points.
• Four of a kind is worth 1000 points.
• Three 1s are worth 300 points.
• Three 2s are worth 200 points.
• Three 3s are worth 300 points.
• Three 4s are worth 400 points.
• Three 5s are worth 500 points.
• Three 6s are worth 600 points.
• Individual 1s are worth 100 points each (unless they are used in any of the above scoring).
• Individual 5s are worth 50 points each (unless they are used in any of the above scoring).

Finally, there is one more interesting rule: if a player is able to score all 6 dice– not necessarily in a single roll– he may continue accumulating points in his turn by re-rolling all of the dice again, setting aside scoring dice, re-rolling the remainder, etc.  As we will see, this rule is what makes Farkle a bit more challenging to analyze.

Computing Playing Strategy

[Edit: Matt Busche provides an excellent analysis of the game.  (Actually, he focuses on a similar game called Zilch, of which Farkle is a simpler variant.)  He even has an online strategy generator, customizable for many different scoring options.  But it was missing a couple of things that I was interested in: (1) it did not appear to support the 2500-point "two triplets" scoring option, and (2) beyond just the various expected values, I was also interested in the full probability distribution of possible scores in a turn.]

To evaluate Farkle strategy, we need just two functions.  First, the following function (in Mathematica) implements the scoring rules, computing the score of a given subset of dice.  Note the use of -infinity to simplify the convention that including any non-scoring dice always yields a score of zero:

score[dice_] := Total[Tally[dice] /. {
{{_, 3}, {_, 3}} -> {2500},
{{_, 4}, {_, 2}} | {{_, 2}, {_, 4}} -> {1500},
{{_, 1}, {_, 1}, {_, 1}, {_, 1}, {_, 1}, {_, 1}} -> {1500},
{{_, 2}, {_, 2}, {_, 2}} -> {1500},
{x_, 6} :> {3000, 3000, 3000, 3000, 3000, 3000}[[x]],
{x_, 5} :> {2000, 2000, 2000, 2000, 2000, 2000}[[x]],
{x_, 4} :> {1000, 1000, 1000, 1000, 1000, 1000}[[x]],
{x_, 3} :> {300, 200, 300, 400, 500, 600}[[x]],
{x_, 2} :> {200, -Infinity, -Infinity, -Infinity, 100, -Infinity}[[x]],
{x_, 1} :> {100, -Infinity, -Infinity, -Infinity, 50, -Infinity}[[x]]
}] /. -Infinity -> 0


Next, we compute the expected value (i.e., the expected final accumulated score for the turn) of rolling n dice with a current accumulated score of s points… and subsequently playing to maximize the expected final score.  This expected value is a probability-weighted sum over all possible rolls of the dice, where for each roll we find the best of all possible “moves,” either stopping with the points accumulated so far, or (recursively) re-rolling some dice.

value[{s_, 0}] := value[{s, 0}] = If[s > $maxScore, s, value[{s, 6}]] value[{s_, n_}] := value[{s, n}] = Module[{roll, weight, moves}, Total[Map[ ( {roll, weight} = #; (* Consider all scoring dice and resulting game states. *) moves = DeleteCases[ Map[{s + score[#], n - Length[#]} &, Subsets[roll]], {s, _} ]; (* Pick best move, weighted by probability of this roll. *) weight*If[moves === {}, 0, Max[Select[First /@ moves, # >=$minScore &], value /@ moves]
]
) &,
Tally[Sort /@ Tuples[Range[6], n]]
]]/6^n
]


There are a few interesting things going on here.  First, Mathematica’s automatic memoization means we can solve the problem “top-down,” so that the code reads essentially like the equations (or the cocktail-napkin pseudocode) describing the problem.

Second, note the $minScore and$maxScore parameters.  The minimum score merely allows handling the situation at the start of the game, where a player must score at least 500 points to “get on the board.”  The maximum score is more interesting, since it addresses the following problem with the recursive algorithm: what is the base case?

If scoring all 6 dice ended a player’s turn– or even if it allowed just a single additional “bonus” roll– then the problem is easy.  But since a player can repeatedly re-roll all 6 dice, as long as he continues to score all of them, there is in principle no limit on the points he can accumulate in a single turn.

However, since we are maximizing expected score, there is some sufficiently large score beyond which a player is no longer willing to risk rolling again and losing it all with a Farkle.  At worst, we can find this maximum score by trial and error; it turns out that this maximum score is 16,750 points!  It is an interesting open question whether we can compute this maximum directly, ahead of time, possibly with some constraints on the scoring options.

Results

What does the resulting strategy look like?  First, the following table simply shows the probability of a Farkle when rolling the given number of dice.  (Note that these probabilities are independent of any particular playing strategy.)

• Roll 6 dice: P(Farkle) = 5/216 = 0.02315
• Roll 5 dice: P(Farkle) = 25/324 = 0.07716
• Roll 4 dice: P(Farkle) = 17/108 = 0.15741
• Roll 3 dice: P(Farkle) = 5/18 = 0.27778
• Roll 2 dice: P(Farkle) = 4/9 = 0.44444
• Roll 1 die: P(Farkle) = 2/3 = 0.66667

Also, before describing the strategy itself, the following plot shows the effect of the strategy, as the distribution of possible scores for a turn.  Note that there is a very long tail of the distribution that is not shown, extending to the maximum possible score of 19,750 points… but the tail not shown makes up less than 0.8% of the total probability.  Interestingly, the overall probability of a Farkle on any given turn is nearly 0.2.

Probability distribution of number of points in a turn. There is a long tail extending to 19,750 points, but scores exceeding 3000 points occur with probability less than 0.8%.

An important note on “optimality”: I am focusing here on the strategy that maximizes expected score, with $minScore set to zero. This corresponds roughly to most of the turns “in the middle” of a game, i.e., after the first turn with the 500-point minimum, but also before the last turn(s) near the end of the game, where, for example, the real objective is not to maximize expected score, but to maximize the probability of exceeding the other player’s score. The value function above is easily modified to compute such a strategy, for a specific end-game difference between scores. Finally, the following table “encodes” the playing strategy, showing the expected final score assuming that the player rolls the given number of dice with the given current score. For example, rolling all 6 dice at the start of a turn, the overall expected score is 542.063 points. Score 6 5 4 3 2 1 ======================================================================= 0 542.063 300.378 218.304 182.650 172.474 200.898 50 582.354 333.255 235.843 193.380 183.621 214.664 100 623.033 370.062 261.926 205.487 194.933 228.841 150 664.951 407.619 295.428 224.308 206.296 243.327 200 708.096 445.427 331.827 254.316 220.526 257.929 250 751.866 485.066 368.557 290.158 242.646 272.546 300 795.711 526.678 405.708 326.018 269.822 287.214 350 839.563 568.757 445.481 361.888 297.045 301.998 400 883.719 610.847 487.460 397.764 324.302 316.933 450 928.271 652.946 529.448 433.642 351.573 332.004 500 973.325 695.049 571.445 469.529 378.846 347.142 550 1018.699 737.546 613.450 505.431 406.120 362.294 600 1064.152 781.081 655.457 541.355 433.416 377.448 650 1109.613 825.039 697.465 577.291 460.769 392.606 700 1155.075 868.998 739.473 613.230 488.175 407.892 750 1200.562 912.959 781.484 649.168 515.604 423.388 800 1246.787 956.921 823.496 685.108 543.036 438.996 850 1293.543 1000.945 865.509 721.050 570.468 454.626 900 1340.431 1046.017 907.523 756.992 597.901 470.256 950 1387.322 1092.079 949.539 792.936 625.334 485.887 1000 1434.215 1138.144 991.557 828.882 652.767 501.519 1050 1481.109 1184.212 1033.577 864.830 680.201 517.150 1100 1528.003 1230.282 1075.603 900.780 707.637 532.784 1150 1574.897 1276.352 1117.633 936.735 735.074 548.423 1200 1621.805 1322.422 1159.663 972.698 762.516 564.067 1250 1668.733 1368.493 1201.694 1008.667 789.976 579.712 1300 1715.669 1414.564 1243.725 1044.637 817.460 595.383 1350 1762.605 1460.638 1285.757 1080.607 844.955 611.139 1400 1809.691 1506.712 1327.790 1116.577 872.451 626.957 1450 1857.144 1552.787 1369.826 1152.549 899.946 642.775 1500 1904.598 1598.863 1411.863 1188.521 927.442 658.595 1550 1952.055 1644.940 1453.900 1224.496 954.938 674.415 1600 1999.514 1691.020 1495.940 1260.472 982.433 690.235 1650 2046.975 1737.104 1537.982 1296.449 1009.929 706.056 1700 2094.438 1783.191 1580.027 1332.428 1037.425 721.877 1750 2141.900 1829.279 1622.078 1368.410 1064.921 737.698 1800 2189.363 1875.369 1664.133 1404.398 1092.417 753.519 1850 2236.826 1921.459 1706.191 1440.396 1119.920 769.341 1900 2284.290 1967.549 1748.248 1476.401 1147.444 785.162 1950 2331.755 2013.639 1790.306 1512.408 1174.996 801.023 2000 2379.220 2059.732 1832.364 1548.416 1202.564 816.976 2050 2426.916 2105.825 1874.424 1584.423 1230.134 833.002 2100 2474.942 2151.919 1916.485 1620.432 1257.704 849.046 2150 2523.073 2198.014 1958.548 1656.441 1285.275 865.090 2200 2571.204 2244.110 2000.612 1692.453 1312.845 881.135 2250 2619.337 2290.208 2042.677 1728.465 1340.416 897.179 2300 2667.471 2336.307 2084.744 1764.479 1367.986 913.224 2350 2715.605 2382.408 2126.814 1800.495 1395.557 929.269 2400 2763.740 2428.511 2168.888 1836.512 1423.127 945.314 2450 2811.875 2474.615 2210.964 1872.536 1450.698 961.359 2500 2860.009 2520.718 2253.043 1908.568 1478.274 977.404 2550 2908.144 2566.822 2295.122 1944.607 1505.870 993.449 2600 2956.279 2612.926 2337.201 1980.648 1533.491 1009.528 2650 3004.413 2659.029 2379.280 2016.689 1561.126 1025.691 2700 3052.753 2705.133 2421.359 2052.730 1588.763 1041.921 2750 3101.394 2751.236 2463.438 2088.771 1616.400 1058.166 2800 3150.130 2797.340 2505.517 2124.811 1644.038 1074.411 2850 3198.866 2843.444 2547.595 2160.852 1671.675 1090.657 2900 3247.602 2889.547 2589.674 2196.893 1699.312 1106.902 2950 3296.338 2935.651 2631.753 2232.934 1726.949 1123.147 3000 3345.074 2981.755 2673.832 2268.975 1754.587 1139.392 3050 3393.809 3027.858 2715.911 2305.016 1782.224 1155.638 3100 3442.545 3073.962 2757.990 2341.057 1809.861 1171.883 3150 3491.281 3120.065 2800.069 2377.098 1837.499 1188.128 3200 3540.017 3166.169 2842.148 2413.139 1865.136 1204.374 3250 3588.753 3212.273 2884.227 2449.179 1892.773 1220.619 3300 3637.489 3258.376 2926.306 2485.220 1920.411 1236.864 3350 3686.225 3304.480 2968.385 2521.261 1948.048 1253.110 3400 3734.961 3350.583 3010.463 2557.302 1975.685 1269.355 3450 3783.697 3396.687 3052.542 2593.343 2003.323 1285.600 3500 3832.433 3442.791 3094.621 2629.384 2030.960 1301.846 3550 3881.169 3488.894 3136.700 2665.425 2058.597 1318.091 3600 3929.905 3534.998 3178.779 2701.466 2086.235 1334.336 3650 3978.641 3581.102 3220.858 2737.507 2113.872 1350.582 3700 4027.377 3627.205 3262.937 2773.547 2141.509 1366.827 3750 4076.113 3673.309 3305.016 2809.588 2169.147 1383.072 3800 4124.849 3719.412 3347.095 2845.629 2196.784 1399.318 3850 4173.585 3765.516 3389.174 2881.670 2224.421 1415.563 3900 4222.321 3811.620 3431.253 2917.711 2252.059 1431.808 3950 4271.057 3857.723 3473.332 2953.752 2279.696 1448.054 4000 4319.793 3903.827 3515.410 2989.793 2307.333 1464.299 4050 4368.529 3949.930 3557.489 3025.834 2334.971 1480.544 4100 4417.265 3996.034 3599.568 3061.875 2362.608 1496.789 4150 4466.000 4042.138 3641.647 3097.915 2390.245 1513.035 4200 4514.736 4088.241 3683.726 3133.956 2417.883 1529.280 4250 4563.472 4134.345 3725.805 3169.997 2445.520 1545.525 4300 4612.208 4180.448 3767.884 3206.038 2473.157 1561.771 4350 4660.944 4226.552 3809.963 3242.079 2500.795 1578.016 4400 4709.680 4272.656 3852.042 3278.120 2528.432 1594.261 4450 4758.416 4318.759 3894.121 3314.161 2556.069 1610.507 4500 4807.152 4364.863 3936.200 3350.202 2583.707 1626.752 4550 4855.888 4410.967 3978.278 3386.243 2611.344 1642.997 4600 4904.624 4457.070 4020.357 3422.283 2638.981 1659.243 4650 4953.360 4503.174 4062.436 3458.324 2666.619 1675.488 4700 5002.096 4549.277 4104.515 3494.365 2694.256 1691.733 4750 5050.832 4595.381 4146.594 3530.406 2721.893 1707.979 4800 5099.568 4641.485 4188.673 3566.447 2749.531 1724.224 4850 5148.304 4687.588 4230.752 3602.488 2777.168 1740.469 4900 5197.040 4733.692 4272.831 3638.529 2804.805 1756.715 4950 5245.776 4779.795 4314.910 3674.570 2832.443 1772.960 5000 5294.512 4825.899 4356.989 3710.610 2860.080 1789.205 5050 5343.248 4872.003 4399.068 3746.651 2887.717 1805.451 5100 5391.984 4918.106 4441.146 3782.692 2915.355 1821.696 5150 5440.720 4964.210 4483.225 3818.733 2942.992 1837.941 5200 5489.456 5010.313 4525.304 3854.774 2970.629 1854.186 5250 5538.191 5056.417 4567.383 3890.815 2998.267 1870.432 5300 5586.927 5102.521 4609.462 3926.856 3025.904 1886.677 5350 5635.663 5148.624 4651.541 3962.897 3053.541 1902.922 5400 5684.399 5194.728 4693.620 3998.938 3081.179 1919.168 5450 5733.135 5240.832 4735.699 4034.978 3108.816 1935.413 5500 5781.871 5286.935 4777.778 4071.019 3136.453 1951.658 5550 5830.607 5333.039 4819.857 4107.060 3164.091 1967.904 5600 5879.343 5379.142 4861.936 4143.101 3191.728 1984.149 5650 5928.079 5425.246 4904.015 4179.142 3219.365 2000.394 5700 5976.815 5471.350 4946.093 4215.183 3247.003 2016.640 5750 6025.551 5517.453 4988.172 4251.224 3274.640 2032.885 5800 6074.287 5563.557 5030.251 4287.265 3302.277 2049.130 5850 6123.023 5609.660 5072.330 4323.306 3329.915 2065.376 5900 6171.759 5655.764 5114.409 4359.346 3357.552 2081.621 5950 6220.495 5701.868 5156.488 4395.387 3385.189 2097.866 6000 6269.231 5747.971 5198.567 4431.428 3412.827 2114.112 6050 6317.967 5794.075 5240.646 4467.469 3440.464 2130.357 6100 6366.703 5840.178 5282.725 4503.510 3468.101 2146.602 6150 6415.439 5886.282 5324.804 4539.551 3495.739 2162.848 6200 6464.175 5932.386 5366.883 4575.592 3523.376 2179.093 6250 6512.911 5978.489 5408.961 4611.633 3551.013 2195.338 6300 6561.647 6024.593 5451.040 4647.674 3578.650 2211.583 6350 6610.383 6070.697 5493.119 4683.714 3606.288 2227.829 6400 6659.118 6116.800 5535.198 4719.755 3633.925 2244.074 6450 6707.854 6162.904 5577.277 4755.796 3661.562 2260.319 6500 6756.590 6209.007 5619.356 4791.837 3689.200 2276.565 6550 6805.326 6255.111 5661.435 4827.878 3716.837 2292.810 6600 6854.062 6301.215 5703.514 4863.919 3744.474 2309.055 6650 6902.798 6347.318 5745.593 4899.960 3772.112 2325.301 6700 6951.534 6393.422 5787.672 4936.001 3799.749 2341.546 6750 7000.270 6439.525 5829.751 4972.042 3827.386 2357.791 6800 7049.006 6485.629 5871.829 5008.082 3855.024 2374.037 6850 7097.742 6531.733 5913.908 5044.123 3882.661 2390.282 6900 7146.478 6577.836 5955.987 5080.164 3910.298 2406.527 6950 7195.214 6623.940 5998.066 5116.205 3937.936 2422.773 7000 7243.950 6670.044 6040.145 5152.246 3965.573 2439.018 7050 7292.686 6716.147 6082.224 5188.287 3993.210 2455.263 7100 7341.422 6762.251 6124.303 5224.328 4020.848 2471.509 7150 7390.158 6808.354 6166.382 5260.369 4048.485 2487.754 7200 7438.894 6854.458 6208.461 5296.409 4076.122 2503.999 7250 7487.630 6900.562 6250.540 5332.450 4103.760 2520.245 7300 7536.366 6946.665 6292.619 5368.491 4131.397 2536.490 7350 7585.102 6992.769 6334.698 5404.532 4159.034 2552.735 7400 7633.838 7038.872 6376.776 5440.573 4186.672 2568.981 7450 7682.574 7084.976 6418.855 5476.614 4214.309 2585.226 7500 7731.309 7131.080 6460.934 5512.655 4241.946 2601.471 7550 7780.045 7177.183 6503.013 5548.696 4269.584 2617.716 7600 7828.781 7223.287 6545.092 5584.737 4297.221 2633.962 7650 7877.517 7269.390 6587.171 5620.777 4324.858 2650.207 7700 7926.253 7315.494 6629.250 5656.818 4352.496 2666.452 7750 7974.989 7361.598 6671.329 5692.859 4380.133 2682.698 7800 8023.725 7407.701 6713.408 5728.900 4407.770 2698.943 7850 8072.461 7453.805 6755.487 5764.941 4435.408 2715.188 7900 8121.197 7499.909 6797.566 5800.982 4463.045 2731.434 7950 8169.933 7546.012 6839.644 5837.023 4490.682 2747.679 8000 8218.669 7592.116 6881.723 5873.064 4518.320 2763.924 8050 8267.405 7638.219 6923.802 5909.105 4545.957 2780.170 8100 8316.141 7684.323 6965.881 5945.145 4573.594 2796.415 8150 8364.877 7730.427 7007.960 5981.186 4601.232 2812.660 8200 8413.613 7776.530 7050.039 6017.227 4628.869 2828.906 8250 8462.349 7822.634 7092.118 6053.268 4656.506 2845.151 8300 8511.085 7868.737 7134.197 6089.309 4684.144 2861.396 8350 8559.821 7914.841 7176.276 6125.350 4711.781 2877.642 8400 8608.557 7960.945 7218.355 6161.391 4739.418 2893.887 8450 8657.293 8007.048 7260.434 6197.432 4767.056 2910.132 8500 8706.029 8053.152 7302.512 6233.473 4794.693 2926.378 8550 8754.765 8099.255 7344.591 6269.513 4822.330 2942.623 8600 8803.501 8145.359 7386.670 6305.554 4849.968 2958.868 8650 8852.236 8191.463 7428.749 6341.595 4877.605 2975.113 8700 8900.972 8237.566 7470.828 6377.636 4905.242 2991.359 8750 8949.708 8283.670 7512.907 6413.677 4932.880 3007.604 8800 8998.444 8329.774 7554.986 6449.718 4960.517 3023.849 8850 9047.180 8375.877 7597.065 6485.759 4988.154 3040.095 8900 9095.916 8421.981 7639.144 6521.800 5015.792 3056.340 8950 9144.652 8468.084 7681.223 6557.841 5043.429 3072.585 9000 9193.388 8514.188 7723.302 6593.881 5071.066 3088.831 9050 9242.124 8560.292 7765.381 6629.922 5098.704 3105.076 9100 9290.860 8606.395 7807.459 6665.963 5126.341 3121.321 9150 9339.596 8652.499 7849.538 6702.004 5153.978 3137.567 9200 9388.332 8698.602 7891.617 6738.045 5181.616 3153.812 9250 9437.068 8744.706 7933.696 6774.086 5209.253 3170.057 9300 9485.804 8790.810 7975.775 6810.127 5236.890 3186.303 9350 9534.540 8836.913 8017.854 6846.168 5264.528 3202.548 9400 9583.276 8883.017 8059.933 6882.208 5292.165 3218.793 9450 9632.012 8929.120 8102.012 6918.249 5319.802 3235.039 9500 9680.748 8975.224 8144.091 6954.290 5347.440 3251.284 9550 9729.484 9021.328 8186.170 6990.331 5375.077 3267.529 9600 9778.220 9067.431 8228.249 7026.372 5402.714 3283.775 9650 9826.956 9113.535 8270.327 7062.413 5430.352 3300.020 9700 9875.692 9159.639 8312.406 7098.454 5457.989 3316.265 9750 9924.428 9205.742 8354.485 7134.495 5485.626 3332.511 9800 9973.164 9251.846 8396.564 7170.536 5513.263 3348.756 9850 10021.900 9297.949 8438.643 7206.576 5540.901 3365.001 9900 10070.635 9344.053 8480.722 7242.617 5568.538 3381.246 9950 10119.371 9390.157 8522.801 7278.658 5596.175 3397.492 10000 10168.107 9436.260 8564.880 7314.699 5623.813 3413.737 10050 10216.843 9482.364 8606.959 7350.740 5651.450 3429.982 10100 10265.579 9528.467 8649.038 7386.781 5679.087 3446.228 10150 10314.315 9574.571 8691.117 7422.822 5706.725 3462.473 10200 10363.051 9620.675 8733.196 7458.863 5734.362 3478.718 10250 10411.787 9666.778 8775.274 7494.904 5761.999 3494.964 10300 10460.523 9712.882 8817.353 7530.944 5789.637 3511.209 10350 10509.259 9758.986 8859.432 7566.985 5817.274 3527.454 10400 10557.995 9805.089 8901.511 7603.026 5844.911 3543.700 10450 10606.731 9851.193 8943.590 7639.067 5872.549 3559.945 10500 10655.467 9897.296 8985.669 7675.108 5900.186 3576.190 10550 10704.203 9943.400 9027.748 7711.149 5927.823 3592.436 10600 10752.939 9989.504 9069.827 7747.190 5955.461 3608.681 10650 10801.675 10035.607 9111.906 7783.231 5983.098 3624.926 10700 10850.411 10081.711 9153.985 7819.272 6010.735 3641.172 10750 10899.147 10127.814 9196.064 7855.312 6038.373 3657.417 10800 10947.883 10173.918 9238.142 7891.353 6066.010 3673.662 10850 10996.619 10220.022 9280.221 7927.394 6093.647 3689.908 10900 11045.355 10266.125 9322.300 7963.435 6121.285 3706.153 10950 11094.091 10312.229 9364.379 7999.476 6148.922 3722.398 11000 11142.827 10358.333 9406.458 8035.517 6176.559 3738.644 11050 11191.563 10404.436 9448.537 8071.558 6204.197 3754.889 11100 11240.299 10450.540 9490.616 8107.599 6231.834 3771.134 11150 11289.035 10496.643 9532.695 8143.640 6259.471 3787.380 11200 11337.771 10542.747 9574.774 8179.681 6287.109 3803.625 11250 11386.507 10588.851 9616.853 8215.721 6314.746 3819.870 11300 11435.243 10634.954 9658.932 8251.762 6342.384 3836.116 11350 11483.979 10681.058 9701.011 8287.803 6370.021 3852.361 11400 11532.715 10727.161 9743.090 8323.844 6397.658 3868.607 11450 11581.452 10773.265 9785.168 8359.885 6425.296 3884.852 11500 11630.188 10819.369 9827.247 8395.926 6452.933 3901.097 11550 11678.924 10865.472 9869.326 8431.967 6480.570 3917.343 11600 11727.660 10911.576 9911.405 8468.008 6508.208 3933.588 11650 11776.396 10957.680 9953.484 8504.049 6535.845 3949.833 11700 11825.132 11003.783 9995.563 8540.089 6563.482 3966.079 11750 11873.868 11049.887 10037.642 8576.130 6591.120 3982.324 11800 11922.604 11095.991 10079.721 8612.171 6618.757 3998.570 11850 11971.341 11142.094 10121.800 8648.212 6646.394 4014.815 11900 12020.077 11188.198 10163.879 8684.253 6674.032 4031.060 11950 12068.813 11234.301 10205.958 8720.294 6701.669 4047.306 12000 12117.549 11280.405 10248.037 8756.335 6729.306 4063.551 12050 12166.286 11326.509 10290.116 8792.376 6756.944 4079.797 12100 12215.022 11372.612 10332.195 8828.417 6784.581 4096.042 12150 12263.758 11418.716 10374.273 8864.458 6812.219 4112.287 12200 12312.494 11464.820 10416.352 8900.499 6839.856 4128.533 12250 12361.230 11510.923 10458.431 8936.540 6867.493 4144.778 12300 12409.967 11557.027 10500.510 8972.580 6895.131 4161.024 12350 12458.703 11603.131 10542.589 9008.621 6922.768 4177.269 12400 12507.440 11649.234 10584.668 9044.662 6950.406 4193.515 12450 12556.176 11695.338 10626.747 9080.703 6978.043 4209.760 12500 12604.913 11741.442 10668.826 9116.744 7005.680 4226.006 12550 12653.650 11787.545 10710.905 9152.785 7033.318 4242.251 12600 12702.386 11833.649 10752.984 9188.826 7060.955 4258.497 12650 12751.123 11879.753 10795.063 9224.867 7088.593 4274.743 12700 12799.859 11925.856 10837.142 9260.908 7116.230 4290.988 12750 12848.596 11971.960 10879.221 9296.949 7143.868 4307.234 12800 12897.333 12018.064 10921.300 9332.990 7171.505 4323.480 12850 12946.070 12064.168 10963.379 9369.031 7199.143 4339.726 12900 12994.808 12110.271 11005.458 9405.072 7226.780 4355.971 12950 13043.546 12156.375 11047.537 9441.113 7254.418 4372.217 13000 13092.283 12202.479 11089.616 9477.154 7282.055 4388.463 13050 13141.021 12248.582 11131.695 9513.195 7309.693 4404.709 13100 13189.758 12294.686 11173.774 9549.236 7337.330 4420.955 13150 13238.496 12340.790 11215.853 9585.277 7364.968 4437.201 13200 13287.234 12386.894 11257.932 9621.318 7392.605 4453.447 13250 13335.971 12432.997 11300.011 9657.359 7420.243 4469.693 13300 13384.709 12479.101 11342.090 9693.400 7447.880 4485.939 13350 13433.447 12525.205 11384.170 9729.441 7475.518 4502.185 13400 13482.185 12571.309 11426.249 9765.482 7503.155 4518.431 13450 13530.923 12617.413 11468.328 9801.523 7530.793 4534.677 13500 13579.661 12663.517 11510.407 9837.564 7558.431 4550.923 13550 13628.399 12709.621 11552.486 9873.605 7586.068 4567.169 13600 13677.137 12755.725 11594.565 9909.647 7613.706 4583.415 13650 13725.875 12801.829 11636.645 9945.688 7641.344 4599.661 13700 13774.613 12847.933 11678.724 9981.729 7668.982 4615.907 13750 13823.351 12894.037 11720.803 10017.770 7696.620 4632.154 13800 13872.091 12940.141 11762.883 10053.812 7724.258 4648.402 13850 13920.834 12986.245 11804.962 10089.853 7751.896 4664.649 13900 13969.577 13032.349 11847.041 10125.895 7779.534 4680.897 13950 14018.320 13078.453 11889.121 10161.936 7807.172 4697.145 14000 14067.063 13124.557 11931.200 10197.977 7834.810 4713.392 14050 14115.806 13170.662 11973.280 10234.019 7862.448 4729.640 14100 14164.549 13216.766 12015.359 10270.060 7890.087 4745.888 14150 14213.292 13262.870 12057.439 10306.102 7917.725 4762.136 14200 14262.035 13308.974 12099.518 10342.144 7945.364 4778.384 14250 14310.778 13355.079 12141.598 10378.186 7973.003 4794.633 14300 14359.524 13401.183 12183.678 10414.227 8000.642 4810.883 14350 14408.275 13447.288 12225.758 10450.269 8028.281 4827.134 14400 14457.026 13493.393 12267.838 10486.311 8055.920 4843.384 14450 14505.777 13539.497 12309.918 10522.353 8083.559 4859.634 14500 14554.528 13585.602 12351.997 10558.395 8111.198 4875.885 14550 14603.279 13631.707 12394.077 10594.436 8138.837 4892.135 14600 14652.030 13677.812 12436.157 10630.478 8166.476 4908.386 14650 14700.781 13723.917 12478.238 10666.520 8194.115 4924.636 14700 14749.533 13770.022 12520.318 10702.562 8221.755 4940.887 14750 14798.286 13816.128 12562.399 10738.604 8249.394 4957.139 14800 14847.039 13862.234 12604.480 10774.647 8277.033 4973.390 14850 14895.793 13908.340 12646.561 10810.689 8304.673 4989.641 14900 14944.546 13954.447 12688.642 10846.732 8332.312 5005.892 14950 14993.300 14000.554 12730.723 10882.775 8359.951 5022.144 15000 15042.054 14046.662 12772.805 10918.818 8387.591 5038.395 15050 15090.808 14092.769 12814.888 10954.862 8415.230 5054.646 15100 15139.562 14138.876 12856.970 10990.907 8442.870 5070.898 15150 15188.316 14184.984 12899.053 11026.953 8470.513 5087.149 15200 15237.070 14231.091 12941.136 11062.999 8498.159 5103.404 15250 15285.823 14277.198 12983.218 11099.045 8525.806 5119.671 15300 15334.600 14323.306 13025.301 11135.091 8553.454 5135.947 15350 15383.428 14369.413 13067.384 11171.137 8581.101 5152.223 15400 15432.255 14415.521 13109.467 11207.183 8608.749 5168.499 15450 15481.083 14461.628 13151.549 11243.229 8636.396 5184.775 15500 15529.910 14507.736 13193.632 11279.275 8664.044 5201.051 15550 15578.738 14553.843 13235.715 11315.322 8691.692 5217.327 15600 15627.566 14599.951 13277.798 11351.368 8719.339 5233.603 15650 15676.394 14646.059 13319.881 11387.414 8746.987 5249.880 15700 15725.225 14692.167 13361.964 11423.460 8774.635 5266.157 15750 15774.055 14738.280 13404.047 11459.506 8802.283 5282.434 15800 15822.886 14784.395 13446.131 11495.553 8829.931 5298.711 15850 15871.717 14830.510 13488.220 11531.599 8857.579 5314.988 15900 15920.547 14876.625 13530.308 11567.646 8885.227 5331.265 15950 15969.379 14922.740 13572.397 11603.692 8912.875 5347.543 16000 16018.211 14968.856 13614.485 11639.739 8940.524 5363.821 16050 16067.045 15014.974 13656.574 11675.785 8968.172 5380.099 16100 16115.879 15061.094 13698.663 11711.832 8995.821 5396.377 16150 16164.713 15107.217 13740.756 11747.879 9023.469 5412.656 16200 16213.548 15153.340 13782.853 11783.927 9051.118 5428.935 16250 16262.385 15199.465 13824.952 11819.979 9078.767 5445.215 16300 16311.224 15245.593 13867.053 11856.033 9106.416 5461.495 16350 16360.064 15291.724 13909.157 11892.091 9134.066 5477.776 16400 16408.906 15337.858 13951.264 11928.150 9161.715 5494.056 16450 16457.748 15383.997 13993.376 11964.213 9189.364 5510.337 16500 16506.591 15430.138 14035.496 12000.279 9217.013 5526.618 16550 16555.433 15476.280 14077.623 12036.357 9244.662 5542.899 16600 16604.276 15522.422 14119.753 12072.454 9272.321 5559.180 16650 16653.119 15568.564 14161.883 12108.565 9300.022 5575.461 16700 16701.961 15614.706 14204.012 12144.676 9327.778 5591.801 16750 16750.804 15660.847 14246.142 12180.787 9355.556 5608.333 16800 16799.646 15706.989 14288.272 12216.898 9383.333 5625.000  References: Busche, Matthew, Maximizing Expected Scores in the Game of Zilch. [HTML] Posted in Uncategorized | Leave a comment ## Coincidences in random shuffling revisited This is just a follow-up from my last post, collecting and cleaning up my thoughts on the behavior of “coincidences” in the random shuffled order of songs in a playlist (i.e., consecutive songs by the same artist). But first, it is useful to begin with the slightly different and more common problem mentioned by Chris in the comments. Consider the following game: You and a friend each have a brand new deck of cards. You shuffle your deck, and your friend shuffles his. Then each of you deals the top card of your respective decks face up next to each other on the table. If the two cards match, then you win one dollar. Continue dealing together, one card at a time through the entire deck, winning an additional dollar whenever two cards match. How much should you be willing to pay to play this game? What is the probability that you lose your initial wager (i.e., no two cards match)? This is the “hat check problem” in not-so-clever disguise– $n$ people check their hat at a restaurant, and the hats are returned randomly; what is the expected number of people who get their own hat, and what is the probability that no one gets their own hat? Or, more abstractly, given a random permutation $\pi \in S_n$, what is the expected number of fixed points of $\pi$ (i.e., the number of elements $j$ such that $\pi(j)=j$), and what is the probability that $\pi$ has no fixed points? Both of these questions have “nice” answers: the expected number of fixed points is 1, independent of $n$, and the probability of no fixed points approaches $1/e$ very quickly as $n\to\infty$. (In other words, you should be willing to play$1 to play the card game above, with less than a 37% chance of losing money.)  The former result is a nice application of linearity of expectation, the latter of inclusion-exclusion.  But I think it is useful to find ways to present interesting mathematics like this to students in ways that require less machinery, such as the notions of “random variables,” “expected value,” or the fact that it is “linear.”

For example: imagine creating a large table, with $n!$ rows, one for each possible permutation (or arrangement of cards in a shuffled deck, or distribution of hats to customers, etc.).  Each row of the table contains a single number, the number of fixed points of the corresponding permutation.  We want to compute the average of these $n!$ numbers (i.e., compute their sum and divide by $n!$).  At first glance, this seems hard to do; the numbers in the table range from 0 to $n$, with varying numbers of permutations having each possible number of fixed points.

But now let’s expand the table, making it wider so that instead of just a single column of numbers, we have $n$ columns, with a 1 in column $j$ if $j$ is a fixed point of the corresponding permutation, and 0 otherwise.  If we sum the entries in each row, we get the original single-column table.  The useful trick is to instead sum the entries in each column: for each column $j$, there are $(n-1)!$ permutations that fix $j$.  Thus, the sum of entries in the table (i.e., the total number of 1s) is $n(n-1)! = n!$; dividing by the number of rows $n!$, the average number of fixed points is 1.

Using this same table of 0s and 1s, let’s re-label the columns by defining the subsets of permutations $F_1, F_2, ..., F_n$, where $F_j = \{\pi : \pi(j)=j\}$.  Then the entry in “row $\pi$, column $j$” is 1 if $\pi\in F_j$, and 0 otherwise.  The cardinality of each $F_j$ is $(n-1)!$… but more generally, the cardinality of the intersection of any $k$ of these subsets is $(n-k)!$ (why?).  This is what makes the inclusion-exclusion formula for the probability of no fixed points so elegant:

$\frac{1}{n!} \sum_{k=0}^n (-1)^k {n \choose k}(n-k)!$

$= 1 - \frac{1}{1!} + \frac{1}{2!} - \frac{1}{3!} + ... + (-1)^n \frac{1}{n!}$

Why is this useful?  So far, none of this is new.  But coming back now to the original problem of shuffling songs in a playlist: instead of fixed points of the form $\pi(j)=j$, we are interested in “coincidences,” or “adjacencies” of the form $\pi(j)+1 = \pi(j+1)$, where $\pi(j)$ is the position in the shuffled playing order of Song $j$.  We define the corresponding “adjacency-preserving” subsets $A_1, A_2, ..., A_n$, where $A_j = \{\pi : \pi(j)+1 = \pi(j+1)\}$, with the last subset $A_n = \{\pi: \pi(n)+1 = \pi(1)\}$ corresponding to viewing the natural order of songs as cyclic.

The key observation is that these subsets have structure very similar to the $F_j$; indeed, the cardinality of each $A_j$ is also $(n-1)!$, so that the “sum over columns” approach above yields an expected number of coincidences of 1, independent of $n$, if we view the natural song order cyclically (i.e., if we “allow” the coincidences indicated by $A_n$), or $(n-1)/n$ if we don’t.

More generally, the cardinality of the intersection of any $k$ of the $A_j$ is also $(n-k)!$except when $k=n$.  In that case, instead of exactly one permutation with all $n$ possible fixed points (i.e., the identity), there is no shuffled playlist with all $n$ possible coincidences (why?).

Fortunately, this has very little effect on the essential result: the inclusion-exclusion formula for the probability of no coincidences (allowing the cyclic one) looks just like that for no fixed points, except that it is missing the final $(-1)^n \frac{1}{n!}$ term.

## Puzzle: Randomly shuffling songs in iTunes

“Based on experience I feel there is some algorithm other than randomness at work.  I have 5K plus songs and to have 2 consecutive songs play from the same album or the same artist – well that’s not random.” – Disparate blog comment

There has been a lot of debate over the “randomness,” or possible lack thereof, in Apple’s “shuffle” feature in its iTunes software and iPod products.  Some of the debate stems from confusion about how the feature works, or the need to distinguish between selection with/without replacement.  But more interesting– to me, anyway– are comments like the one above, that illustrate just how poorly we humans perceive “true” randomness, searching for patterns and order where none exist.  Also, they make for a source of nice math problems.

So let’s turn this problem into a puzzle: suppose that you have a library of 5000 songs in some natural order, e.g. with titles “Song 1,” “Song 2,” “Song 3,” etc., up to “Song 5000.”  Now shuffle and play the entire library.  Define a coincidence to be a pair of consecutive songs (in the natural order) that are played consecutively in the shuffled order.  For example, if you hear songs 137, 138, and 139 in order, that is 2 coincidences (137 followed by 138, and 138 followed by 139).

What is the expected (i.e., average) number of observed coincidences?  What is the probability that no coincidences will occur?

A couple of notes:

1. A slight variant of this problem was asked, but not correctly answered, on the xkcd forum.
2. The problem becomes slightly “cleaner” if we view the song library as cyclic, so that song 5000 followed by song 1 is also considered a coincidence.  How does the solution change in this case?

Froelich, Duckworth, and Culhane, Does your iPod Really Play Favorites? The American Statistician, 63(3):263-268. (August 2009) [PDF]

Posted in Uncategorized | 3 Comments

## Visualizing Asteroid 2012 DA14

This week’s middle school programming class takes a slight detour into astronomy, using Python to visualize the path of asteroid 2012 DA14 as it buzzes Earth this Friday 15 February, approaching within less than 28,000 kilometers (about 17,000 miles) of our planet, “the closest ever predicted Earth approach for an object this large.”

How close is this?  What I found interesting during discussion about this fly-by was the sense of scale when comparing the “close shave” of the asteroid’s path to the orbits of various satellites, natural and artificial.  With just a few dozen lines of code, we can visually compare the size of these orbits with the path of the asteroid.  A still screenshot of the result is shown below, showing the asteroid passing inside of geostationary orbit, with a view of the orbits of GPS satellites and the International Space Station for comparison.

The asteroid’s path compared with various satellite orbits. The asteroid itself is shown at 20,000 times its actual size (“only” about half a football field in diameter).

Following is the source code, which animates the rotating earth and the very simplified motion of the asteroid.

from visual import *

# Draw earth.

# Draw DA14 asteroid at 20000x size.
da14 = sphere(pos=[34100, -30000, 0], radius=450)

# Draw geostationary orbit.
r_geo = 42164
ring(radius=r_geo, thickness=200, axis=[0, 1, 0], color=color.yellow)
label(pos=[r_geo, 0, 0], text='GEO')

# Draw GPS orbits.
r_gps = 26600
for angle in range(0, 360, 60):
axis = rotate(
label(pos=[r_gps, 0, 0], text='GPS')

# Draw ISS orbit.
r_iss = 6378 + 410
axis = rotate([0, 1, 0], radians(51.6), [0, 0, 1])
label(pos=[r_iss, r_iss, 0], text='ISS')

# Animate asteroid and rotating earth at 500x speed.
fps = 10
ff = 500
while True:
rate(fps)
da14.pos.y = da14.pos.y + 7.82 / fps * ff
earth.rotate(angle=radians(360.0 / 86400 / fps * ff), axis=[0, 1, 0])
if da14.pos.y > 30000:
da14.pos.y = -30000

Posted in Uncategorized | 1 Comment

## Hunt the Wumpus

<nostalgia>I had way too much fun with this.  After a discussion with students about mazes and various ways to represent and display them in computer programs, I ended up writing a Python version of Wumpus.  I don’t mean a 3D graphics version using Visual Python, or even the 1980 TI version with the Wumpus graphic that I see on t-shirts.  I mean the original text version by Gregory Yob.  I remember being fascinated by this game as a kid– it was very likely my first introduction to graph theory, with the rooms of the Wumpus’s cave connected like the vertices of a squashed dodecahedron.  (I love this helpful suggestion from the game’s instructions: “If you don’t know what a dodecahedron is, ask someone.”)

You can download the game at the usual location here.  Although there are many versions of Wumpus out there, I had a few specific requirements that none of them met.  This version still works in Python 2.7, and is a shot-for-shot re-make of the game, behaving exactly as it appeared where I first encountered it, re-printed in the 1979 book More BASIC Computer Games.  My only tweak was to use friendlier lower case text instead of the original all-upper case, and to fix a few typos in the re-print.</nostalgia>

Posted in Uncategorized | 1 Comment