“Based on experience I feel there is some algorithm other than randomness at work. I have 5K plus songs and to have 2 consecutive songs play from the same album or the same artist – well that’s not random.” – Disparate blog comment

There has been a lot of debate over the “randomness,” or possible lack thereof, in Apple’s “shuffle” feature in its iTunes software and iPod products. Some of the debate stems from confusion about how the feature works, or the need to distinguish between selection with/without replacement. But more interesting– to me, anyway– are comments like the one above, that illustrate just how poorly we humans perceive “true” randomness, searching for patterns and order where none exist. Also, they make for a source of nice math problems.

So let’s turn this problem into a puzzle: suppose that you have a library of 5000 songs in some natural order, e.g. with titles “Song 1,” “Song 2,” “Song 3,” etc., up to “Song 5000.” Now shuffle and play the entire library. Define a *coincidence* to be a pair of consecutive songs (in the natural order) that are played consecutively in the shuffled order. For example, if you hear songs 137, 138, and 139 in order, that is 2 coincidences (137 followed by 138, and 138 followed by 139).

What is the expected (i.e., average) number of observed coincidences? What is the probability that no coincidences will occur?

A couple of notes:

- A slight variant of this problem was asked, but not correctly answered, on the xkcd forum.
- The problem becomes slightly “cleaner” if we view the song library as cyclic, so that song 5000 followed by song 1 is also considered a coincidence. How does the solution change in this case?

Further reading:

Froelich, Duckworth, and Culhane, Does your iPod *Really* Play Favorites? *The American Statistician*, **63**(3):263-268. (August 2009) [PDF]

### Like this:

Like Loading...

*Related*

I think the “clean” variation is the same problem I came across the other day: you’re taking a test where you have to match up 24 presidents of the 19th century to a shuffled list of their terms. If you guess at random how many do you expect to get correct? The answer is 1 regardless of the number of presidents to be matched. For the playlist, when “drawing” for the shuffled playlist there is exactly one possible coincidence just as there is exactly one correct answer on the quiz. This only applies to the cyclic version where the last song has a possible coincidence.

The non-cyclic version is less than one and the expected number of coincidences depends on the total number of songs. I think it’s “1 – (1 / n)”.

As for the probability of no coincidences, a Monte Carlo approach suggests it’s something like 0.37.

You have hit on exactly why I like this problem– it isn’t quite “the same problem,” but it has a lot in common with the “match the presidents” test that you describe. Even including the probability part: that is, what is the probability that you get zero answers right on your randomly completed test? (Hint: it’s also “something like 0.37.”) The challenge is to prove this; there is an elegant approach that nicely handles both of these situations (i.e., shuffling playlists and matching presidents).

Pingback: Coincidences in random shuffling revisited | Possibly Wrong