Someone at work this past week asked me about the problem of computing the probability that, in a group of people, some pair of persons share the same birthday. I am not sure what prompted the question, despite the interesting coincidence that in fact he and I share the same birthday.

The problem is a commonly used example of the failure of our intuition when dealing with probabilities in general and probabilities of “coincidences” in particular. A survey of university students asked for an estimate of the number of people required for the probability of a shared birthday to exceed 1/2 (see here). The median response was 385 people. Think about that– more than half of those surveyed guessed a number of people that is *guaranteed* to yield a shared birthday.

So what is the correct answer? I think this problem makes for a great experiment for students, since it can be approached in several ways. First, the mathematics: the desired probability is 1 minus the probability that all birthdays are distinct. Assuming all birthdays are equally likely (we’ll address this assumption shortly), we have

Next, the programming: students can write a function to compute , which requires both iteration and some thought about order of operations. It will not do to compute the numerator and denominator explicitly and then divide, at least in most languages that lack arbitrary-precision arithmetic.

A plot showing versus , and thus the answer to the problem, can be seen here. (<soapbox> I find it interesting and amusing when I hear comments about Wikipedia not being a useful source of information. Such comments seem to be based on content that is almost always political or religious. What does one expect in those cases? That is simply not the type of content for which the “anyone can edit” approach is useful. Particularly for technical information, I think Wikipedia contains a lot of useful *starting points*… that still require a brain in the head of the viewer to verify or extend the results found there.</soapbox>)

Finally, the experiments: the answer to the birthday problem may be sufficiently non-intuitive that students may not trust the mathematics. A classroom of 25-30 students, or better yet, collection of data from multiple classrooms, provides a handy experimental sample to corroborate the analysis. This is also a good time to come back to the uniformity assumption that all birthdays are equally likely. Students can do Monte Carlo analysis, writing programs to generate “classrooms” of data, possibly with non-uniform distribution of birthdays… and observe for themselves that such non-uniformity actually *increases* the likelihood of shared birthdays.

I was motivated to write this post by the Wikipedia article mentioned earlier. One small section describes a variant of this question that I think is more directly applicable to the cryptographic “birthday attack,” so named for its relation to this problem. Suppose that a movie theater is offering a free ticket to a new movie to the first person at the box office with the same birthday as someone earlier in line. Where should you stand in line? What is the expected position in line to win the free ticket?

This is also a nice problem for students, particularly because its exact solution does not require evaluation of *any* of the actual probabilities! In fact, the answer may be obtained from no more than the root of a quadratic. I won’t spoil it for those wanting to work out the solution, but I will point out that this is a good example, as mentioned above, where Wikipedia gets it wrong (at least at time of this writing). Unless I misunderstand the statement, the solution does not involve maximizing the difference between probabilities for consecutive positions in line.

I was once curious about whether birthdays are uniform so I found some data. It turns out that there is a periodic pattern if you graph it – weekdays are more common than weekends in any particular year. I guess doctors don’t want to do C-sections or inducements on the weekend 🙂

Of course this applies most to classrooms where students are likely to be the same age.

-bill

Interesting; I guess that makes sense. I was familiar with a seasonal variation (there is a good paper available here with this data and some other generalizations of the problem), but not over days of the week.

Pingback: More Mathematics in The Hunger Games | Possibly Wrong