Are you a Bayesian or a frequentist? What do these terms mean, and what are the differences between the two? For me, these questions have never been terribly interesting, despite many attempts at answers given in the literature (see the references below for useful and entertaining examples).
My problem has been that explanations typically focus on the different approaches to expressing uncertainty, as opposed to different approaches to actually making decisions. That is, in my opinion, Bayesians and frequentists can argue all they want about what “the probability of an event” really means, and how much prior information the other camp has or hasn’t unjustifiably assumed… but when pressed to actually take an action, when money is on the table, everyone becomes a Bayesian.
Or do they? Following is an interesting puzzle that seems to more clearly distinguish the Bayesian from the frequentist, by forcing them both to put money on the table, so to speak:
Problem: You have once again been captured by bloodthirsty logical pirates, who threaten to make you walk the plank unless you can correctly predict the outcome of an experiment. The pirates show you a single irregularly-shaped gold doubloon selected from their booty, and tell you that when the coin is flipped, it has some fixed but unknown probability of coming up heads. The coin is then flipped 7 times, of which you observe 5 to be heads and 2 to be tails.
You must now bet your life on whether, in two subsequent flips of the coin, both will come up heads. If you predict correctly, you go free; if not, you walk the plank. Which outcome would you choose? (The pirates helpfully remind you that if you choose not to play, you will walk the plank anyway.)
I think this is an interesting problem because two different but reasonable approaches yield two different answers. For example, the maximum likelihood estimate of the unknown probability that a single flip of the coin will come up heads is 5/7 (i.e., the observed fraction of flips that came up heads), and thus the probability that the next two consecutive flips will both come up heads is (5/7)*(5/7)=25/49, or slightly better than 1/2. So perhaps a frequentist would bet on two heads.
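The frequentist calculation is easy to check with exact rational arithmetic; here is a minimal sketch using Python's fractions module.

```python
from fractions import Fraction

# Maximum likelihood estimate of P(heads), from observing 5 heads in 7 flips
p_hat = Fraction(5, 7)

# Probability that two subsequent independent flips both come up heads
p_two_heads = p_hat ** 2
print(p_two_heads)  # 25/49, slightly better than 1/2
```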
On the other hand, a Bayesian might begin with an assumed prior distribution on the unknown probability for a single coin flip, and update that distribution based on the observation of heads and tails. For example, using a "maximum entropy" uniform prior, the posterior distribution for the probability p of heads on a single flip is a beta distribution with parameters (6, 3), and so the probability of two consecutive heads is

P(\text{two heads}) = \int_0^1 p^2 \, \frac{p^5 (1-p)^2}{B(6, 3)} \, dp = \frac{B(8, 3)}{B(6, 3)} = \frac{7}{15}

where B(\cdot, \cdot) is the beta function. Since 7/15 is slightly less than 1/2, perhaps a Bayesian would bet against two heads.
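The Bayesian calculation can also be checked exactly. The following sketch evaluates the ratio of beta functions directly, using the fact that B(a, b) = (a-1)!(b-1)!/(a+b-1)! for positive integer arguments.

```python
from fractions import Fraction
from math import factorial

def beta(a, b):
    # Beta function B(a, b) for positive integer arguments,
    # via B(a, b) = (a-1)! (b-1)! / (a+b-1)!
    return Fraction(factorial(a - 1) * factorial(b - 1), factorial(a + b - 1))

# Posterior after 5 heads and 2 tails with a uniform prior: Beta(6, 3).
# Probability of two consecutive heads is B(8, 3) / B(6, 3).
p_two_heads = beta(8, 3) / beta(6, 3)
print(p_two_heads)  # 7/15, slightly worse than 1/2
```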
What would you do?
(A couple of comments: first, one might reasonably complain that observing just 7 coin flips is simply too small a sample to make a reasonably informed decision. However, the dilemma does not go away with a larger sample: suppose instead that you initially observe 17 heads and 7 tails, and are again asked to bet on whether the next two flips will come up heads. Still larger samples exist that present the same problem.
Second, a Bayesian might question the choice of a uniform prior, suggesting as another reasonable starting point the "non-informative" Jeffreys prior, which in this case is the beta distribution with parameters (1/2, 1/2). This has a certain cynical appeal to it, since it effectively assumes that the pirates have selected a coin which is likely to be biased toward either heads or tails. Unfortunately, this also does not resolve the issue.)
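Both of these follow-up claims can be verified numerically. The sketch below (the function names are mine) compares the maximum likelihood answer with the Bayesian posterior predictive probability, using the identity E[p^2] = A(A+1)/((A+B)(A+B+1)) for p ~ Beta(A, B), for both the larger sample and the Jeffreys prior.

```python
from fractions import Fraction

def mle_two_heads(h, t):
    # Plug-in maximum likelihood estimate: (h / (h + t))^2
    return Fraction(h, h + t) ** 2

def bayes_two_heads(h, t, a, b):
    # Posterior predictive P(two heads) with a Beta(a, b) prior:
    # E[p^2] under the Beta(h + a, t + b) posterior, using
    # E[p^2] = A (A + 1) / ((A + B)(A + B + 1)) for p ~ Beta(A, B)
    A = Fraction(h) + a
    B = Fraction(t) + b
    return A * (A + 1) / ((A + B) * (A + B + 1))

uniform = (Fraction(1), Fraction(1))              # Beta(1, 1) uniform prior
jeffreys = (Fraction(1, 2), Fraction(1, 2))       # Beta(1/2, 1/2) Jeffreys prior

# Larger sample: 17 heads and 7 tails -- the disagreement persists
print(mle_two_heads(17, 7))              # 289/576, just over 1/2
print(bayes_two_heads(17, 7, *uniform))  # 19/39, under 1/2

# The Jeffreys prior also does not resolve the issue
print(bayes_two_heads(5, 2, *jeffreys))  # 143/288, still under 1/2
```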
1. Jaynes, E. T., Probability Theory: The Logic of Science. Cambridge: Cambridge University Press, 2003 [PDF]
2. Lindley, D. V. and Phillips, L. D., Inference for a Bernoulli Process (A Bayesian View), The American Statistician, 30:3 (August 1976), 112-119 [PDF]