Secret Santa Revisited

Last week’s post presented two similar problems, the first being a simpler “standard” probability puzzle, and the second being rather more challenging.  Since the first problem was basically just a lead-in to this discussion, I won’t spend time on it here; there are plenty of good descriptions of solutions elsewhere (see here, for example).

The harder second problem asks essentially the same question as discussed in the Deranged Secret Santa post from last year: suppose that n participants in a Secret Santa gift exchange each put a slip of paper with their name on it into a single hat.  To determine gift-giving assignments, each participant in turn draws a slip from the hat, replacing it with another if he draws his own name.  What is the probability that this procedure “fails,” with the last person to draw finding his own name on the last slip in the hat?

I did some simulation and doodling on this problem last year, with limited success.  After revisiting the problem this year, there are a few more concrete and interesting results to report.

First, as usual, the problem has already been solved; the solution by Brian Parsonnet appears as sequence A136300 in the Online Encyclopedia of Integer Sequences.  (The sequence is parameterized by the number n of participants, where each integer value of the sequence is the numerator of the corresponding probability with denominator (n-1)!^2.)  However, the given recurrence relation is rather complex, and appears to require O(2^n) time and space to compute.

It turns out that we can solve the problem more simply and efficiently (in quadratic time) by transforming the problem slightly.  Note that the statement of the problem does not explicitly specify any particular order in which participants draw slips from the hat.  That is, it does not really matter who is the last person to draw; all that we want to know is the probability that the last person– whoever that may be– ends up drawing their own name.

At each turn, then, we can randomize the selection of the next person to draw, without affecting the desired probability.  (In the context of last week’s airplane boarding problem, the passengers do not stand in a line; they jostle in a group near the jetway entrance, and each next person to board is selected randomly.)  This helps a lot: given n total participants, let m be the number still waiting to draw, and let k \leq m be the number waiting to draw whose names have already been drawn.  Then the probability that the last person to draw is left with his own name is p(n,0), where p(m,k) is given by the following recurrence (with apologies for WordPress’s limited LaTeX multi-line formatting):

p(m,k) = \frac{k}{m}\left[\frac{k}{m} p(m-1,k-1) + (1 - \frac{k}{m}) p(m-1,k)\right]

+ (1 - \frac{k}{m})\left[\frac{k}{m-1} p(m-1,k) + (1 - \frac{k}{m-1}) p(m-1,k+1)\right]

p(1,k) = 1 - k

As the following plot shows, the probability of failure varies approximately as 1/n; the probability for n=100 in the original problem is approximately 0.0095018.

Probabiilty that the last person draws his own name vs. group size.

Finally, this slight transformation of the problem yields an interesting additional bonus: recall the “warning” given in last year’s post, suggesting that you probably do not want to use this procedure at all for your Secret Santa gift exchange.  The problem is that all possible permutations (specifically, all possible derangements) mapping gifters to recipients are not equally likely.  Even if you handle the small failure possibility of the last person drawing his own name by starting the whole procedure over, some participants are still more likely to draw certain names than others.  (As mentioned last year, the most extreme deviation is near the end of the line, with the last person to draw being nearly twice as likely to draw the next-to-last person as to draw the first person in line.  See the Mathematica source code below to calculate these probabilities.)

We can relax that warning a bit, based on the following observation: as in the puzzle solution above, if we modify the procedure slightly by first randomizing the order in which participants draw names… then the resulting random assignment does at least have the nice property that each person is equally likely to draw any other person’s name!

I verified this numerically for n \leq 8 using the code below, but I think this follows generally from a simple symmetry argument.  However, this is a weaker statement than the stronger claim that all derangements are equally likely, which still does not hold (n=4 is a minimal counterexample).

board[seats_, assigned_, prob_] :=
  {passenger, s, p},
  If[Length[seats] === 1,
   If[seats === {Length[assigned] + 1}, Return[]];
    (overall[[#2[[1]], #1]] += prob) &,
    Append[assigned, First[seats]]
   passenger = Length[assigned] + 1;
   s = DeleteCases[seats, passenger];
   p = 1/Length[s];
    board[DeleteCases[seats, #], Append[assigned, #], prob*p] &,

n = 10;
overall = Table[0, {n}, {n}];
board[Range[n], {}, 1];
This entry was posted in Uncategorized. Bookmark the permalink.

7 Responses to Secret Santa Revisited

  1. Pingback: Deranged Secret Santa | Possibly Wrong

  2. Pingback: Secret Santa Surprising Probability – Sunday Puzzle – Mind Your Decisions

  3. Pingback: Secret Santa puzzle | Possibly Wrong

  4. Cedric Mamo says:

    I am reposting my earlier comment due to html formatting errors:

    I tried tackling the problem myself and I pretty much ended with the same recurrence you came up with. The recurrence specified is valid, but some considerations can be made when trying to calculate it, for example when writing a program for calculating it, which I thought I should share.

    I’ll just repeat these here for clarity:

    N = the number of participants
    M = the number of participants still waiting to draw
    K = the number of participants still waiting to draw whose name was already drawn
    Q = the number of possible pairs of M and K (see below for this one)

    First of all it’s quite trivial to notice that, while the recurrence seems to recurse 4 times, P(M – 1, K) is needed twice, so that can be factored out. And boom, the formula recurses only 3 times now. So that’s already a significant speedup with very little effort.

    Second of all, you specified that K <= M. In reality, K can be significantly further constrained. The first obvious improvement is to notice that you can’t have less than 0 people whose name has already been drawn. So you can make that constraint 0 <= K <= M.

    Thirdly, since K is the number of people waiting to draw whose card has already been drawn, then K cannot be bigger than the number of people that have already drawn either. So K can be further constrained to 0 <= K <= MIN(M, N – M).

    If K is outside of that range, then whatever the result is, it will eventually get multiplied by 0 somewhere (in fact it must be, because it otherwise gives nonsensical answers, such as negative probabilities or probabilities > 1).

    So, although the recurrence above works out because of the eventual multiplications by 0, when writing a program to work it out, it’s also helpful to add these 2 rules to the recurrence, because it will drastically reduce the number of recursions that have to be made.

    P(M, K < 0) = 0
    P(M, K > MIN(M, N – M)) = 0

    Finally there is another way you can speed things up significantly. Kind of related to the first point, but across multiple levels of recursion, instead of only the current level. Because of the rule P(1, K) = 1 – K, we know that 1 <= M <= N, and due to the extra constraints added to K earlier, we also know that 0 <= K <= MIN(M, N – M). Another important thing is that M and K are both integers. They start out as integers and we either increment or decrement them, so they’ll always be integers. That means that the number of pairs of (M,K), which I’ll call Q, is finite, and when taking into account all the constraints above, you can work out that Q = ((-1)^N + 2 * N^2 – 1) / 8.

    This means that, given N people, the function p can only ever output q distinct results. This is helpful. Since Q < N^2, but the recurrence recurses 3 times, the number of times that the function P needs to run is much higher than Q. But since Q is the number of possible distinct results of P, that means that we’re still calculating a bunch of redundant stuff. So the calculation can be sped up very significantly, by simply trading some space for time. You can build a memoization table, indexed by M and K, storing the results of P(M,K), and it can only contain a maximum of Q items, where Q is quite reasonable so it won’t take up too much space. This will ensure that you never evaluate P more than Q times.

    Using all of the above optimizations, I managed to calculate P(M, 0) for 2 <= M <= 1000 with a C# program in less than 5 seconds on a 2.7GHz Intel Core i5-2450M.

  5. Cedric Mamo says:

    Futhermore, another advantage comes with the tighter restriction on K.

    I previously said that while you do not need to recurse for “nonsensical” values of k, it isn’t harmful to do so either, because the result gets multiplied by zero eventually. And that is true in the ideal world of pure mathematics.

    However, this formula will need to be calculated by a computer eventually. The above still holds if you’re trying to come up with an exact answer, by working out all intermediate values as Rational numbers (with a BigInt as the numerator and denominator). The problem occurs when you don’t care about an exact answer, and decide to work with floating point numbers.

    See, in math, X * 0 = 0 and that is *usually* the case too with floating point numbers.

    For “sensible” values of k (ones falling within the constraints I set out in my previous comment) the result is always a value [0,1].

    But for “nonsensical” values of k, the result could be anything and the formula stops producing valid values.

    Now here’s the thing. Suppose I implemented the above recursion and decided to speed things up by implementing the memoization table, but don’t restrict the values of K to only the sensible ones. So I build a table that can hold all values (1 <= M <= N, -M <= K <= M). The algorithm is still sped up significantly, with not that much more space needed. We previously said that X * 0 = 0. But the “nonsensical” values of K produce nonsensical outputs, which can actually result in them producing a NaN when working with floating point. In my case this starts occurring when N is somewhere around 450 when working with IEEE754 doubles. This here is where the math breaks down. For a computer, X * NaN = NaN.
    if X = 0, then that gives us 0 * NaN = NaN. Something multiplied by 0 isn’t equal to 0 anymore! Which means that the nonsensical answers aren’t cancelled out anymore!

    So basically, when working with floating point numbers, not only does restricting K properly give you a faster calculation, it also gives you a lot more numerical stability.

  6. Cedric Mamo says:

    In my program, implementing both the additional restrictions on k, as well as the memoization table, the exact algorithmic complexity I got in my final program (verified experimentally by counting memoization table misses) is

    O((2n^2 + 4n – 15 + (-1)^(N-1)) / 8)

    which for large values of N can simply be approximated as O(N^2). So the algorithm is indeed quadratic.

  7. Cedric Mamo says:

    I made a small mistake above. I made the mistake of forgetting to take into account that values of the form P(m,0) also need to be stored in the memoization table. So a revised formula for Q would be the following:

    Q = (2n^2 + 8n – 17 + (-1)^n) / 8

    So Q is the minimum number of evaluations of P that absolutely need to be made. One might notice that the algorithmic complexity mentioned in my previous comment is actually LESS than this. This is not an error. That is simply due to another slight optimization I made.

    Basically, for all P(2, K), the 2nd line of the recurrence (the part that looks like “(1 – (k/m))[…..]”) cancels out.

    So in my program I replaced the rule “P(1,K) = 1 – K” with “P(2, K) = (K / 2) * ((K / 2) * (2 – K) + (1 – (K / 2)) * (1 – K))” and simply disallowed all evaluations of the form P(1, K).

    So now I can’t calculate the odds of the game failing when there is only one participant anymore. But that’s not interesting since that probability is always 100%. But I save up on a little bit more redundant calculations. The difference is negligible in practice, but I wanted to opt for the absolute minimum number of calculations required. And when calculating with rationals, instead of floating point numbers, the difference is actually measurable for very large values of N (though still negligible, since the numbers involved are very small anyway)

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.