## The clueless student: solution

This is to capture notes on my progress toward a solution to the “matching quiz” problem presented in the previous post, which can be re-stated more concisely as follows: what is the optimal strategy for choosing quiz answers specified as a function $f : [n] \rightarrow [n]$, not necessarily a bijection, that maximizes the probability of agreeing with a randomly chosen “solution” bijection in at least $m$ questions?

A strategy is characterized by a partition $\lambda$ of $n$, with the parts corresponding to the sizes of the classes of the equivalence kernel of the chosen function $f$.  The number of parts in the partition indicates the number of distinct quiz answers.  For example,

$1 + 1 + ... + 1 = n$

corresponds to a bijection, and the single-part partition $n=n$ corresponds to a constant function, i.e., writing the same answer for all $n$ questions.  Note that any function with the same partition “signature” has the same probability of success.

So, given a guessing strategy specified as a partition with $k$ parts:

$\lambda = \lambda_1 + \lambda_2 + ... + \lambda_k = n$

what is the probability of getting at least $m$ answers correct?  First, we can compute the probability $p(S)$ that exactly some subset $S \subseteq [k]$ of distinct answers is correct using inclusion-exclusion:

$p(S) = \frac{1}{n!} \sum\limits_{T \supseteq S} (-1)^{|T|-|S|}(n-|T|)! \prod\limits_{i \in T} \lambda_i$

And so the overall probability of at least $m$ correct answers is

$P(success) = \sum\limits_{|S| \geq m} p(S)$

We can collapse this sum of sums by observing that a particular subset $T \subseteq [k]$ in the inner summation appears ${|T| \choose |S|}$ times in the outer summation for a given size $|S|$.  Using the identity

$\sum\limits_{s=m}^t (-1)^{t-s} {t \choose s} = (-1)^{t-m} \frac{m}{t} {t \choose m}$

we can compute the overall probability of at least $m$ correct answers as

$P(success) = \frac{1}{n!} \sum\limits_{|T| \geq m} (-1)^{|T|-m} \frac{m}{|T|} {|T| \choose m} (n-|T|)! \prod\limits_{i \in T} \lambda_i$

At this point, I’m stuck; I don’t see a better approach than simply evaluating all possible strategies (i.e., generating all integer partitions of $n$) for small values of $1 \leq m \leq n$, and looking for patterns.  Following is the conjectured optimal strategy which holds for $n \leq 16$:

• If $m=1$, then as pointed out in the previous post, write down the same answer $n$ times, guaranteeing exactly one correct answer.
• For $m=2$, write one answer $\lfloor n/2 \rfloor$ times, and another (different) answer $\lceil n/2 \rceil$ times.
• For $m=n-1$, select an “almost-permutation,” corresponding to the integer partition $1+1+...+1+2=n$, so that exactly one answer gets duplicated (and thus exactly one answer is missing).
• Otherwise, select a random permutation.

Actually, we can describe this strategy more efficiently as follows: (a) select $m$ distinct answers, and distribute them as evenly as possible among the $n$ questions… unless $2, in which case (b) just select a random permutation.  I think (a) is what intuition would suggest is always the optimal approach– it was my wife’s immediate response when I described the problem to her– but I don’t have a good explanation for the qualification (b).

This entry was posted in Uncategorized. Bookmark the permalink.