This Riddler puzzle is about a random elimination game. Will someone remain at the end, or will everyone be eliminated?
In the first round, every subject simultaneously chooses a random other subject on the green. (It’s possible, of course, that some subjects will be chosen by more than one other subject.) Everybody chosen is eliminated. In each successive round, the subjects who are still in contention simultaneously choose a random remaining subject, and again everybody chosen is eliminated. If there is eventually exactly one subject remaining at the end of a round, he or she wins and heads straight to the castle for fêting. However, it’s also possible that everybody could be eliminated in the last round, in which case nobody wins and the king remains alone. If the kingdom has a population of 56,000 (not including the king), is it more likely that a prince or princess will be crowned or that nobody will win? How does the answer change for a kingdom of arbitrary size?
Here is my solution:
[Show Solution]
The key feature of this sequential elimination game is that the likelihood of each possible outcome depends only on how many people are still playing. So we can define $x_n$ to be the probability that somebody will end up winning the game, given that $n$ people are currently playing. We have the base cases $x_0=0$ and $x_1=1$, and our goal is to figure out $x_{56000}$.
At every round, some number of people are eliminated. This continues until we have $0$ or $1$ people left. Define $B(m,n)$ to be the number of ways of having $m$ people left for the next round, given that we currently have $n$ people. We will now explain how to count this.
- Everyone is eliminated: this is the quantity $B(0,n)$. In order for this to happen, each person must get picked exactly once, so the picks form a permutation of $\{1,\dots,n\}$. But since people cannot pick themselves, we must count only the permutations with no fixed points. These are called derangements. The number of derangements satisfies the recursion:
\begin{align}
B(0,0) &= 1, \quad B(0,1) = 0\\
B(0,n+1) &= n B(0,n) + n B(0,n-1)\quad\text{for: }n=1,2,\dots
\end{align}There is also a closed-form formula that holds for $n\ge 1$, given by $B(0,n) = \lfloor \tfrac{n!}{e} + \tfrac{1}{2} \rfloor$.
- Partial elimination: suppose we drop from $n$ people down to $m$ people with $m > 0$. This is the quantity $B(m,n)$. In this case, we must count the functions from $\{1,\dots,n\}$ to itself that (i) have no fixed points and (ii) leave precisely $m$ of the elements untouched; equivalently, fixed-point-free surjections from $\{1,\dots,n\}$ onto an $(n-m)$-element subset. We can count these recursively as well. Let $F(n,k)$ be the number of surjections from $\{1,\dots,n\}$ to $\{1,\dots,k\}$ with no fixed points. Let’s count how the number of surjections changes if we remove element $n$ from the domain. There are two cases: if we are left with a surjection from $\{1,\dots,n-1\}$ to $\{1,\dots,k\}$, i.e. the removed element was a redundant pairing, then it could have linked to any of the $k$ elements of the image, so it contributes $k F(n-1,k)$ surjections. Alternatively, if the removed pairing was a critical link, then we are left with a surjection from $\{1,\dots,n-1\}$ onto a subset of $\{1,\dots,k\}$ with one element removed. The missing link must connect to the removed element, which can happen in $k$ ways ($k$ possible missing elements), so the net contribution is $k F(n-1,k-1)$. Consequently, we have the recursion $F(n,k) = kF(n-1,k) + kF(n-1,k-1)$. Since the $m$ untouched elements can be any of ${n \choose m}$ subsets, we can relate $F$ to $B$ via the formula $B(m,n) = {n \choose m} F(n,n-m)$. Substituting and simplifying, we obtain a recursion only in terms of the $B$’s (verified numerically in the code sketch following this list):
\begin{align}
B(m,n) = \frac{n(n-m)}{m} B(m-1,n-1) + n B(m,n-1)
\end{align}
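To make these recursions concrete, here is a minimal Python sketch (the function names are just for illustration) that tabulates $B(m,n)$ and cross-checks the results against brute-force enumeration of all $(n-1)^n$ possible assignments for small $n$:

```python
from itertools import product

def build_B(N):
    """B[m][n] = number of assignments of n players (each picking someone
    other than themselves) that leave exactly m players unpicked."""
    B = [[0] * (N + 1) for _ in range(N + 1)]
    B[0][0] = 1  # the empty assignment
    # derangement numbers: B(0, n+1) = n*(B(0, n) + B(0, n-1))
    for n in range(1, N):
        B[0][n + 1] = n * (B[0][n] + B[0][n - 1])
    # partial elimination: B(m, n) = n(n-m)/m * B(m-1, n-1) + n * B(m, n-1)
    for n in range(2, N + 1):
        for m in range(1, n):
            B[m][n] = n * (n - m) * B[m - 1][n - 1] // m + n * B[m][n - 1]
    return B

def brute_B(n):
    """Count survivors directly over all (n-1)^n possible assignments."""
    counts = [0] * (n + 1)
    choices = [[j for j in range(n) if j != i] for i in range(n)]
    for f in product(*choices):
        counts[n - len(set(f))] += 1
    return counts

B = build_B(6)
for n in range(2, 7):
    assert [B[m][n] for m in range(n + 1)] == brute_B(n)
print([B[0][n] for n in range(7)])  # derangements: [1, 0, 1, 2, 9, 44, 265]
```

The floor division in the partial-elimination update is exact, since $n(n-m)B(m-1,n-1)$ is always divisible by $m$.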
Let $A(m,n)$ be the probability that we transition from $n$ people to $m$ people in one move. To convert $B(m,n)$ to $A(m,n)$, we divide by the total number of possible assignments. Each person can pick one of the $(n-1)$ other people, so there are $(n-1)^n$ total choices possible. Therefore, $A(m,n) = (n-1)^{-n} B(m,n)$. Substituting into the formulas above, we can obtain a recursion for the $A$’s:
\begin{align}
A(0,0) &= 1,\quad A(0,1) = 0,\quad A(1,1) = 1,\quad A(n,n) = 0,\quad n=2,3,\dots \\
A(0,n+1) &= \tfrac{(n-1)^n}{n^n} A(0,n) + \tfrac{(n-2)^{n-1}}{n^n} A(0,n-1),\quad n=1,2,\dots\\
A(m,n+1) &= \tfrac{(n-1)^n(n+1)}{n^{n+1}} \left( \tfrac{n-m+1}{m} A(m-1,n) + A(m,n) \right),\quad m=1,2,\dots,n
\end{align}Using these formulas, we can compute any of the transition probabilities we like. Here is what the $A(m,n)$ matrix looks like for $0\le m,n \le 6$:
\[
A = \small\begin{bmatrix}
1.0000& 0& 1.0000& 0.2500& 0.1111& 0.0430& 0.0170\\
0& 1.0000& 0& 0.7500& 0.5926& 0.4102& 0.2458\\
0& 0& 0& 0& 0.2963& 0.4688& 0.5069\\
0& 0& 0& 0& 0& 0.0781& 0.2150\\
0& 0& 0& 0& 0& 0& 0.0154\\
0& 0& 0& 0& 0& 0& 0\\
0& 0& 0& 0& 0& 0& 0
\end{bmatrix}
\]So for example, if we had 4 people playing the game, there would be a 29.63% chance that 2 people get eliminated, a 59.26% chance that 3 people get eliminated, and an 11.11% chance that everyone gets eliminated. Note that the columns of this matrix necessarily sum to 1. We can think of the game as a Markov chain where $A$ is the transition matrix. If the current distribution of the population is $z$ (i.e. $z_k$ is the probability that we have a population of $k$), then the distribution after one round of the game is found via the matrix multiplication $Az$. We want to know what happens after infinitely many rounds have been played; in other words, we want to find the limiting distribution.
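As a sanity check, the short sketch below (reusing build_B from the previous snippet) converts the counts into probabilities via $A(m,n) = B(m,n)/(n-1)^n$ and reproduces the $n=4$ column quoted above:

```python
def build_A(N):
    """A[m][n] = probability that n players drop to m players in one round."""
    B = build_B(N)
    A = [[0.0] * (N + 1) for _ in range(N + 1)]
    A[0][0] = 1.0  # conventions for the absorbing states
    A[1][1] = 1.0
    for n in range(2, N + 1):
        for m in range(n + 1):
            A[m][n] = B[m][n] / (n - 1) ** n
    return A

A = build_A(6)
print([round(A[m][4], 4) for m in range(5)])  # [0.1111, 0.5926, 0.2963, 0.0, 0.0]
```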
One way to find the limiting behavior is to solve the eigenvector equation $x^T A = x^T$ together with the boundary conditions $x_0 = 0$ and $x_1 = 1$; the solution is precisely the vector of winning probabilities. Another way is to simply compute $A^k$ for some sufficiently large $k$. Upon doing this, we can plot the limiting probability of a winner as a function of the initial population:
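Here is a sketch of one way to carry out this computation without forming large matrices: build the columns of $A$ one at a time using the recursion derived earlier, and accumulate $x_n = \sum_m A(m,n)\,x_m$ as you go. (This is an illustrative sketch, not necessarily the exact code used to produce the plot.)

```python
def win_probabilities(N):
    """x[n] = probability of an eventual winner starting from n players,
    computed by building the columns of A one at a time."""
    x = [0.0, 1.0]                 # base cases x_0 = 0, x_1 = 1
    col_prev = [1.0]               # column n = 0 of A
    col_curr = [0.0, 1.0]          # column n = 1 of A
    for n in range(1, N):          # build column n+1 from columns n and n-1
        c = ((n - 1) / n) ** n
        new = [0.0] * (n + 2)
        new[0] = c * col_curr[0] + ((n - 2) / n) ** (n - 1) / n * col_prev[0]
        for m in range(1, n + 1):
            new[m] = c * (n + 1) / n * ((n - m + 1) / m * col_curr[m - 1] + col_curr[m])
        # A(n+1, n+1) = 0, so x_{n+1} depends only on x_0, ..., x_n
        x.append(sum(new[m] * x[m] for m in range(n + 1)))
        col_prev, col_curr = col_curr, new
    return x

x = win_probabilities(1000)
print(x[2], x[3], x[4])  # 0.0, 0.75, 0.5926...
```

Each new column takes $O(n)$ work, so computing up to population $N$ costs $O(N^2)$ operations overall.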
There is a fascinating oscillatory behavior that occurs as the population gets large. Namely, the probability of having a winner oscillates about 0.5, with a period (in $n$) that increases exponentially. When plotted against $\log_{10}(n)$ as above, the winning probability appears to converge to a sinusoid. I couldn’t compute all the way up to 56,000, but the pattern that emerged was clear, so I extrapolated out what I couldn’t compute. The best-fit curve I found (empirically) for the probability of having an eventual winner for large $n$ is:
\[
x_n \approx 0.503+0.0265 \sin(14.5 \log_{10}(n)-1.27)
\]Substituting $n=56000$, we find there is a 47.65% chance somebody wins and a 52.35% chance nobody wins.
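For reference, evaluating the fitted curve at $n=56000$:

```python
import math
n = 56000
print(0.503 + 0.0265 * math.sin(14.5 * math.log10(n) - 1.27))  # approximately 0.4765
```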
Note: The limiting period can be explained as follows. Each of the other $n-1$ players picks a given person with probability $\frac{1}{n-1}$, so the probability that this person doesn’t get picked is $(1-\frac{1}{n-1})^{n-1}$, and by linearity of expectation, we can expect to have $n(1-\frac{1}{n-1})^{n-1}$ people remaining after one round. In the limit $n\to\infty$, this approaches $n e^{-1}$. Since the population roughly shrinks by a factor of $e$ at each iteration, we can expect the probability distribution to repeat with a period of $\log_{10}(e)$ on our plot. This explains why the frequency inside the sinusoid in our approximate formula turned out to be $\frac{2\pi}{\log_{10}(e)} \approx 14.5$.
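A quick numerical check of this heuristic, reusing build_A from the sketch above:

```python
import math

A = build_A(50)
n = 50
print(sum(m * A[m][n] for m in range(n + 1)))  # exact expected number of survivors
print(n * (1 - 1 / (n - 1)) ** (n - 1))        # the same quantity in closed form
print(n / math.e)                              # the large-n approximation
print(2 * math.pi / math.log10(math.e))        # about 14.47, the frequency in the fit
```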
Aside from the observation above, I couldn’t find a complete analytic characterization of the limiting win probability, so if anybody has ideas I’d love to hear them!