This Riddler puzzle is about a random elimination game. Will someone remain at the end, or will everyone be eliminated?

In the first round, every subject simultaneously chooses a random other subject on the green. (It’s possible, of course, that some subjects will be chosen by more than one other subject.) Everybody chosen is eliminated. In each successive round, the subjects who are still in contention simultaneously choose a random remaining subject, and again everybody chosen is eliminated. If there is eventually exactly one subject remaining at the end of a round, he or she wins and heads straight to the castle for fêting. However, it’s also possible that everybody could be eliminated in the last round, in which case nobody wins and the king remains alone. If the kingdom has a population of 56,000 (not including the king), is it more likely that a prince or princess will be crowned or that nobody will win? How does the answer change for a kingdom of arbitrary size?

Here is my solution:

[Show Solution]

The key feature of this sequential elimination game is that the likelihood of each possible outcome only depends on how many people are playing. So we can define the vector $x_n$ to be the probability somebody will end up winning the game, given that $n$ people are currently playing. We have the base cases $x_0=0$ and $x_1=1$, and our goal is to figure out $x_{56000}$.

At every round, some number of people are eliminated. This continues until we have $0$ or $1$ people left. Define $B(m,n)$ to be the number of ways of having $m$ people left for the next round, given that we currently have $n$ people. We will now explain how to count this.

**Everyone is eliminated:**this is the quantity $B(0,n)$. In order for this to happen, each person must get picked exactly once. The number of ways this can happen is precisely the number of permutations of $\{1,\dots,n\}$. But it’s a bit more complicated, because people cannot eliminate themselves. So we must count the number of permutations with no fixed points. These are called derangements. The total number of derangements satisfies the recursion:

\begin{align}

B(0,0) &= 1, \quad B(0,1) = 0\\

B(0,n+1) &= n B(0,n) + n B(0,n-1)\quad\text{for: }n=1,2,\dots

\end{align}There is also a closed-form formula that holds for $n\ge 1$, given by $B(0,n) = \lfloor \tfrac{n!}{e} + \tfrac{1}{2} \rfloor$.**Partial elimination:**suppose we drop from $n$ people down to $m$ people with $m > 0$. This is the quantity $B(m,n)$. In this case, we must count the number of surjections from $\{1,\dots,n\}$ to itself such that (i) there are no fixed points and (ii) precisely $m$ of the elements are left untouched. We can count this recursively as well. Let $F(n,k)$ be the number of surjections from $\{1,\dots,n\}$ to $\{1,\dots,k\}$ with no fixed points. Let’s count how the number of surjections changes if we remove element $n$ from the pre-image. There are two cases: if we are left with a surjection from $\{1,\dots,n-1\}$ to $\{1,\dots,k\}$, i.e. the removed element was a redundant pairing, then it could have linked to any of the $k$ elements from the image. In other words, it contributed $k F(n-1,k)$ surjections. Alternatively, if the removed pairing was a critical link, then we will be left with a surjection from $\{1,\dots,n-1\}$ to a subset of $\{1,\dots,k\}$ with one element removed. The missing link must connect to the removed element, which can happen in $k$ ways ($k$ possible elements missing), so the net contribution is $k F(n-1,k-1)$. Consequently, we have the recursion: $F(n,k) = kF(n-1,k) + kF(n-1,k-1)$. Since we want to count all possible surjections, we can relate $F$ to $B$ via the formula: $B(m,n) = {n \choose m} F(n,n-m)$. Substituting and simplifying, we obtain a recursion only in terms of $B$’s:

\begin{align}

B(m,n) = \frac{n(n-m)}{m} B(m-1,n-1) + n B(m,n-1)

\end{align}

Let $A(m,n)$ be the probability that we transition from $n$ people to $m$ people in one move. To convert $B(m,n)$ to $A(m,n)$, we divide by the total number of possible assignments. Each person can pick one of the $(n-1)$ other people, so there are $(n-1)^n$ total choices possible. Therefore, $A(m,n) = (n-1)^{-n} B(m,n)$. Substituting into the formulas above, we can obtain a recursion for the $A$’s:

\begin{align}

A(0,0) &= 1,\quad A(0,1) = 0,\quad A(1,1) = 1,\quad A(n,n) = 0,\quad n=2,3,\dots \\

A(0,n+1) &= \tfrac{(n-1)^n}{n^n} A(0,n) + \tfrac{(n-2)^{n-1}}{n^n} A(0,n-1),\quad n=1,2,\dots\\

A(m,n+1) &= \tfrac{(n-1)^n(n+1)}{n^{n+1}} \left( \tfrac{n-m+1}{m} A(m-1,n) + A(m,n) \right),\quad m=1,2,\dots,n

\end{align}Using these formulas, we can compute any of the transition probabilities we like. Here is what the $A(m,n)$ matrix looks like for $0\le m,n \le 6$:

\[

A = \small\begin{bmatrix}

1.0000& 0& 1.0000& 0.2500& 0.1111& 0.0430& 0.0170\\

0& 1.0000& 0& 0.7500& 0.5926& 0.4102& 0.2458\\

0& 0& 0& 0& 0.2963& 0.4688& 0.5069\\

0& 0& 0& 0& 0& 0.0781& 0.2150\\

0& 0& 0& 0& 0& 0& 0.0154\\

0& 0& 0& 0& 0& 0& 0\\

0& 0& 0& 0& 0& 0& 0

\end{bmatrix}

\]So for example, if we had 4 people playing the game, there would be a 29.63% chance 2 people get eliminated, a 59.26% chance 3 people get eliminated, and a 11.11% chance everyone gets eliminated. Note that the columns of this matrix necessarily sum to 1. We can think of the game as a Markov Chain where $A$ matrix is the transition matrix. If the current distribution of the population is $z$ (i.e. $z_k$ is the probability that we have a population of $k$), then the distribution after one round of the game is found via the matrix multiplication $Az$. We want to know what happens after infinitely many rounds have been played. In other words, we want to find the stationary distribution.

The standard way to find a stationary distribution is to solve the eigenvector equation $x^T A = x^T$. Another way is by simply computing $A^k$ for some sufficiently large $k$. Upon doing this, we can plot the limiting distribution as a function of the initial population:

There is a fascinating oscillatory behavior that occurs as the population gets large. Namely, the probability of having a winner vs. having no winner oscillates about 0.5 with a period that increases exponentially. When plotted on a log-scale as above, the stationary distribution appears to converge to a sinusoid. I couldn’t go all the way up to 56,000 but the pattern that emerged was clear, so I extrapolated out what I couldn’t compute. The best-fit curve I found (empirically) for the probability of having an eventual winner for large $n$ is:

\[

x_n \approx 0.503+0.0265 \sin(14.5 \log_{10}(n)-1.27)

\]Substituting for $n=56000$, we find there is a 47.65% chance somebody wins and a 52.35% chance nobody wins.

**Note:** The limiting period can be explained as follows. The probability that one person doesn’t get picked is $(1-\frac{1}{n})^{n-1}$, so by linearity of expectation, we can expect to have $n(1-\frac{1}{n})^{n-1}$ people remaining after one round. In the limit $n\to\infty$, this is equal to $n e^{-1}$. Since the population roughly shrinks by a factor of $e$ at each iteration, we can expect the probability distribution to repeat with a period of $\log_{10}(e)$ on our plot. This explains why the number inside the sinusoid in our approximate formula turned out to be $\frac{2\pi}{\log_{10}(e)} \approx 14.5$.

Aside from the observation above, I couldn’t find a complete analytic characterization of the stationary distribution, so if anybody has ideas I’d love to hear them!

Isn’t it surprising that the pattern doesn’t get washed out in the noise? I just reasoned from the 1/e observation and some initial values, because any major decay of the wave happens early on, before the law of large numbers kicks in, and there are only 10ish rounds to worry about. Great riddler.

Yes it is surprising! My interpretation is that if you look at the distribution of where you’ll end up next move given your current position is n, the mean is n/e, but the variance is small enough that the distribution doesn’t get distorted as n gets large. Agreed — this was a great riddler and the solution I found was not at all what I was expecting to find.

Here’s what I want to know: Why 1/2? The average of the limiting upper and lower bounds as n -> ∞ appears to be very close to 1/2 (although your data and mine both suggest that it’s really about 0.503). Is that just a coincidence?

Great question — I also wondered about that. I suspect the oscillations continue on forever and the amplitude never dies out, but do they eventually become symmetric about 0.5, or will that 0.503 persist?

Great solution. I wanted to show whether the exponentially increasing period is related to the fact that the expected number of remaining participants after one round is equal to n/e.

I went for the brute force monte carlo solution with n=10,000. At each of the 10,000 games, I recorded the number of participants in each round. So in the first game, I recorded 56,000 for the first round, 20,679 for the 2nd round, 7585 for the 3rd, 2832 for the 4th, 1051 for the 5th, 375 for the 6th, 126 for the 7th, 46 for the 8th, 13 for the 9th, 3 for the 10th, and 1 for the 11th round, a successful game with a winner. I captured an overall histogram (with logarithmically sized bins) for the 10,000 games, plotted against the overall probability curve as a function of n, and posted to imgur, below. You can see that the distribution is cyclic corresponding to the subsequent rounds of each game, and repeatedly falls in a trough of the decaying oscillation.

I hope this helps to illuminate the phenomenon underlying the ongoing oscillation.

http://imgur.com/JOsijkE

thanks! fantastic illustration!

As always, I really enjoyed reading through your solution. Very insightful.

I was curious if more progress could be made towards an analytic solution (or computing probabilities for n > 5000) with an explicit formula for the transition matrix. I found the following two formulas:

1) A(m≥0,n≥m+2) = (n!/m!)/(n-1)^n * sum[!i/(i-1)! * {n-1, n-m}_i for i = 2:n-m]

where !n is the subfactorial function and {n,k}_r is an r-restricted Stirling Number of the second kind.

2) A(m≥0,n≥m+2) = (n-m)/(n-1)^n * C(n,m) * sum[(-1)^i * C(n-m-1,i) * (n-m-i)^(m+i-1) * (n-m-i-1)^(n-m-i) for i = 0 to n-m-2]

where C(n,m) is the binomial function.

In the first formula, the !i/(i-1)! term is approximately i*1/e for i > 3 which is consistent with the population decreasing by a factor of e each iteration. The second formula seems more opaque to me but is easier to compute because it avoids calculating the r-restricted Stirling numbers which are resource intensive.

Anyway, do these formulas offer any other insights into the problem or provide a path to an analytic solution?

Hmmm… I’m not sure. Although having a closed form expression for the transition matrix $A$ is nice, ultimately we need an efficient way of computing the top eigenvector of $A$, or an efficient way of computing powers of $A$. The reason this is computationally difficult is because the entries of $A$ get very small.

The reason I couldn’t go beyond 5000 in my solution wasn’t because I couldn’t compute $A$ efficiently. In fact, the recursion I used to compute $A$ was fast enough to take me well beyond 5000. The problem was that the entries of $A$ got so small that they were zero to machine precision.

Does that matter? I would have thought rounding them down to zero would not introduce significant error into the final result. I tried rounding to zero any entry less than 10^(-10) after every multiplication for N=100 (using Mathematica’s Chop command), and got the same final result to ten significant digits.

As you probably know, the Riddler has now posted their answer (quoting your nice graph!). Apparently it’s been proven that the result oscillates forever, but the precise parameters of the oscillation are not known.

As you’ve said, it’s a very surprising result. Amazing that such complexity can be hidden in such a simple algorithm.

A little late but here’s a Python code I wrote to directly estimate the probability that no one wins using Monte Carlo.

On my older machine, running a million simulations with N = 56000 took about 10 hours. This gives an accuracy of about 3 decimal points.

https://gist.github.com/ShiangYong/15c5cf6ddb6c64337d3b3eb791d6e95d