This Riddler is about microorganisms multiplying. Will they thrive or will the species go extinct?
At the beginning of time, there is a single microorganism. Each day, every member of this species either splits into two copies of itself or dies. If the probability of multiplication is p, what are the chances that this species goes extinct?
Here is my solution:
[Show Solution]
Here is a short and sweet solution to the problem. Let’s call $q$ the probability that the species eventually goes extinct. If a microorganism fails to multiply (probability $1-p$), then it goes extinct immediately. If it multiplies, we now have two microorganisms, each with probability $q$ of eventually going extinct. The probability that they both eventually go extinct is $q^2$ because they behave independently. Therefore, we can write the recursion:
\[
q = p\, q^2+(1-p)
\]Rearranging gives $p\,q^2 - q + (1-p) = 0$, which factors as $(q-1)\left(p\,q-(1-p)\right)=0$, so there are two possible solutions: $q \in \{1, \tfrac{1}{p}-1\}$. Since probabilities cannot exceed $1$, the second solution only makes sense when $p \ge \tfrac{1}{2}$. Therefore, the solution is:
\[
q = \begin{cases}
1 &\text{if }p < \tfrac{1}{2} \\
\tfrac{1}{p}-1 &\text{if }p \ge \tfrac{1}{2}
\end{cases}
\]Here is a plot showing what the probability of extinction looks like as a function of $p$, the probability of multiplication:

This is an interesting result because of the phase transition that occurs at $p=\tfrac{1}{2}$. If the probability of multiplication is less than one half, the species will always eventually go extinct. When the probability of multiplication is greater than one half, there is a nonzero chance that the population will grow enough that the species becomes “too big to fail”.
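We can sanity-check this result numerically. Here is a small Monte Carlo sketch (my own illustration, not part of the original puzzle); it caps the population at a threshold and treats a capped run as surviving, so the estimates are approximate, and the function names and cutoffs are arbitrary choices.

```python
import random

def goes_extinct(p, rng, max_pop=500, max_days=200):
    """Simulate one lineage; return True if it dies out.

    A population that reaches max_pop (or is still alive after
    max_days) is treated as surviving -- "too big to fail".
    """
    pop = 1
    for _ in range(max_days):
        if pop == 0:
            return True
        if pop >= max_pop:
            return False
        # Each organism independently splits into two (prob p) or dies.
        pop = 2 * sum(1 for _ in range(pop) if rng.random() < p)
    return pop == 0

def extinction_probability(p, trials=10000, seed=0):
    """Monte Carlo estimate of the extinction probability q."""
    rng = random.Random(seed)
    return sum(goes_extinct(p, rng) for _ in range(trials)) / trials

# For p = 3/4 the formula predicts q = 1/p - 1 = 1/3;
# for p = 1/4 it predicts certain extinction, q = 1.
print(extinction_probability(0.75))  # close to 1/3
print(extinction_probability(0.25))  # close to 1
```

With $10{,}000$ trials the estimates land within about a percent of the formula's predictions on either side of the phase transition.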
Here is a more technical (and more correct!) solution adapted from a comment by Bojan.
[Show Solution]
The previous approach obtains the correct answer, but it leaves open the question of which solution to the quadratic equation to pick. This alternative approach resolves the ambiguity using generating functions. Define the probability distribution:
\[
P_n(i) := (\text{probability there are }i\text{ microorganisms on day }n)
\]Now define the sequence of polynomials:
\[
Q_n(x) := \sum_{i=0}^\infty P_n(i)\, x^i
\]Then $Q_1(x) = x$, since there is one organism on day one. We can derive a recursive relation for the multiplication/extinction process by thinking carefully about how the population can change from one day to the next. It’s not difficult to see that after the first timestep, the population must always be even. If the population is $2i$ at time $n+1$ and the population was $j$ at time $n$, then $j \ge i$. Transitioning from $j$ to $2i$ requires that exactly $i$ microorganisms multiply and the remaining $j-i$ die. This can happen in ${j\choose i}$ ways, each with probability $p^i(1-p)^{j-i}$ (it’s a binomial distribution). Therefore, we have the recursion:
\[
P_{n+1}(2i) = \sum_{j=i}^\infty {j\choose i} p^i (1-p)^{j-i} P_n(j)
\]We can now use this fact to find a recursion for our generating function $Q_n$. Again, we’ll make use of the fact that when $n>1$, there can only be an even number of microorganisms:
\begin{align}
Q_{n+1}(x)
&= \sum_{i=0}^\infty P_{n+1}(i)\, x^i\\
&= \sum_{i=0}^\infty P_{n+1}(2i)\, x^{2i}\\
&= \sum_{i=0}^\infty \sum_{j=i}^\infty {j\choose i} p^i (1-p)^{j-i} P_n(j)\, x^{2i} \\
&= \sum_{j=0}^\infty \sum_{i=0}^j {j\choose i} p^i (1-p)^{j-i} P_n(j)\, x^{2i} \\
&= \sum_{j=0}^\infty P_n(j) \sum_{i=0}^j {j\choose i} (px^2)^i (1-p)^{j-i} \\
&= \sum_{j=0}^\infty P_n(j) \left(px^2+(1-p)\right)^j \\
&= Q_n(px^2+(1-p))
\end{align}A couple of notes: in the fourth step, we interchanged the order of summation over $i$ and $j$, and in the sixth step we used the binomial theorem. Summarizing, we conclude that
\[
Q_{n+1}(x) = Q_n(px^2+(1-p))
\]The probability of extinction after $n$ timesteps is $P_n(0)$ (the probability that the population is zero). We’re interested in the limit of $P_n(0)$ as $n$ goes to infinity. This is the constant term in our sequence of polynomials, so we want to find $\lim_{n\to\infty} Q_n(0)$. Let’s take a look at the first few terms:
\begin{align}
Q_1(x) &= x\\
Q_2(x) &= px^2 + (1-p)\\
Q_3(x) &= p\left(px^2 + (1-p)\right)^2 + (1-p)\\
Q_4(x) &= p\left(p\left(px^2 + (1-p)\right)^2 + (1-p)\right)^2 + (1-p)\\
\dots
\end{align}Evaluating at zero, we get:
\begin{align}
Q_1(0) &= 0\\
Q_2(0) &= 1-p\\
Q_3(0) &= p(1-p)^2 + (1-p)\\
Q_4(0) &= p\left( p(1-p)^2 + (1-p)\right)^2 + (1-p)\\
\dots
\end{align}The pattern is clear (we could also prove it using induction). If we define $q_n := Q_n(0)$, the iterates satisfy the recursion:
\[
q_{n+1} = p\, q_n^2 + (1-p)\qquad\text{with: }q_1 = 0
\]Any limit $q_n \to q$ of this recursive sequence must satisfy the fixed-point equation $q = p\,q^2+(1-p)$. This is precisely what we found in our first solution to the problem! So what’s the big deal? Why did we go through all this trouble to arrive at the same answer? This recursion actually tells us something more than just the limiting case. It tells us how we approach the limiting case. In other words, it gives us the dynamics.
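As a quick consistency check (my own, not part of the original write-up), we can iterate the binomial transition exactly with rational arithmetic and confirm that the probability mass at zero matches the scalar recursion $q_{n+1} = p\,q_n^2 + (1-p)$. The helper names below are arbitrary.

```python
from fractions import Fraction
from math import comb

def step(dist, p):
    """One day of the process. dist maps population size -> probability.
    From population j, exactly i organisms multiply (the rest die)
    with probability C(j, i) p^i (1-p)^(j-i), giving population 2i."""
    new = {}
    for j, pr in dist.items():
        for i in range(j + 1):
            w = pr * comb(j, i) * p**i * (1 - p)**(j - i)
            new[2 * i] = new.get(2 * i, Fraction(0)) + w
    return new

p = Fraction(3, 4)
dist = {1: Fraction(1)}   # day 1: a single organism
q = Fraction(0)           # q_1 = 0
for _ in range(3):
    dist = step(dist, p)
    q = p * q**2 + (1 - p)

# Both routes give the same exact day-4 extinction probability.
print(dist[0], q)  # 5179/16384 5179/16384
```

The full distribution also sums to $1$ at every step, as it should.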
Let’s take a closer look at these dynamics in the neighborhood of the two fixed points $q=1$ and $q=\tfrac{1}{p}-1$. If we let $\delta_n := q_n-1$, we can rewrite the recursion as:
\[
\delta_{n+1} = 2p\, \delta_n + p\, \delta_n^2
\]Of course, if $\delta_n=0$ (i.e. $q_n=1$), then it will remain at $1$ forever. But what if we are merely close to this fixed point? Then $\delta_n \approx 0$ and we have $\delta_{n+1} \approx 2p\,\delta_n$. Notice that when $p > \tfrac{1}{2}$, if $\delta_n$ is close to zero but not exactly zero, it gets pushed farther away because $2p > 1$. These dynamics are unstable.
Now take a look at the other fixed point. If we let $\epsilon_n := q_n-\tfrac{1}{p}+1$, then we can rewrite the recursion as:
\[
\epsilon_{n+1} = 2(1-p)\epsilon_n + p\,\epsilon_n^2
\]If we are close to this fixed point, then $\epsilon_n \approx 0$ and we have $\epsilon_{n+1} \approx 2(1-p)\epsilon_n$. This time, when $p > \tfrac{1}{2}$, we have $2(1-p) < 1$. In other words, if we are close, we continue to get closer! These dynamics are stable.
This stability argument explains why we converge to $\tfrac{1}{p}-1$ rather than $1$ in the case where $p>\tfrac{1}{2}$. If we consider the other case, where $p<\tfrac{1}{2}$, the stability flips. The first fixed point becomes stable and the second one becomes unstable. This transition in stability of the dynamics explains the abrupt transition in the plot at the point $p=\tfrac{1}{2}$.
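The stability argument is easy to see numerically. The sketch below (my own illustration, not part of the original solution) iterates the recursion with $p=0.75$, starting slightly off each fixed point.

```python
def iterate(q, p, n):
    """Iterate q <- p*q^2 + (1-p) for n steps."""
    for _ in range(n):
        q = p * q**2 + (1 - p)
    return q

p = 0.75
stable = 1 / p - 1  # 1/3: the stable fixed point when p > 1/2

# Starting just below the unstable fixed point q = 1, the iterates
# drift away from 1 and settle at 1/3 instead.
print(iterate(0.999, p, 50))

# Starting near 1/3, the iterates are pulled back in (rate ~ 2(1-p) = 0.5).
print(iterate(stable + 0.05, p, 50))

# Starting just above 1, the iterates run away entirely.
print(iterate(1.001, p, 20))
```

Only starting points at exactly $q=1$ stay there; any perturbation within $[0,1)$ flows to $\tfrac{1}{p}-1$, matching the stability analysis above.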