This week’s Riddler Classic is a strategy game about maximizing payout. What is the optimal strategy?

You start with just the number 1 written on a slip of paper in a hat. Initially, there are no other slips of paper in the hat. You will draw from the hat 100 times, and each time you draw, you have a choice: If the number on the slip of paper you draw is k, then you can either receive k dollars or add k higher numbers to the hat.

For example, if the hat were to contain slips with the numbers 1 through 6 and you drew a 4, you could either receive $4 or receive no money but add four more slips numbered 7, 8, 9 and 10 into the hat. In either case, the slip with the number 4 would then be returned to the hat.

If you play this game perfectly — that is, to maximize the total amount of money you’ll receive after all 100 rounds — how much money would you expect to receive on average?

My solution:

[Show Solution]

When we are playing the game, there are two relevant variables: (1) the number of slips of paper in the hat, and (2) the number of draws remaining. We will therefore define the following quantity:

\[

w(n,t) = \left\{

\begin{array}{l}

\text{Amount we can expect to win from this point} \\

\text{forward, if there are }n\text{ slips of paper in the hat} \\

\text{and we have }t\text{ draws remaining, assuming} \\

\text{optimal play for the rest of the game.}

\end{array}

\right\}

\]The problem is asking us to calculate $w(1,100)$. We will do this by calculating in order: $w(n,0)$, $w(n,1)$, and so on, using a process called

dynamic programming. The idea is that the optimal strategy when there are $t+1$ draws remaining depends on the optimal strategy when there are $t$ draws remaining. So if we work our way backwards from $t=0$, we can derive the strategy for any $t$.

- When $t=0$, the game is over, so we have $w(n,0)=0$ for all $n$.
- When $t=1$, we are on our last turn, so there is no benefit to adding more slips of paper to the hat. In this case, we should always take the money. How much money will this be, on average? There are $n$ slips of paper, so we can expect to win $w(n,1) = \frac{1}{n}\left(1+2+\cdots+n\right) = \frac{n+1}{2}$. Therefore,

\[

w(n,1) = \frac{n+1}{2}.

\]
- When $t=2$, we have to think more carefully. Suppose we draw the slip numbered $k$ from the hat. If we take the money, then we will pocket $k$ dollars and then earn on average $w(n,1)$ next turn. If instead we add more slips of paper to the hat, we will pocket nothing, but earn on average $w(n+k,1)$ next turn. Comparing these two quantities:

\[

k + w(n,1) = k + \frac{n+1}{2}\qquad\text{is larger than}\qquad

w(n+k,1) = \frac{n+k+1}{2}

\]Therefore, it is always in our best interest to take the money when there are two turns remaining. When we do this, we will, on average, earn

\[

w(n,2) = \frac{1}{n}\sum_{k=1}^n \left( k + \frac{n+1}{2} \right) = n+1.

\]
- The step $t=3$ is similar to $t=2$. We need to decide whether $k+w(n,2)$ is larger or smaller than $w(n+k,2)$. In this case, we have

\[

k+w(n,2) = n+k+1\qquad\text{is equal to}\qquad w(n+k,2) = n+k+1

\]So both choices earn the same on average. In either case, we obtain

\[

w(n,3) = \frac{1}{n}\sum_{k=1}^n \left( n+k+1 \right) = \frac{3}{2}(n+1)

\]
- When $t = 4$, it is always best to add more slips of paper. This is because

\[

k+w(n,3) = k+\frac{3}{2}(n+1)\qquad\text{is smaller than}\qquad w(n+k,3)=\frac{3}{2}(n+k+1)

\]And when choosing the larger one, we obtain

\[

w(n,4) = \frac{1}{n}\sum_{k=1}^n \frac{3}{2}(n+k+1) = \frac{9}{4}(n+1).

\]

It turns out we can continue in this fashion and prove that whenever $n\geq 4$, it is always better to add more slips of paper. We can show this using induction. Based on the pattern we’ve observed so far, we observe that $w(n,t) = \left(\frac{3}{2}\right)^{t-2}(n+1)$ for $t=2,3,4$. To show that the pattern continues for larger $t$, suppose it holds for a given $t\geq 3$. To compute $t+1$, we notice that

\[

k+w(n,t) = k+\left(\frac{3}{2}\right)^{t-2}(n+1)\quad\text{is smaller than}\quad w(n+k,t) = \left(\frac{3}{2}\right)^{t-2}(n+k+1)

\]Therefore, we should add more slips of paper, and we obtain

\begin{align}

w(n,t+1) &= \frac{1}{n}\sum_{k=1}^n \left(\frac{3}{2}\right)^{t-2}(n+k+1) \\

&= \frac{1}{n}\left(\frac{3}{2}\right)^{t-2}\sum_{k=1}^n (n+k+1) \\

&= \left(\frac{3}{2}\right)^{t-1}(n+1)

\end{align}which verifies that our pattern continues for $t+1$, and by induction, it continues for all larger $t$. We therefore have the formula:

$\displaystyle

w(n,t) = \begin{cases}

\left(\frac{3}{2}\right)^{t-2}(n+1) & \text{for }t\geq 2 \\

\frac{n+1}{2} & \text{for }t=1

\end{cases}

$

Returning to the original question, we can compute $w(1,100)$:

\[

w(1,100) = \left(\frac{3}{2}\right)^{98}(1+1) = \frac{3^{98}}{2^{97}}

\]So if we play optimally, we can expect to win approximately \$361,387,713,364,635,766.57, or about 361 quadrillion dollars!

Of course, this is only an *average*. In general, there will be a very large variability in how much we actually win! Here is a plot of the distribution of winnings (one million trials) along with the average shown as a vertical line. The distribution is heavily skewed, so I plotted the winnings on a log scale. This skewness is manifested in the fact that although the average is about $3.6\times 10^{17}$, the median is only $5.3 \times 10^{16}$. So on average we win a lot, but *most of the time* we win about 7 times less.

You can make a pretty good estimate of the width and central value of the probability distribution in log(n) which you find.

Note that, except for the first few draws, n is generally large. In the approximation that it’s large, the k-value which is drawn varies continuously from 0 to n, or (k/n) ranges from 0 to 1.

The increase in ln(n) is ln(1+k/n). The mean and variance are: (LaTeX notation)

$$

\langle \Delta \ln(n) \rangle = \int_1^2 \ln(y) dy = 2 \ln(2) – 1

$$

and

$$

\Delta \sigma^2 = \int_1^2 ( \ln(y) + 1 – 2 \ln(2) )^2 dy

= 1 – 2 (\ln(2))^2

$$

which are approximately .3863 and .0391 respectively.

The distribution of $\ln(n)$ is approximately Gaussian by the central limit theorem,

with a squared width of about 98*.0391.

Actually one should be a little more careful —

the final two draws have a fairly large variance as well, which should be computed and included separately.