This week’s Fiddler is about hopping back and forth.
You are a frog in a pond with an infinite number of lily pads in a line, marked “1,” “2,” “3,” etc. You are currently on pad 2, and your goal is to make it to pad 1. From any given pad, there are specific probabilities that you’ll jump to another pad: Whenever you are on pad $k$, you will hop to pad $k−1$ with probability $1/k$, and you will hop to pad $k+1$ with probability $(k−1)/k$.
What is the probability that you will ultimately make it to pad 1?
My solution:
The general finite case
We will first solve the most general form of this problem: arbitrary probabilities and an arbitrary starting point! Suppose there are $n$ pads numbered $1,2,\dots,n$. Let $a_k$ be the probability that we will make it to pad 1 starting from pad $k$. Further assume that when we’re at pad $k$, we transition to $k-1$ with probability $p_k$ and to $k+1$ with probability $q_k$. We have $p_k+q_k=1$ so these are the only two possibilities. This leads to the recurrence relation:
\begin{align*}
a_1 &= 1 \\
a_k &= p_k a_{k-1} + q_k a_{k+1}\qquad\text{for }k=2,3,\dots,n-1\\
a_n &= 0
\end{align*} Start by rewriting the recurrence relation using the fact that $p_k+q_k=1$:
\begin{align*}
a_1 &= 1 \\
p_k (a_{k}-a_{k-1}) &= q_k(a_{k+1}-a_k) \qquad \text{for }k=2,3,\dots,n-1\\
a_n &= 0
\end{align*}Iterating the recurrence downward, applying it at index $k-1$, then $k-2$, and so on, we obtain
\begin{align*}
(a_k-a_{k-1}) &= \frac{p_{k-1}}{q_{k-1}}(a_{k-1}-a_{k-2}) \\
&= \frac{p_{k-1}}{q_{k-1}} \cdot \frac{p_{k-2}}{q_{k-2}} (a_{k-2}-a_{k-3}) \\
&\;\;\vdots \\
&= (a_2-a_1)\prod_{i=2}^{k-1} \frac{p_{i}}{q_{i}}
\end{align*}Now sum from $k=2$ to $k=n$ and the left-hand side will telescope.
\[
a_n-a_1 = \sum_{k=2}^n (a_k-a_{k-1}) = (a_2-a_1)\sum_{k=2}^n\prod_{i=2}^{k-1} \frac{p_{i}}{q_{i}}
\]Letting $a_1=1$ and $a_n=0$, we can solve for $a_2$ and obtain:
\[
a_2 = 1-\frac{1}{\sum_{k=2}^n \prod_{i=2}^{k-1} \frac{p_i}{q_i}}
\]Now instead of summing up to $k=n$ in the telescoping sum, sum to $k=m$ to find a general term, and obtain:
\[
a_m-a_1 = (a_2-a_1)\sum_{k=2}^m\prod_{i=2}^{k-1} \frac{p_{i}}{q_{i}}
\]Substituting the values we found for $a_1$ and $a_2$, we obtain the following general formula for any $a_m$:
$\displaystyle
a_m = 1-\frac{\sum_{k=2}^m \prod_{i=2}^{k-1} \frac{p_i}{q_i}}{\sum_{k=2}^n \prod_{i=2}^{k-1} \frac{p_i}{q_i}},\qquad \text{for }m=1,2,\dots,n
$
The formula even works at the boundaries. When $m=1$, the numerator is zero (empty sum), and we recover $a_1=1$. When $m=n$, the numerator matches the denominator and we recover $a_n=0$.
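As a quick sanity check, here is a minimal Python sketch of this formula. The function name `hit_probability` and the dictionary-of-probabilities interface are my own choices, not anything from the puzzle itself.

```python
from math import prod

def hit_probability(m, n, p):
    """Probability of reaching pad 1 from pad m when there are n pads and the
    downward-hop probability from pad k is p[k] (so the upward one is 1 - p[k])."""
    # ratio(k) = product over i = 2, ..., k-1 of p_i / q_i (empty product = 1)
    def ratio(k):
        return prod(p[i] / (1 - p[i]) for i in range(2, k))
    numerator = sum(ratio(k) for k in range(2, m + 1))
    denominator = sum(ratio(k) for k in range(2, n + 1))
    return 1 - numerator / denominator

# The puzzle's probabilities p_k = 1/k, with n = 4 pads:
p = {k: 1 / k for k in range(2, 4)}
print(hit_probability(2, 4, p))   # 0.6, matching Example 1 below
```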
This problem can also be viewed as that of finding the stationary distribution of a Markov chain. However, that approach leads to a formula involving the inverse of an $n\times n$ matrix, which is more difficult to simplify; this is why I opted for the approach above.
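To spell out that remark, here is a sketch of the linear-algebra route for the finite case: treat pads 1 and $n$ as boundary values, solve the $(n-2)\times(n-2)$ interior system numerically, and compare with the closed form. The numbers in the comments assume the frog's probabilities with $n=6$.

```python
import numpy as np

def hit_probability_linear_system(n, p):
    """Solve a_k = p_k a_{k-1} + (1 - p_k) a_{k+1} for k = 2, ..., n-1
    with boundary values a_1 = 1 and a_n = 0, via a dense linear solve."""
    size = n - 2                       # unknowns a_2, ..., a_{n-1}
    A = np.eye(size)
    b = np.zeros(size)
    for k in range(2, n):
        i = k - 2                      # row/column index of a_k
        if k - 1 >= 2:
            A[i, i - 1] = -p[k]        # coefficient of a_{k-1}
        else:
            b[i] = p[k]                # a_1 = 1 moves to the right-hand side
        if k + 1 <= n - 1:
            A[i, i + 1] = -(1 - p[k])  # coefficient of a_{k+1}; a_n = 0 drops out
    return np.linalg.solve(A, b)       # array [a_2, ..., a_{n-1}]

n = 6
p = {k: 1 / k for k in range(2, n)}
print(hit_probability_linear_system(n, p)[0])   # approx. 0.63077
# ... which matches the closed form: 1 - 1/(1 + 1 + 1/2 + 1/6 + 1/24) = 41/65
```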
The general infinite case
We now consider the case where there are infinitely many lily pads. The recurrence relation now looks like:
\begin{align*}
a_1 &= 1 \\
a_k &= p_k a_{k-1} + q_k a_{k+1}\qquad\text{for }k=2,3,\dots
\end{align*}Such recurrence relations describe what are known as one-dimensional random walks.
This case is a bit trickier than the finite case. We are dealing with a second-order difference equation, so two boundary conditions are required to determine a unique solution. This was no problem in the finite case we previously solved, because we had the boundary conditions $a_1=1$ and $a_n=0$. But in this infinite case, we only have the condition $a_1=1$. This means there will be infinitely many possible solutions, each with a different value at infinity, $a_\infty = \lim_{k\to\infty}a_k$. In order to have a unique solution, we must impose the correct “boundary condition at infinity”.
To illustrate this fact, observe that we can satisfy the recurrence relation by setting $a_k=1$ for all $k$. This scenario has the boundary condition $a_\infty=1$ and is known as gambler’s ruin (the pads are viewed as amounts of money, and each turn we bet and either gain more money or lose money). It’s called “gambler’s ruin” because we always eventually go broke (return to pad 1). One example of gambler’s ruin is the case $p_k=q_k=\frac{1}{2}$; i.e., we flip a fair coin at every pad to see if we move forward or backward.
In cases where we are sufficiently likelier to hop to a larger pad than to a smaller one (precisely, when the series $\sum_{k=2}^\infty \prod_{i=2}^{k-1} \frac{p_i}{q_i}$ that appears below converges), there is a non-zero probability that we will never return to pad 1. This case is characterized by the property that the farther we get from pad 1, the less likely we are to return. In other words, our boundary condition at infinity should be $a_\infty = 0$.
To solve this case, we can use the exact same approach as in the finite case, except instead of summing up to $k=n$ and using $a_n=0$ to find $a_2$, we sum up to $k=\infty$ and use $a_\infty=0$ to find $a_2$. This results in the similar-looking formula:
$\displaystyle
a_m = 1-\frac{\sum_{k=2}^m \prod_{i=2}^{k-1} \frac{p_i}{q_i}}{\sum_{k=2}^{\infty} \prod_{i=2}^{k-1} \frac{p_i}{q_i}},\qquad \text{for }m=1,2,\dots
$
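Numerically, the infinite sum can be handled by truncating once the product terms become negligible. The sketch below assumes the series converges (if it diverges, the truncated answer simply drifts toward the gambler's-ruin value of 1); the function name and tolerances are my own.

```python
def hit_probability_infinite(m, p, tol=1e-15, max_terms=10_000):
    """Probability of reaching pad 1 from pad m on an infinite pond.
    p(k) gives the downward-hop probability from pad k; assumes the series
    sum_k prod_{i=2}^{k-1} p_i/q_i converges (otherwise the answer is 1)."""
    term = 1.0            # the k = 2 term (empty product)
    numerator = 0.0       # sum of terms for k = 2, ..., m
    total = 0.0           # sum of terms for k = 2, 3, ...
    k = 2
    while term > tol and k <= max_terms:
        if k <= m:
            numerator += term
        total += term
        term *= p(k) / (1 - p(k))   # extend the product to i = k
        k += 1
    return 1 - numerator / total

# The puzzle's probabilities: p_k = 1/k
print(hit_probability_infinite(2, lambda k: 1 / k))   # approx. 0.63212 = (e - 1)/e
```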
Note: To obtain the set of all possible solutions to the recurrence relation, we need only take affine combinations of the two fundamental solutions (the all-ones gambler’s ruin solution and the solution above). In other words, the general solution is given by:
\[
\begin{bmatrix}
a_1^\text{gen} \\ a_2^\text{gen} \\ \vdots
\end{bmatrix}
=
\alpha
\begin{bmatrix}
1 \\ 1 \\ \vdots
\end{bmatrix}
+ (1-\alpha)
\begin{bmatrix}
a_1 \\ a_2 \\ \vdots
\end{bmatrix}
\]where $a_1,a_2,\dots$ is the solution we derived above. This general solution satisfies $\lim_{k\to\infty}a_k = \alpha$, so based on how we choose $\alpha\in\mathbb{R}$, we can achieve any desired boundary condition at infinity.
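Here is a quick numerical spot-check of this claim, using the specific solution derived in the next section for the frog's probabilities (so the exact formula in the code is a forward reference, and the number of pads checked and the value of $\alpha$ are arbitrary choices of mine):

```python
import numpy as np
from math import e, factorial

alpha = 0.3          # any real alpha should work
K = 25               # number of pads to check

# a_m = 1 - (1/e) * sum_{k=0}^{m-2} 1/k!  for the frog's probabilities (derived below)
a = np.array([1 - sum(1 / factorial(j) for j in range(m - 1)) / e for m in range(1, K + 2)])
gen = alpha * np.ones_like(a) + (1 - alpha) * a            # affine combination

# Residual of gen_k - (1/k) gen_{k-1} - ((k-1)/k) gen_{k+1} for k = 2, ..., K
residual = max(abs(gen[k - 1] - gen[k - 2] / k - (k - 1) / k * gen[k]) for k in range(2, K + 1))
print(residual)      # ~1e-16: the affine combination still satisfies the recurrence
```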
Solving our special case
In our special case of interest, we have $p_i=\frac{1}{i}$ and $q_i=\frac{i-1}{i}$. Substituting into our formula, we obtain:
\begin{align*}
a_m &= 1-\frac{\sum_{k=2}^m \prod_{i=2}^{k-1} \frac{1}{i-1}}{\sum_{k=2}^n \prod_{i=2}^{k-1} \frac{1}{i-1}}
= 1-\frac{\sum_{k=2}^m \frac{1}{(k-2)!}}{\sum_{k=2}^n \frac{1}{(k-2)!}}
= 1-\frac{\sum_{k=0}^{m-2} \frac{1}{k!}}{\sum_{k=0}^{n-2} \frac{1}{k!}}
\end{align*}We can also write this as a single fraction:
\[
a_m = \frac{\frac{1}{(m-1)!} + \cdots + \frac{1}{(n-2)!}}{\frac{1}{0!}+\frac{1}{1!}+\frac{1}{2!}+\cdots+\frac{1}{(n-2)!}}\qquad\text{for }m=1,2,\dots,n
\]
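In code, the special-case formula boils down to partial sums of the exponential series; a minimal sketch (the function name is mine):

```python
from math import factorial

def frog_probability(m, n):
    """Probability of reaching pad 1 from pad m with n pads, when p_k = 1/k."""
    numerator = sum(1 / factorial(j) for j in range(m - 1, n - 1))   # 1/(m-1)! + ... + 1/(n-2)!
    denominator = sum(1 / factorial(j) for j in range(n - 1))        # 1/0! + ... + 1/(n-2)!
    return numerator / denominator

print(frog_probability(2, 4))    # 0.6, as in Example 1 below
```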
Example 1: If we have $n=4$ pads and we start on pad $m=2$, then the probability we eventually end up on pad 1 is given by:
\[
a_2 = \frac{\frac{1}{1!}+\frac{1}{2!}}{\frac{1}{0!}+\frac{1}{1!}+\frac{1}{2!}}
=\frac{3}{5} = 60\%.
\]
Example 2: If we have $n=\infty$, we recognize the sum in the denominator as the infinite series representation of Euler’s number $e$, and the sum in the numerator is one less, since it is missing only the leading $\frac{1}{0!}=1$ term. Therefore,
\[
a_2 = \frac{\frac{1}{1!}+\frac{1}{2!}+\cdots}{\frac{1}{0!}+\frac{1}{1!}+\frac{1}{2!}+\cdots}
=\frac{e-1}{e} \approx 63.212\%.
\]
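A quick Monte Carlo simulation of the hopping frog is consistent with this value. This is only a sketch: the trial count and the cutoff for declaring the frog “escaped” are arbitrary choices of mine (the probability of returning from pad 100 is astronomically small, so the cutoff introduces no visible bias).

```python
import random

def simulate(trials=100_000, start=2, escape_pad=100):
    """Estimate the probability of reaching pad 1 from `start` when p_k = 1/k,
    treating the frog as gone for good once it passes `escape_pad`."""
    successes = 0
    for _ in range(trials):
        k = start
        while 1 < k < escape_pad:
            k += -1 if random.random() < 1 / k else 1
        successes += (k == 1)
    return successes / trials

print(simulate())   # approx. 0.632, consistent with (e - 1)/e
```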
If instead of starting on pad 2 we start on a larger-numbered pad, the numerator loses more and more of its leading terms, so the probability of ever reaching pad 1 decays rapidly toward zero the farther out we start.
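Concretely, the closed form $a_m = 1 - \frac{1}{e}\sum_{k=0}^{m-2}\frac{1}{k!}$ for the infinite pond decays super-exponentially in the starting pad $m$; a few values, computed with a short script:

```python
from math import e, factorial

# a_m = 1 - (1/e) * sum_{k=0}^{m-2} 1/k!  for the infinite pond
for m in (2, 3, 5, 10):
    a_m = 1 - sum(1 / factorial(k) for k in range(m - 1)) / e
    print(m, round(a_m, 7))
# 2 0.6321206,  3 0.2642411,  5 0.0189882,  10 1.1e-06
```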
Beautiful analysis! I got as far as the recursion relation, but not to your elegant solution of it.
I am confused, though, by your comment on the Markov chain. It seems to me that your analysis is completely equivalent to finding the stationary distribution. Your a_k’s are the components of the right-eigenvector with eigenvalue 1 of the Markov transition matrix (there are two of these right-eigenvectors, your two ‘fundamental solutions’). The corresponding left-eigenvectors span the possible stationary distributions, and inner products with the initial state determine the correct one. No matrix inversion is needed. Of course the final result is the same as yours.