This week’s Riddler Classic is a probability question inspired by the ongoing World Cup.
The Riddler Football Playoff (RFP) consists of four teams. Each team is assigned a random real number between 0 and 1, representing the “quality” of the team. If team $A$ has quality $a$ and team $B$ has quality $b$, then the probability that team $A$ will defeat team $B$ in a game is $\frac{a}{a+b}$.
In the semifinal games of the playoff, the team with the highest quality (the “1 seed”) plays the team with the lowest quality (the “4 seed”), while the other two teams play each other as well. The two teams that win their respective semifinal games then play each other in the final.
On average, what is the quality of the RFP champion?
My solution:
[Show Solution]
Suppose $a \gt b \gt c \gt d$. Consider the event that $A$ is the champion. In order for this to happen, $A$ must first defeat $D$ in the semifinals. Then, either $B$ wins their semifinal match and $A$ defeats $B$ in the finals, or $C$ wins their semifinal match and $A$ defeats $C$ in the finals. We can use a similar argument to compute the probability that $B$, $C$, or $D$ wins, and we obtain the following formulas:
\begin{align}
P_A &= \frac{a}{a+d}\left(\frac{b}{b+c}\cdot\frac{a}{a+b}+\frac{c}{b+c}\cdot\frac{a}{a+c}\right) \\
P_B &= \frac{b}{b+c}\left(\frac{a}{a+d}\cdot\frac{b}{a+b}+\frac{d}{a+d}\cdot\frac{b}{b+d}\right) \\
P_C &= \frac{c}{b+c}\left(\frac{a}{a+d}\cdot\frac{c}{a+c}+\frac{d}{a+d}\cdot\frac{c}{c+d}\right) \\
P_D &= \frac{d}{a+d}\left(\frac{b}{b+c}\cdot\frac{d}{b+d}+\frac{c}{b+c}\cdot\frac{d}{c+d}\right)
\end{align}One can check that $P_A+P_B+P_C+P_D=1$ for all values of $(a,b,c,d)$, since exactly one of these four outcomes must occur. The expected value of the quality of the champion is therefore
\[
x(a,b,c,d) = a P_A + b P_B + c P_C + d P_D.
\]Ultimately, we seek the average of $x(a,b,c,d)$ where each $a,b,c,d$ is chosen uniformly at random in $[0,1]$. The formulas above require $a\gt b \gt c \gt d$, but by symmetry each of the $4!=24$ permutations of $a,b,c,d$ is equally likely to occur. Therefore, we can restrict our averaging to the case where $a\gt b \gt c \gt d$ and multiply the result by $24$. This leads to the following formula for $\bar x$, the average value of $x$.
\[
\bar x = 24 \int_{a=0}^1 \int_{b=0}^a \int_{c=0}^b \int_{d=0}^c x(a,b,c,d)\,
\mathrm{d}d\,\mathrm{d}c\,\mathrm{d}b\,\mathrm{d}a
\]This integral is very difficult to evaluate by hand, so I used Mathematica, and found the result to be:
\begin{align}
\bar x
&= \frac{2}{5} \left(23-(29+2\pi^2)\log(2)+39\log^2(2)-8\log^3(2)-3\zeta(3)\right)\\
&\approx 0.6735417813867959221089
\end{align}Here, $\zeta$ is the Riemann zeta function. The constant $\zeta(3)=\sum_{n=1}^\infty\frac{1}{n^3}\approx 1.2020569$ is known as Apéry’s constant. The reason this constant shows up is because we are repeatedly integrating products of rational functions, and it turns out that
\[
\zeta(3) = \int_0^1\int_0^1\int_0^1\frac{1}{1-xyz}\,\mathrm{d}x\,\mathrm{d}y\,\mathrm{d}z
\]Fun fact: it is known that $\zeta(3)$ is irrational, but it is still not known whether it is transcendental!
Numerical verification
I also obtained an estimate for $\bar x$ by performing a Monte Carlo simulation of 10 million games of Riddler Football Playoff in Matlab. Here is my code:
N = 1e7;
quality = zeros(N,1);
for trial = 1:N
x = sort(rand(4,1));
a = x(4); b = x(3); c = x(2); d = x(1);
% F = winner of Semifinal 1 (A vs D)
if rand < a/(a+d)
f = a;
else
f = d;
end
% G = winner of Semifinal 2 (B vs C)
if rand < b/(b+c)
g = b;
else
g = c;
end
% X = winner of Finals (F vs G)
if rand < f/(f+g)
quality(trial) = f;
else
quality(trial) = g;
end
end
avg = mean(quality)
sem = std(quality)/sqrt(N);
conf_interval = [avg-1.96*sem,avg+1.96*sem]
Running the code above took about 10 seconds and produced an estimate of $0.67353$, with a 95% confidence interval $[0.67339, 0.67367]$. So it looks like there is good agreement between the numerical value derived from the exact formula and the empirical Monte Carlo estimate.