<p>Let’s call Player A’s number $x \in [0,1]$ and Player B’s number $y \in [0,1]$. We’ll assume a general mixed strategy for each player and compute each player’s best response. This approach is similar to the one I used in the war game puzzle, but the solution is more complicated this time.</p>
<p>For this solution, I’ll use notation and conventions similar to those in my solution to baby poker (the discrete version of toddler poker). Define the players’ strategies as follows:</p>
<ul>
<li>$p(x)$: probability that Player A will <em>raise</em> if their number is $x$.</li>
<li>$q(y)$: probability that Player B will <em>fold</em> if their number is $y$.</li>
</ul>
<p>Let’s call $E(x,y)$ the payoff for Player A when both numbers are revealed:
\[
E(x,y) = \begin{cases}
1&\text{if }x > y \\
-1&\text{if }x < y
\end{cases}
\]We don’t consider the case $x=y$ because it occurs with probability zero. If we let $W(x,y)$ be the winnings for Player A, we can compute this quantity as we did for the discrete problem:
\[
W(x,y) = (1-p(x))E(x,y) + p(x)\bigl( k(1-q(y))E(x,y) + q(y) \bigr)
\]Player A’s expected winnings, averaged over all random numbers $x,y$, are then simply the integral $\bar W = \int_0^1\int_0^1 W(x,y)\,dx\,dy$.</p>
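<p>As a rough sanity check on the definitions above, here is a minimal Monte Carlo sketch of $\bar W$ for arbitrary strategies. The function names (<code>payoff</code>, <code>winnings</code>, <code>estimate_mean_winnings</code>), the sample count, and the seed are my own illustrative choices, not part of the original solution.</p>

```python
import random

def payoff(x, y):
    """E(x, y): +1 if Player A's number is higher, -1 otherwise."""
    return 1.0 if x > y else -1.0

def winnings(x, y, p, q, k):
    """W(x, y) for raise probability p(x), fold probability q(y), raise size k."""
    e = payoff(x, y)
    return (1 - p(x)) * e + p(x) * (k * (1 - q(y)) * e + q(y))

def estimate_mean_winnings(p, q, k, samples=200_000, seed=0):
    """Monte Carlo estimate of the double integral of W over the unit square."""
    rng = random.Random(seed)  # fixed seed so the estimate is reproducible
    total = 0.0
    for _ in range(samples):
        total += winnings(rng.random(), rng.random(), p, q, k)
    return total / samples
```

<p>For instance, if Player A never raises (<code>p = lambda x: 0.0</code>), the estimate should hover near zero by symmetry, whatever $q$ and $k$ are.</p>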
<h3>Player B’s best response</h3>
<p>Let’s suppose that Player A uses strategy $p(x)$ and Player B somehow knows this in advance and gets to play the best possible response. What should this response be? For each $y$, $q(y)$ should be chosen to minimize A’s expected winnings. In other words, we should solve:
\[
q(y) = \arg\underset{q}{\min} \int_0^1 \biggl[
(1-p(x))E(x,y) + p(x)\bigl( k(1-q)E(x,y) + q \bigr) \biggr]\,dx
\]The expression on the right is linear in $q$, with coefficient $\int_0^1 p(x)(1-kE(x,y))\,dx$, and the constant terms don’t affect the argmin. So we conclude that
\[
q(y) = \begin{cases}
1 & \text{if } \int_0^1 p(x)(1-kE(x,y))\,dx < 0 \\
0 & \text{otherwise}
\end{cases}
\]Splitting the integral for $x\in[0,1]$ into $x\in[0,y]$ and $x\in[y,1]$, we can substitute the definition of $E(x,y)$ and obtain:
\begin{align}
\int_0^1 p(x)(1-kE(x,y))\,dx
&= \int_0^1 p(x)dx + k\left( \int_0^y p(x) dx - \int_y^1 p(x) dx \right) \\
&= (1-k)\int_0^1 p(x)dx + 2k \int_0^y p(x)dx
\end{align}So our final formula for $q(y)$ is:</p>
<p>$\displaystyle
q(y) = \begin{cases}
1 & \text{if } \int_0^y p(x)dx < \frac{k-1}{2k}\int_0^1 p(x)dx \\
0 & \text{otherwise}
\end{cases}
$</p>
<p>This formula already tells us a lot. If $k \le 1$, the left side is nonnegative and the right side is nonpositive, so the inequality never holds and $q(y)=0$ (always call). If $k > 1$, then $0 < \frac{k-1}{2k} < \tfrac{1}{2}$. Since $\int_0^y p(x)dx$ is a nondecreasing function of $y$ no matter what $p$ is, the inequality holds exactly for $y$ below some cutoff. (If $p$ vanishes on an interval, equality can hold on that whole interval; Player B is indifferent there, so any cutoff in it works.) We deduce that $q(y)$ must be a threshold strategy:
\[
q(y) = \begin{cases}
1 & \text{if } 0 \le y < c \\
0 & \text{if } c < y \le 1
\end{cases}
\]where $c$ is chosen such that $\int_0^c p(x)dx = \frac{k-1}{2k}\int_0^1 p(x)dx$. So fold if your hand is bad, and call if your hand is good. Makes sense!</p>
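<p>To make the threshold condition concrete, here is a small numerical sketch that finds a cutoff $c$ for any given $p$ using a midpoint Riemann sum. The helper name <code>fold_threshold</code> and the step count are hypothetical choices of mine; when $p$ vanishes on an interval the function simply returns the smallest valid cutoff.</p>

```python
def fold_threshold(p, k, steps=10_000):
    """Approximate c with int_0^c p = (k-1)/(2k) * int_0^1 p (midpoint rule)."""
    if k <= 1:
        return 0.0  # the inequality never holds, so Player B always calls
    h = 1.0 / steps
    # Probability mass of "raise" in each small cell of width h.
    masses = [p((i + 0.5) * h) * h for i in range(steps)]
    target = (k - 1) / (2 * k) * sum(masses)
    running = 0.0
    for i, m in enumerate(masses):
        if running + m >= target:
            # Interpolate linearly inside the cell where the target is crossed.
            frac = (target - running) / m if m > 0 else 0.0
            return (i + frac) * h
        running += m
    return 1.0
```

<p>For example, if Player A always raises ($p \equiv 1$), the condition reduces to $c = \frac{k-1}{2k}$, so <code>fold_threshold(lambda x: 1.0, 2)</code> should come out close to $0.25$.</p>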
<h3>Player A’s best response</h3>
<p>Let’s suppose that Player B uses strategy $q(y)$ and Player A somehow knows this in advance and gets to play the best possible response. What should this response be? For each $x$, $p(x)$ should be chosen to maximize A’s expected winnings. In other words, we should solve:
\[
p(x) = \arg\underset{p}{\max} \int_0^1 \biggl[
(1-p)E(x,y) + p\bigl( k(1-q(y))E(x,y) + q(y) \bigr) \biggr]\,dy
\]The expression on the right is linear in $p$ and the constant terms don’t affect the argmax. So we conclude that
\[
p(x) = \begin{cases}
1 & \text{if } \int_0^1 \bigl( -E(x,y) + k(1-q(y))E(x,y) + q(y) \bigr) \,dy > 0 \\
0 & \text{otherwise}
\end{cases}
\]Splitting the integral up as $[0,1] = [0,x] \cup [x,1]$ as we did when we computed Player B’s best response and simplifying the algebra, we obtain a more complicated formula than last time:</p>
<p>$\displaystyle
p(x) = \begin{cases}
1 & \text{if }\,\, \frac{k-1}{k}(\tfrac{1}{2}-x) + \int_0^x q(y)dy < \frac{k+1}{2k}\int_0^1 q(y)dy \\
0 & \text{otherwise}
\end{cases}
$</p>
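<p>The algebra behind Player A’s raise condition can be sketched as follows. The shorthand $Q = \int_0^1 q(y)\,dy$ is my own notation, introduced only to keep the lines short.</p>

```latex
\begin{align}
\int_0^1 E(x,y)\,dy &= \int_0^x dy - \int_x^1 dy = 2x-1 \\
\int_0^1 q(y)E(x,y)\,dy &= \int_0^x q(y)\,dy - \int_x^1 q(y)\,dy
  = 2\int_0^x q(y)\,dy - Q
\end{align}
% Substitute both into the raise condition
%   \int_0^1 ( (k-1)E - k q E + q ) dy > 0,
% then divide by 2k and rearrange:
\[
(k-1)(2x-1) - 2k\int_0^x q(y)\,dy + (k+1)Q > 0
\quad\Longleftrightarrow\quad
\tfrac{k-1}{k}\left(\tfrac{1}{2}-x\right) + \int_0^x q(y)\,dy < \tfrac{k+1}{2k}\,Q
\]
```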
<p>This is a bit trickier than last time because the left-hand side of the inequality isn’t a simple increasing function of $x$. It contains both an increasing and a decreasing part! So Player A’s best response might be more complicated than a simple threshold strategy. However, we can leverage the fact that we have a formula for $q(y)$…</p>
<h3>Combining both best responses</h3>
<p>Substituting Player B’s threshold response into the formula for Player A’s best response, we obtain:
\[
p(x) = \begin{cases}
1 & \text{if }\,\, \frac{k-1}{k}(\tfrac{1}{2}-x) + \min(x,c) < \frac{k+1}{2k}c \\
0 & \text{otherwise}
\end{cases}
\]Working out the cases $ x < c $ and $ x > c $ separately (for $k>1$), we deduce that:
\[
p(x) = \begin{cases}
1 & \text{if }\,\, 0 < x < \frac{k+1}{2}c-\frac{k-1}{2} \\
0 & \text{if }\,\, \frac{k+1}{2}c-\frac{k-1}{2} < x < \frac{c+1}{2} \\
1 & \text{if }\,\, \frac{c+1}{2} < x < 1
\end{cases}
\]So Player A still plays a threshold strategy… but with two thresholds rather than one! We can now solve for $c$ by substituting $p(x)$ back into the formula $\int_0^c p(x)dx = \frac{k-1}{2k}\int_0^1 p(x)dx$ we derived earlier. This is relatively easy to do because $c$ always lies in the middle portion of the interval, i.e. $p(c)=0$, so $\int_0^c p(x)dx$ is just the length of the lower raising interval. The result is:
\[
\left( \tfrac{k+1}{2}c-\tfrac{k-1}{2} \right) = \tfrac{k-1}{2k}\left[ \left(\tfrac{k+1}{2}c-\tfrac{k-1}{2}\right) + \left(1-\tfrac{c+1}{2}\right)\right]
\]After simplifications, we obtain:
\[
c = \frac{(k-1)(k+2)}{k(k+3)}
\]We can go back and compute the expected winnings of Player A by integrating $W(x,y)$ using the optimal policies we derived. Upon doing this, we find that the expected winnings for Player A are:
\[
\bar W = \frac{k-1}{k(k+3)}
\]</p>
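<p>The fixed-point algebra above is easy to double-check with exact rational arithmetic. This is only a verification sketch; the function names <code>equilibrium</code>, <code>fixed_point_residual</code>, and <code>raises</code> are hypothetical helpers of mine that restate the formulas from this section.</p>

```python
from fractions import Fraction

def equilibrium(k):
    """Return (lower raise threshold, fold threshold c, upper raise threshold)."""
    k = Fraction(k)
    c = (k - 1) * (k + 2) / (k * (k + 3))
    low = (k + 1) / 2 * c - (k - 1) / 2
    high = (c + 1) / 2
    return low, c, high

def fixed_point_residual(k):
    """Left side minus right side of the equation that determines c."""
    k = Fraction(k)
    low, c, high = equilibrium(k)
    return low - (k - 1) / (2 * k) * (low + (1 - high))

def raises(x, c, k):
    """Player A's raise condition against a fold threshold c."""
    x, c, k = Fraction(x), Fraction(c), Fraction(k)
    return (k - 1) / k * (Fraction(1, 2) - x) + min(x, c) < (k + 1) / (2 * k) * c
```

<p>A zero residual for several values of $k$ gives some confidence in the closed form for $c$, and evaluating <code>raises</code> on either side of each threshold reproduces the raise, call, raise pattern.</p>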
<h3>Optimal policies</h3>
<p>The optimal policy for Player A is:
\[
\text{Player A: } \begin{cases}
\text{raise} & \text{if } 0 < x < \frac{k-1}{k(k+3)} \\
\text{call} & \text{if } \frac{k-1}{k(k+3)} < x < \frac{k^2+2k-1}{k(k+3)} \\
\text{raise} & \text{if } \frac{k^2+2k-1}{k(k+3)} < x < 1
\end{cases}
\]The optimal policy for Player B is:
\[
\text{Player B: } \begin{cases}
\text{fold} & \text{if } 0 < y < \frac{(k-1)(k+2)}{k(k+3)} \\
\text{call} & \text{if } \frac{(k-1)(k+2)}{k(k+3)} < y < 1
\end{cases}
\]The expected payout for Player A is given by the expression:
\[
\text{Expected payout for Player A:}\quad \frac{k-1}{k(k+3)}\quad\text{dollars}.
\]Here are plots that show the optimal strategies:</p>
<p>[Figures: plots of the optimal strategies]</p>
<p>For the case $k=2$, Player A should raise if $x>0.7$ or if $x<0.1$ (a bluff). Meanwhile, Player B should call if $y > 0.4$ and fold otherwise. On average, Player A wins \$0.10 per game. A fascinating twist is that as $k$ increases, Player A will bluff more aggressively at first, but then bluffs less and less often, and in the limit not at all.</p>
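<p>The \$0.10 figure for $k=2$ can be reproduced by integrating $W(x,y)$ numerically under the optimal policies. Below is a minimal sketch using a midpoint rule on a grid; <code>optimal_thresholds</code> and <code>expected_winnings</code> are hypothetical names, the grid size is arbitrary, and ties $x=y$ at grid midpoints are counted as a payoff of $0$ (an assumption that keeps the diagonal cells unbiased).</p>

```python
def optimal_thresholds(k):
    """Thresholds (low, c, high) from the closed-form solution, valid for k > 1."""
    denom = k * (k + 3)
    return (k - 1) / denom, (k - 1) * (k + 2) / denom, (k * k + 2 * k - 1) / denom

def expected_winnings(k, n=400):
    """Midpoint-rule estimate of the double integral of W(x, y) on an n-by-n grid."""
    low, c, high = optimal_thresholds(k)
    total = 0.0
    for i in range(n):
        x = (i + 0.5) / n
        p = 1.0 if (x < low or x > high) else 0.0  # does Player A raise?
        for j in range(n):
            y = (j + 0.5) / n
            q = 1.0 if y < c else 0.0              # does Player B fold?
            e = (x > y) - (x < y)                  # E(x, y), ties counted as 0
            total += (1 - p) * e + p * (k * (1 - q) * e + q)
    return total / (n * n)
```

<p>With $k=2$ the thresholds $0.1$, $0.4$, and $0.7$ land on cell boundaries of the default grid, so <code>expected_winnings(2)</code> should agree with $0.1$ up to floating-point rounding. For other values of $k$ the result is only an approximation.</p>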
<p>The case $k=3$ is special; it corresponds to when Player A is most aggressive (bluffing happens whenever $x < \tfrac{1}{9} \approx 0.111$). This also coincides with when the game is most advantageous to Player A, because the bluffing threshold and the expected payout are the same expression $\frac{k-1}{k(k+3)}$; the expected winnings are also about \$0.111. Put another way, if Player A gets to choose <em>how much</em> the raise should be, they should choose \$3! When $k>3$, Player A becomes increasingly conservative as $k$ grows, raising only very rarely (when a win is all but assured). The expected payout for Player A then decreases monotonically and converges to \$0 as $k \to \infty$.</p>
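<p>The claim that $k=3$ is the sweet spot can be checked with a quick scan over candidate raise sizes. This sketch uses exact fractions; the name <code>bluff_threshold</code> and the candidate grid (steps of $0.01$ between $1.01$ and $10$) are illustrative assumptions of mine.</p>

```python
from fractions import Fraction

def bluff_threshold(k):
    """(k-1)/(k(k+3)): both the bluffing cutoff and Player A's expected payout."""
    k = Fraction(k)
    return (k - 1) / (k * (k + 3))

# Candidate raise sizes 1.01, 1.02, ..., 10.00 as exact fractions.
candidates = [Fraction(n, 100) for n in range(101, 1001)]
best_k = max(candidates, key=bluff_threshold)
```

<p>On this grid <code>best_k</code> comes out as $3$ with <code>bluff_threshold(3)</code> equal to $\tfrac{1}{9}$. Setting the derivative of $\frac{k-1}{k(k+3)}$ to zero gives $k^2-2k-3=0$, whose positive root is also $k=3$.</p>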