{"id":1749,"date":"2017-02-13T14:00:27","date_gmt":"2017-02-13T20:00:27","guid":{"rendered":"http:\/\/www.laurentlessard.com\/bookproofs\/?p=1749"},"modified":"2017-05-08T02:41:57","modified_gmt":"2017-05-08T07:41:57","slug":"toddler-poker","status":"publish","type":"post","link":"https:\/\/laurentlessard.com\/bookproofs\/toddler-poker\/","title":{"rendered":"Toddler poker"},"content":{"rendered":"<p>In a previous <a href=\"https:\/\/laurentlessard.com\/bookproofs\/baby-poker\/\">post<\/a>, I took a look at &#8220;baby poker&#8221;, a game involving two players rolling a six-sided die. The higher number wins, but players may elect to raise, call, or fold depending on their number (which only they can see). In this post, I&#8217;ll take a look at the <em>continuous<\/em> version of the problem (also appeared in a recent <a href=\"https:\/\/fivethirtyeight.com\/features\/can-you-save-the-drowning-swimmer\/\">Riddler post<\/a>!) Here is the full text of the problem:<\/p>\n<blockquote><p>\nToddler poker is played by two players. Each is dealt a &#8220;card,&#8221; which is actually a number randomly chosen uniformly from the interval [0,1]. (It could be 0.1, or 0.9234781, or 1\/\u03c0, and so on.) The game starts with each player anteing \\$1. Player A can then either &#8220;call,&#8221; in which case both numbers are shown and the player with the higher number wins the \\$2 on the table, or &#8220;raise,&#8221; betting one more dollar. If A raises, B then has the option to either &#8220;call&#8221; by matching A\u2019s second dollar, after which the higher number wins the \\$4 on the table, or &#8220;fold,&#8221; in which case A wins but B is out only his original \\$1. No other plays are made.<\/p>\n<p>What is the optimal strategy for each player? Under those strategies, how much is a game of toddler poker worth to Player A?<\/p>\n<p>Extra credit: What if the value of the raise is \\$k \u2014 i.e., players stand to profit \\$k instead of \\$2 after the raise?\n<\/p><\/blockquote>\n<p><!--\n\n\n<ul>\n\n\n<li> Player A and Player B are each independently assigned a random number uniformly distributed between 0 and 1. Each player gets to see their number but not the other player's number.\n\n\n<li> Both players ante \\$1. Player A goes first and may <em>call<\/em> or <em>raise<\/em>. If Player A calls, both numbers are revealed, the player with the higher number wins the pot, and the game ends.\n\n\n<li> If Player A raises, then Player B can respond by <em>calling<\/em> or <em>folding<\/em>. If Player B folds in response to Player A's raise, then the game ends and Player A wins the \\$2 in the pot.\n\n\n<li> If Player B calls in response to Player A's raise, both players ante an additional $k-1$ dollars, both numbers are revealed, the player with the higher number wins the larger pot, and the game ends.\n<\/ul>\n\n\nWhat is the optimal strategy for each player? Under those strategies, how much is a game of baby poker worth to Player A? In other words, how much should A pay B beforehand to make it a fair game? Finally, how does the answer change as we vary $k$, the amount of the raise?\n--><\/p>\n<p>Here is my derivation:<br \/>\n<a href=\"javascript:Solution('soln_baby_poker_revisited','toggle_baby_poker_revisited')\" id=\"toggle_baby_poker_revisited\">[Show Solution]<\/a><\/p>\n<div id=\"soln_baby_poker_revisited\" style=\"display: none\">\n<p>Let&#8217;s call Player A&#8217;s number $x \\in [0,1]$ and Player B&#8217;s number $y \\in [0,1]$. We&#8217;ll assume a general <a href=\"https:\/\/en.wikipedia.org\/wiki\/Strategy_(game_theory)#Mixed_strategy\">mixed strategy<\/a> for each player and compute each player&#8217;s best response. This approach is similar to the one I used in the <a href=\"https:\/\/laurentlessard.com\/bookproofs\/the-war-game\/\">war game puzzle<\/a>, but the solution is more complicated this time.<\/p>\n<p>For this solution, I&#8217;ll use similar notation and conventions to my solution to the <a href=\"https:\/\/laurentlessard.com\/bookproofs\/baby-poker\/\">baby poker<\/a> (the discrete version of toddler poker). Define players&#8217; strategies as follows:<\/p>\n<ul>\n<li> $p(x)$: probability that Player A will <em>raise<\/em> if their number is $x$.\n<li> $q(y)$: probability that Player B will <em>fold<\/em> if their number is $y$.\n<\/ul>\n<p>Let&#8217;s call $E(x,y)$ the payoff for Player A when both numbers are revealed:<br \/>\n\\[<br \/>\nE(x,y) = \\begin{cases}<br \/>\n1&#038;\\text{if }x > y \\\\<br \/>\n-1&#038;\\text{if }x < y\n\\end{cases}\n\\]We don't consider the case $x=y$ because that case has a zero probability of occurring. If we let $W(x,y)$ be the winnings for Player A, we can compute this quantity as we did for the discrete problem:\n\\[\nW(x,y) = (1-p(x))E(x,y) + p(x)\\bigl( k(1-q(y))E(x,y) + q(y) \\bigr)\n\\]Of course, $A$'s expected winnings averaged over all random numbers $x,y$ is simply the integral $\\bar W = \\int_0^1\\int_0^1 W(x,y)\\,dx\\,dy$.\n\n\n\n\n<h3>Player B&#8217;s best response<\/h3>\n<p>Let&#8217;s suppose that Player A uses strategy $p(x)$ and Player B somehow knows this in advance and gets to play the best possible response. What should this response be? For each $y$, $q(y)$ should be chosen to minimize A&#8217;s expected winnings. In other words, we should solve:<br \/>\n\\[<br \/>\nq(y) = \\arg\\underset{q}{\\min} \\int_0^1 \\biggl[<br \/>\n(1-p(x))E(x,y) + p(x)\\bigl( k(1-q)E(x,y) + q \\bigr) \\biggr]\\,dx<br \/>\n\\]The expression on the right is linear in $q$ and the constant terms don&#8217;t affect the argmin. So we conclude that<br \/>\n\\[<br \/>\nq(y) = \\begin{cases}<br \/>\n1 &#038; \\text{if } \\int_0^1 p(x)(1-kE(x,y))\\,dx < 0 \\\\\n0 &#038; \\text{otherwise}\n\\end{cases}\n\\]Splitting the integral for $x\\in[0,1]$ into $x\\in[0,y]$ and $x\\in[y,1]$, we can substitute the definition of $E(x,y)$ and obtain:\n\\begin{align}\n\\int_0^1 p(x)(1-kE(x,y))\\,dx\n&#038;= \\int_0^1 p(x)dx + k\\left( \\int_0^y p(x) dx - \\int_y^1 p(x) dx \\right) \\\\\n&#038;= (1-k)\\int_0^1 p(x)dx + 2k \\int_0^y p(x)dx\n\\end{align}So our final formula for $q(y)$ is:\n\n\n\n<p style=\"text-align: center;\"><span style=\"background-color: #AFC8E6; padding: 25px 10px 25px 10px;\">$\\displaystyle<br \/>\nq(y) = \\begin{cases}<br \/>\n1 &#038; \\text{if } \\int_0^y p(x)dx < \\frac{k-1}{2k}\\int_0^1 p(x)dx  \\\\\n0 &#038; \\text{otherwise}\n\\end{cases}\n$<\/span><\/p>\n<p>This formula already tells us a lot. If $k \\le 1$, the inequality never holds so $q(y)=0$ (always call). If $k > 1$, then $0 < \\frac{k-1}{2k} < \\tfrac{1}{2}$. Since $\\int_0^y p(x)dx$ is a monotonically increasing function no matter what $p$ is, there is a unique $y$ that yields equality. We deduce that $q(y)$ must be a threshold strategy:\n\\[\nq(y) = \\begin{cases}\n1 &#038; \\text{if } 0 \\le y < c \\\\\n0 &#038; \\text{if } c < y \\le 1\n\\end{cases}\n\\]where $c$ is chosen such that $\\int_0^c p(x)dx = \\frac{k-1}{2k}\\int_0^1 p(x)dx$. So fold if your hand is bad, and call if your hand is good. Makes sense!\n\n\n\n<h3>Player A&#8217;s best response<\/h3>\n<p>Let&#8217;s suppose that Player B uses strategy $q(y)$ and Player A somehow knows this in advance and gets to play the best possible response. What should this response be? For each $x$, $p(x)$ should be chosen to maximize A&#8217;s expected winnings. In other words, we should solve:<br \/>\n\\[<br \/>\np(x) = \\arg\\underset{p}{\\max} \\int_0^1 \\biggl[<br \/>\n(1-p)E(x,y) + p\\bigl( k(1-q(y))E(x,y) + q(y) \\bigr) \\biggr]\\,dy<br \/>\n\\]The expression on the right is linear in $p$ and the constant terms don&#8217;t affect the argmin. So we conclude that<br \/>\n\\[<br \/>\np(x) = \\begin{cases}<br \/>\n1 &#038; \\text{if } \\int_0^1 \\bigl( -E(x,y) + k(1-q(y))E(x,y) + q(y) \\bigr) \\,dy > 0 \\\\<br \/>\n0 &#038; \\text{otherwise}<br \/>\n\\end{cases}<br \/>\n\\]Splitting the integral up as $[0,1] = [0,x] \\cup [x,1]$ as we did when we computed Player B&#8217;s best response and simplifying the algebra, we obtain a more complicated formula than last time:<\/p>\n<p style=\"text-align: center;\"><span style=\"background-color: #AFC8E6; padding: 25px 10px 25px 10px;\">$\\displaystyle<br \/>\np(x) = \\begin{cases}<br \/>\n1 &#038; \\text{if }\\,\\, \\frac{k-1}{k}(\\tfrac{1}{2}-x) + \\int_0^x q(y)dy < \\frac{k+1}{2k}\\int_0^1 q(y)dy \\\\\n0 &#038; \\text{otherwise}\n\\end{cases}\n$<\/span><\/p>\n<p>This is a bit trickier than last time because the left-hand side of the inequality isn&#8217;t a simple increasing function in $x$. It contains both an increasing and a decreasing part! So $A$&#8217;s best response might be more complicated than a simple threshold strategy. However, we can leverage the fact that we have a formula for $q(y)$&#8230;<\/p>\n<h3>Combining both best responses<\/h3>\n<p>Substituting Player B&#8217;s threshold response into the formula for Player A&#8217;s best response, we obtain:<br \/>\n\\[<br \/>\np(x) = \\begin{cases}<br \/>\n1 &#038; \\text{if }\\,\\, \\frac{k-1}{k}(\\tfrac{1}{2}-x) + \\min(x,c) < \\frac{k+1}{2k}c \\\\\n0 &#038; \\text{otherwise}\n\\end{cases}\n\\]Working out the cases $ x < c $ and $ x > c $ separately, we deduce that:<br \/>\n\\[<br \/>\np(x) = \\begin{cases}<br \/>\n1 &#038; \\text{if }\\,\\, 0 < x < \\frac{k+1}{2}c-\\frac{k-1}{2} \\\\\n0 &#038; \\text{if }\\,\\, \\frac{k+1}{2}c-\\frac{k-1}{2} < x < \\frac{c+1}{2} \\\\\n1 &#038; \\text{if }\\,\\, \\frac{c+1}{2} < x < 1\n\\end{cases}\n\\]So Player A still plays a threshold strategy... but with two thresholds rather than one! We can now solve for $c$ by substituting $p(x)$ back into the formula $\\int_0^c p(x)dx = \\frac{k-1}{2k}\\int_0^1 p(x)dx$ we derived earlier. This is relatively easy to do because $c$ is always in the middle portion of the interval. i.e. $p(c)=0$. The result is:\n\\[\n\\left( \\tfrac{k+1}{2}c-\\tfrac{k-1}{2} \\right) = \\tfrac{k-1}{2k}\\left[ \\left(\\tfrac{k+1}{2}c-\\tfrac{k-1}{2}\\right) + \\left(1-\\tfrac{c+1}{2}\\right)\\right]\n\\]After simplifications, we obtain:\n\\[\nc = \\frac{(k-1)(k+2)}{k(k+3)}\n\\]We can go back and compute the expected winnings of Player A by integrating $W(x,y)$ using the optimal policies we derived. Upon doing this, we find that the expected winnings for Player A are:\n\\[\n\\bar W = \\frac{k-1}{k(k+3)}\n\\]\n<\/div>\n<p>If you&#8217;d like the tl;dr instead:<br \/>\n<a href=\"javascript:Solution('soln_baby_poker_revisited2','toggle_baby_poker_revisited2')\" id=\"toggle_baby_poker_revisited2\">[Show Solution]<\/a><\/p>\n<div id=\"soln_baby_poker_revisited2\" style=\"display: none\">\n<h3>Optimal policies<\/h3>\n<p>The optimal policy for Player A is:<br \/>\n\\[<br \/>\n\\text{Player A: } \\begin{cases}<br \/>\n\\text{raise} &#038; \\text{if } 0 < x < \\frac{k-1}{k(k+3)} \\\\\n\\text{call} &#038; \\text{if } \\frac{k-1}{k(k+3)} < x < \\frac{k^2+2k-1}{k(k+3)} \\\\\n\\text{raise} &#038; \\text{if } \\frac{k^2+2k-1}{k(k+3)} < x < 1\n\\end{cases}\n\\]The optimal policy for Player B is:\n\\[\n\\text{Player B: } \\begin{cases}\n\\text{fold} &#038; \\text{if } 0 < y < \\frac{(k-1)(k+2)}{k(k+3)} \\\\\n\\text{call} &#038; \\text{if } \\frac{(k-1)(k+2)}{k(k+3)} < y < 1\n\\end{cases}\n\\]The expected payout for Player A is given by the expression:\n\\[\n\\text{Expected payout for Player A:}\\quad \\frac{k-1}{k(k+3)}\\quad\\text{dollars}.\n\\]Here are plots that show the optimal strategies:\n\n\n<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/laurentlessard.com\/bookproofs\/wp-content\/uploads\/2017\/01\/baby_poker_revisited_A.png\" alt=\"\" width=\"1297\" height=\"747\" class=\"aligncenter size-full wp-image-1758\" srcset=\"https:\/\/laurentlessard.com\/bookproofs\/wp-content\/uploads\/2017\/01\/baby_poker_revisited_A.png 1297w, https:\/\/laurentlessard.com\/bookproofs\/wp-content\/uploads\/2017\/01\/baby_poker_revisited_A-300x173.png 300w, https:\/\/laurentlessard.com\/bookproofs\/wp-content\/uploads\/2017\/01\/baby_poker_revisited_A-768x442.png 768w, https:\/\/laurentlessard.com\/bookproofs\/wp-content\/uploads\/2017\/01\/baby_poker_revisited_A-1024x590.png 1024w, https:\/\/laurentlessard.com\/bookproofs\/wp-content\/uploads\/2017\/01\/baby_poker_revisited_A-1200x691.png 1200w\" sizes=\"auto, (max-width: 709px) 85vw, (max-width: 909px) 67vw, (max-width: 1362px) 62vw, 840px\" \/><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/laurentlessard.com\/bookproofs\/wp-content\/uploads\/2017\/01\/baby_poker_revisited_B.png\" alt=\"\" width=\"1297\" height=\"747\" class=\"aligncenter size-full wp-image-1757\" srcset=\"https:\/\/laurentlessard.com\/bookproofs\/wp-content\/uploads\/2017\/01\/baby_poker_revisited_B.png 1297w, https:\/\/laurentlessard.com\/bookproofs\/wp-content\/uploads\/2017\/01\/baby_poker_revisited_B-300x173.png 300w, https:\/\/laurentlessard.com\/bookproofs\/wp-content\/uploads\/2017\/01\/baby_poker_revisited_B-768x442.png 768w, https:\/\/laurentlessard.com\/bookproofs\/wp-content\/uploads\/2017\/01\/baby_poker_revisited_B-1024x590.png 1024w, https:\/\/laurentlessard.com\/bookproofs\/wp-content\/uploads\/2017\/01\/baby_poker_revisited_B-1200x691.png 1200w\" sizes=\"auto, (max-width: 709px) 85vw, (max-width: 909px) 67vw, (max-width: 1362px) 62vw, 840px\" \/><\/p>\n<p>For the case $k=2$, Player A should raise if $x>0.7$ or if $x<0.1$ (a bluff). Meanwhile, Player B should call if $y > 0.4$ and fold otherwise. On average, Player A wins \\$0.10 per game. A fascinating twist is that as $k$ increases, Player A will bluff more aggressively at first, but then will eventually not bluff at all.<\/p>\n<p>The case $k=3$ is special; it corresponds to when Player A is most aggressive (bluffing happens whenever $x < \\tfrac{1}{9} \\approx 0.111$). This also coincides to when the game is most advantageous to Player A; the expected winnings are also \\$0.111. Put another way, if Player A gets to choose <em>how much<\/em> the raise should be, they should choose \\$3! When $k>3$ and as $k$ gets larger, Player A becomes increasingly conservative; raising only very rarely (when a win is all but assured). In this limit, the expected payout for Player A decreases monotonically and converges to \\$0 in the limit.<\/p>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>In a previous post, I took a look at &#8220;baby poker&#8221;, a game involving two players rolling a six-sided die. The higher number wins, but players may elect to raise, call, or fold depending on their number (which only they can see). In this post, I&#8217;ll take a look at the continuous version of the &hellip; <a href=\"https:\/\/laurentlessard.com\/bookproofs\/toddler-poker\/\" class=\"more-link\">Continue reading<span class=\"screen-reader-text\"> &#8220;Toddler poker&#8221;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":1758,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"om_disable_all_campaigns":false,"_monsterinsights_skip_tracking":false,"_monsterinsights_sitenote_active":false,"_monsterinsights_sitenote_note":"","_monsterinsights_sitenote_category":0,"_uf_show_specific_survey":0,"_uf_disable_surveys":false,"footnotes":""},"categories":[7],"tags":[33,8,2],"class_list":["post-1749","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-riddler","tag-game-theory","tag-probability","tag-riddler"],"aioseo_notices":[],"_links":{"self":[{"href":"https:\/\/laurentlessard.com\/bookproofs\/wp-json\/wp\/v2\/posts\/1749","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/laurentlessard.com\/bookproofs\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/laurentlessard.com\/bookproofs\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/laurentlessard.com\/bookproofs\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/laurentlessard.com\/bookproofs\/wp-json\/wp\/v2\/comments?post=1749"}],"version-history":[{"count":14,"href":"https:\/\/laurentlessard.com\/bookproofs\/wp-json\/wp\/v2\/posts\/1749\/revisions"}],"predecessor-version":[{"id":1843,"href":"https:\/\/laurentlessard.com\/bookproofs\/wp-json\/wp\/v2\/posts\/1749\/revisions\/1843"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/laurentlessard.com\/bookproofs\/wp-json\/wp\/v2\/media\/1758"}],"wp:attachment":[{"href":"https:\/\/laurentlessard.com\/bookproofs\/wp-json\/wp\/v2\/media?parent=1749"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/laurentlessard.com\/bookproofs\/wp-json\/wp\/v2\/categories?post=1749"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/laurentlessard.com\/bookproofs\/wp-json\/wp\/v2\/tags?post=1749"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}