{"id":1726,"date":"2017-01-01T18:11:49","date_gmt":"2017-01-02T00:11:49","guid":{"rendered":"http:\/\/www.laurentlessard.com\/bookproofs\/?p=1726"},"modified":"2018-02-18T15:02:40","modified_gmt":"2018-02-18T21:02:40","slug":"baby-poker","status":"publish","type":"post","link":"https:\/\/laurentlessard.com\/bookproofs\/baby-poker\/","title":{"rendered":"Baby poker"},"content":{"rendered":"<p>Another game theory problem from the <a href=\"http:\/\/fivethirtyeight.com\/features\/can-you-deal-with-these-card-game-puzzles\/\">Riddler<\/a>. This game is a simplified version of poker, but captures some interesting behaviors!<\/p>\n<blockquote><p>\nBaby poker is played by two players, each holding a single die in a cup. The game starts with each player anteing \\$1. Then both shake their die, roll it, and look at their own die only. Player A can then either &#8220;call,&#8221; in which case both dice are shown and the player with the higher number wins the \\$2 on the table, or Player A can &#8220;raise,&#8221; betting one more dollar. If A raises, then B has the option to either &#8220;call&#8221; by matching A\u2019s second dollar, after which the higher number wins the \\$4 on the table, or B can &#8220;fold,&#8221; in which case A wins but B is out only his original \\$1. No other plays are made, and if the dice match, a called pot is split equally.<\/p>\n<p>What is the optimal strategy for each player? Under those strategies, how much is a game of baby poker worth to Player A? 
In other words, how much should A pay B beforehand to make it a fair game?\n<\/p><\/blockquote>\n<p>If you&#8217;re interested in the derivation (and maybe learning about some game theory along the way), you can read my full solution here:<br \/>\n<a href=\"javascript:Solution('soln_baby_poker','toggle_baby_poker')\" id=\"toggle_baby_poker\">[Show Solution]<\/a><\/p>\n<div id=\"soln_baby_poker\" style=\"display: none\">\n<p>Our first task will be to figure out the payout of the game as a function of the players&#8217; strategies. For each possible dice roll, Player A must decide whether to call ($0$) or raise ($1$). Similarly, Player B must decide whether to call ($0$) or fold ($1$). An example strategy for Player A is:<br \/>\n\\[<br \/>\np = \\begin{bmatrix} 0 &#038; 0 &#038; 0 &#038; 0 &#038; 1 &#038; 1 \\end{bmatrix}<br \/>\n\\]This corresponds to Player A calling if they roll $\\{1,2,3,4\\}$ and raising if they roll $\\{5,6\\}.$ This is an example of a <a href=\"https:\/\/en.wikipedia.org\/wiki\/Strategy_(game_theory)#Pure_and_mixed_strategies\">pure strategy<\/a> because it completely determines what the player does as a function of the information available. We can encode each pure strategy as a six-digit binary number. There are $2^6 = 64$ possible pure strategies for each player and we can number them from $0$ to $63$ based on the decimal representation. For example, strategy $44$ corresponds to $101100$ in binary, i.e. strategy $p = \\begin{bmatrix}1 &#038; 0 &#038; 1 &#038; 1 &#038; 0 &#038; 0\\end{bmatrix}$. We can also consider <a href=\"https:\/\/en.wikipedia.org\/wiki\/Strategy_(game_theory)#Pure_and_mixed_strategies\">mixed strategies<\/a>, where actions are probabilistic. 
An example of a mixed strategy for Player A is:<br \/>\n\\[<br \/>\np = \\begin{bmatrix} 0.3 &#038; 0 &#038; 1 &#038; 1 &#038; 0 &#038; 0 \\end{bmatrix}<br \/>\n\\]This strategy is identical to the previous example, except if Player A rolls a $1$, they will raise $30$% of the time and call $70$% of the time. Mixed strategies allow players to play randomly if they so desire! Pure strategies are special cases of mixed strategies in which the probabilities are either $0$ or $1$ (deterministic).<\/p>\n<p>Let&#8217;s call $0 \\le p_i \\le 1$ the probability that Player A raises if they roll $i \\in \\{1,\\dots,6\\}.$ Likewise, let&#8217;s call $0 \\le q_j \\le 1$ the probability that Player B folds if they roll $j \\in \\{1,\\dots,6\\}.$ Define the $6\\times 6$ matrix $E$ as:<br \/>\n\\[<br \/>\nE_{ij} = \\begin{cases}<br \/>\n1 &#038;\\text{if }i > j \\\\<br \/>\n0 &#038;\\text{if }i = j \\\\<br \/>\n-1 &#038;\\text{if }i < j\n\\end{cases}\n\\]This matrix tells us how much Player A wins if they call with an $i$ and the opponent rolls $j.$ Player A raises with probability $p_i$ and calls with probability $1-p_i.$ Similarly, Player B folds with probability $q_j$ and calls with probability $1-q_j.$ Since each pair $(i,j)$ is equally likely, the expected winnings for Player A are:\n\\[\nW = \\frac{1}{36} \\sum_{i=1}^6 \\sum_{j=1}^6 \\bigl( (1-p_i)E_{ij} + p_i\\left( 2(1-q_j) E_{ij} + q_j \\right) \\bigr)\n\\]The last term looks the way it does because if Player B folds, Player A wins $+1$ with certainty. If Player B calls, then we use the $E_{ij}$ matrix to determine the payoff, but with a factor of 2 because the stakes were doubled.\n\n\n\n<h3>Pure strategy equilibria<\/h3>\n<p>Define $P_{ab}$ to be the winnings of Player A from the $W$ formula above when players adopt pure strategies $a,b \\in \\{0,1,\\dots,63\\}$ respectively. This yields the $64 \\times 64$ matrix $P$ corresponding to the winnings for all possible combinations of pure strategies. 
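The expected-winnings formula is easy to compute directly. Here is a minimal sketch in Python (my own code, not part of the original post; the function name is mine), using exact rational arithmetic:

```python
from fractions import Fraction

# E[i][j] = +1/0/-1: Player A's result in a called 2-dollar pot
# when A rolled i+1 and B rolled j+1 (0-based indices).
E = [[(1 if i > j else -1 if i < j else 0) for j in range(6)] for i in range(6)]

def winnings(p, q):
    """Expected winnings W for Player A, where p[i] is A's probability of
    raising on roll i+1 and q[j] is B's probability of folding on roll j+1."""
    total = Fraction(0)
    for i in range(6):
        for j in range(6):
            total += (1 - p[i]) * E[i][j] + p[i] * (2 * (1 - q[j]) * E[i][j] + q[j])
    return total / 36
```

For instance, if Player A always calls ($p = 0$), the game is symmetric and `winnings([0]*6, q)` is $0$ for any $q$, since the entries of $E$ sum to zero.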
Here is what $P$ looks like:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/laurentlessard.com\/bookproofs\/wp-content\/uploads\/2017\/01\/baby_poker_payoff.png\" alt=\"\" width=\"800\" height=\"600\" class=\"aligncenter size-full wp-image-1746\" srcset=\"https:\/\/laurentlessard.com\/bookproofs\/wp-content\/uploads\/2017\/01\/baby_poker_payoff.png 800w, https:\/\/laurentlessard.com\/bookproofs\/wp-content\/uploads\/2017\/01\/baby_poker_payoff-300x225.png 300w, https:\/\/laurentlessard.com\/bookproofs\/wp-content\/uploads\/2017\/01\/baby_poker_payoff-768x576.png 768w\" sizes=\"auto, (max-width: 709px) 85vw, (max-width: 909px) 67vw, (max-width: 984px) 61vw, (max-width: 1362px) 45vw, 600px\" \/><\/p>\n<p>Player A is trying to maximize $P_{ab}$ while Player B is trying to minimize it (it&#8217;s a <a href=\"https:\/\/en.wikipedia.org\/wiki\/Zero-sum_game\">zero-sum game<\/a>, so minimizing the winnings of one player is equivalent to maximizing the losses of the other player). Put another way, Player A selects a row of the matrix $P$ (corresponding to a choice among 64 pure strategies), while Player B selects a column (again, corresponding to a choice among pure strategies). The number in the chosen cell determines how much Player A will win. A <a href=\"https:\/\/en.wikipedia.org\/wiki\/Nash_equilibrium\">Nash Equilibrium<\/a> is a cell $P_{ab}$ that satisfies<br \/>\n\\[<br \/>\nP_{\\bar a b} \\le P_{ab} \\le P_{a \\bar b}\\qquad \\text{for all }\\bar a,\\bar b \\in \\{0,\\dots,63\\}<br \/>\n\\]In other words, $P_{ab}$ is the largest element in its column (so Player A has no incentive to pick a different row) and it&#8217;s also the smallest element in its row (so Player B has no incentive to pick a different column).<\/p>\n<p>A natural question to ask is: is there an optimal pair of pure strategies? This amounts to searching the above $P$ matrix for an element that is simultaneously the largest in its column and smallest in its row. 
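Scripting that search is straightforward. A self-contained sketch (it repeats the earlier definitions so it runs on its own; the helper names are my own):

```python
from fractions import Fraction

E = [[(1 if i > j else -1 if i < j else 0) for j in range(6)] for i in range(6)]

def winnings(p, q):
    # Expected winnings for Player A (raise probabilities p, fold probabilities q).
    total = Fraction(0)
    for i in range(6):
        for j in range(6):
            total += (1 - p[i]) * E[i][j] + p[i] * (2 * (1 - q[j]) * E[i][j] + q[j])
    return total / 36

def pure(n):
    # Strategy number n in {0,...,63} decoded as 6 binary digits, most significant first.
    return [(n >> (5 - k)) & 1 for k in range(6)]

# The 64x64 payoff matrix over all pairs of pure strategies.
P = [[winnings(pure(a), pure(b)) for b in range(64)] for a in range(64)]

# A pure equilibrium would be an entry that is the largest in its column
# and the smallest in its row.
col_max = [max(P[x][b] for x in range(64)) for b in range(64)]
row_min = [min(P[a][y] for y in range(64)) for a in range(64)]
saddles = [(a, b) for a in range(64) for b in range(64)
           if P[a][b] == col_max[b] == row_min[a]]
```

Running this, `saddles` comes back empty, matching the conclusion below.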
A quick search reveals that no such element exists. In other words, there are no pure equilibria for this game!<\/p>\n<h3>Mixed strategy equilibria<\/h3>\n<p>Every mixed strategy can be expressed as a <a href=\"https:\/\/en.wikipedia.org\/wiki\/Convex_combination\">convex combination<\/a> of pure strategies. For example:<br \/>\n\\[<br \/>\n\\begin{bmatrix}1\\\\\\tfrac{1}{3}\\\\0\\\\0\\\\\\tfrac{3}{4}\\\\1\\end{bmatrix} =<br \/>\n\\frac{1}{3}\\begin{bmatrix}1\\\\1\\\\0\\\\0\\\\1\\\\1\\end{bmatrix} +<br \/>\n\\frac{5}{12}\\begin{bmatrix}1\\\\0\\\\0\\\\0\\\\1\\\\1\\end{bmatrix} +<br \/>\n\\frac{1}{4}\\begin{bmatrix}1\\\\0\\\\0\\\\0\\\\0\\\\1\\end{bmatrix}<br \/>\n\\]This follows from the fact that each mixed strategy lies in the <a href=\"https:\/\/en.wikipedia.org\/wiki\/Convex_hull\">convex hull<\/a> of all pure strategies. Also, since the winnings $W$ are bilinear in $p$ and $q$, the winnings for a convex combination of strategies are the convex combination of the corresponding winnings.<\/p>\n<p>Now let $u \\in [0,1]^{64}$ and $v \\in [0,1]^{64}$ be the coefficients of the convex combinations of the pure strategies for Players A and B respectively. The winnings of Player A can be written compactly as $u^\\textsf{T} P v$. Player A would like to select $u$ in order to maximize this quantity, while Player B would like to select $v$ in order to minimize it. All this is subject to the constraint that $0 \\le u_i \\le 1$ for all $i$ and $u_1 + \\dots + u_{64} = 1$, and similarly for $v$. Note that in the special case where $u$ and $v$ contain a single &#8220;1&#8221; and the rest is &#8220;0&#8221;, we recover the case of seeking pure strategies.<\/p>\n<p>Nash famously proved that every game with finitely many players and actions has a mixed strategy equilibrium. So all that&#8217;s left to do is find it! The set of Nash equilibria can be expressed as the optimal set of a <a href=\"https:\/\/en.wikipedia.org\/wiki\/Linear_programming\">linear program<\/a>. 
In particular, the optimal strategies $u,v$ are solutions to the dual linear programs:<br \/>\n\\[<br \/>\n\\begin{aligned}<br \/>\n\\underset{u,\\mu}{\\max}\\quad&#038; \\mu \\\\<br \/>\n\\text{s.t.}\\quad&#038; u\\ge 0,\\quad \\mathbf{1}^\\textsf{T} u = 1 \\\\<br \/>\n&#038; P^\\textsf{T} u \\ge \\mu \\mathbf{1}<br \/>\n\\end{aligned}<br \/>\n\\qquad<br \/>\n\\begin{aligned}<br \/>\n= \\qquad<br \/>\n\\underset{v,t}{\\min}\\quad&#038; t \\\\<br \/>\n\\text{s.t.}\\quad&#038; v\\ge 0,\\quad \\mathbf{1}^\\textsf{T} v = 1 \\\\<br \/>\n&#038; P v \\le t \\mathbf{1}<br \/>\n\\end{aligned}<br \/>\n\\]The first program (Player A) has a unique solution, which is the mixture:<br \/>\n\\[<br \/>\np_\\text{opt} =<br \/>\n\\frac{1}{3}\\begin{bmatrix} 0 \\\\ 0 \\\\ 0 \\\\ 0 \\\\ 1 \\\\ 1 \\end{bmatrix} +<br \/>\n\\frac{2}{3}\\begin{bmatrix} 1 \\\\ 0 \\\\ 0 \\\\ 0 \\\\ 1 \\\\ 1 \\end{bmatrix} =<br \/>\n\\begin{bmatrix} \\tfrac{2}{3} \\\\ 0 \\\\ 0 \\\\ 0 \\\\ 1 \\\\ 1 \\end{bmatrix}<br \/>\n\\]For the second program (Player B), the solution is not unique and is rather complicated. As with any linear program, the solution set is a polytope. In this case, the polytope has 15 vertices and lives in a 7-dimensional subspace of $\\mathbb{R}^{64}$ (neat, huh?). Projected back down to $\\mathbb{R}^6$ where $q$ lives, the optimal solutions form a 2-dimensional polytope. Specifically, every optimal solution has the form<br \/>\n\\[<br \/>\nq_\\text{opt} =<br \/>\n\\begin{bmatrix} 1 \\\\ x \\\\ y \\\\ z \\\\ 0 \\\\ 0 \\end{bmatrix}<br \/>\n\\]where $0 \\le x,y \\le 1$ and $0 \\le z \\le \\tfrac{2}{3}$ and $x+y+z =\\tfrac{4}{3}$.<\/p>\n<p>Each optimal solution has the same optimal value (i.e. the optimal value of $\\mu$ or $t$ in the linear programs for $u$ and $v$). The optimal value is $\\tfrac{5}{54} \\approx 0.0926$. So on average, Player A comes out ahead to the tune of about 9.26 cents when both players are playing optimally.<\/p>\n<\/div>\n<p>This alternate solution was proposed by a commenter named Chris. 
Same answer, but a simpler argument!<br \/>\n<a href=\"javascript:Solution('soln_baby_pokerC','toggle_baby_pokerC')\" id=\"toggle_baby_pokerC\">[Show Solution]<\/a><\/p>\n<div id=\"soln_baby_pokerC\" style=\"display: none\">\n<p>The solution proceeds much like the previous solution, but this time, we rewrite the winnings as a bilinear function of the vectors $p$ and $q$:<br \/>\n\\begin{align}<br \/>\nW &#038;= \\frac{1}{36} \\sum_{i=1}^6 \\sum_{j=1}^6 \\bigl( (1-p_i)E_{ij} + p_i\\left( 2(1-q_j) E_{ij} + q_j \\right) \\bigr) \\\\<br \/>\n&#038;= \\frac{1}{36} \\left( p^\\textsf{T}(E \\mathbf{1}) + p^\\textsf{T}(\\textbf{1}\\textbf{1}^\\textsf{T}-2E)q \\right) \\\\<br \/>\n&#038;= p^\\textsf{T} b + p^\\textsf{T} A q<br \/>\n\\end{align}where $b = \\tfrac{1}{36}E\\mathbf{1}$, and $A = \\tfrac{1}{36}\\left(\\textbf{1}\\textbf{1}^\\textsf{T}-2E\\right)$, and $\\textbf{1}$ is the vector of all ones. Note that the constant term $\\tfrac{1}{36}\\sum_{i,j} E_{ij}$ from expanding the double sum vanishes because $E$ is antisymmetric. We can now <em>directly<\/em> write linear programs for $p$ and $q$. Player A is trying to maximize $W$ under the assumption that Player B is trying to minimize $W$. 
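To double-check this algebra, we can compare the double-sum and bilinear expressions on random mixed strategies. A sketch with exact fractions (the names are mine, not from the original post):

```python
from fractions import Fraction
from random import randint, seed

E = [[(1 if i > j else -1 if i < j else 0) for j in range(6)] for i in range(6)]

def winnings(p, q):
    # The original double-sum expression for W.
    total = Fraction(0)
    for i in range(6):
        for j in range(6):
            total += (1 - p[i]) * E[i][j] + p[i] * (2 * (1 - q[j]) * E[i][j] + q[j])
    return total / 36

# b = E.1/36 and A = (11^T - 2E)/36, as defined above.
b = [Fraction(sum(E[i]), 36) for i in range(6)]
A = [[Fraction(1 - 2 * E[i][j], 36) for j in range(6)] for i in range(6)]

def bilinear(p, q):
    # W written as p^T b + p^T A q.
    return (sum(p[i] * b[i] for i in range(6))
            + sum(p[i] * A[i][j] * q[j] for i in range(6) for j in range(6)))

seed(0)
for _ in range(100):  # the two expressions agree exactly on random mixed strategies
    p = [Fraction(randint(0, 10), 10) for _ in range(6)]
    q = [Fraction(randint(0, 10), 10) for _ in range(6)]
    assert winnings(p, q) == bilinear(p, q)
```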
In other words, we should solve:<br \/>\n\\begin{align}<br \/>\n&#038;\\phantom{=}\\underset{\\textbf{0} \\le p \\le \\textbf{1}}{\\max}\\quad\\underset{\\textbf{0} \\le q \\le \\textbf{1}}{\\min}\\quad  p^\\textsf{T} b + p^\\textsf{T} A q\\\\<br \/>\n&#038;= \\underset{\\textbf{0} \\le p \\le \\textbf{1}}{\\max} \\left( p^\\textsf{T} b + \\sum_{j=1}^6 \\underset{0\\le q_j \\le 1}{\\min}  q_j (A^\\textsf{T}p)_j \\right) \\\\<br \/>\n&#038;= \\underset{\\textbf{0} \\le p \\le \\textbf{1}}{\\max} \\left( p^\\textsf{T} b + \\sum_{j=1}^6 \\min\\left( 0, (A^\\textsf{T}p)_j \\right) \\right) \\\\<br \/>\n&#038;= \\left\\{\\begin{aligned}<br \/>\n\\underset{p,t}{\\text{maximize}}&#038;\\quad p^\\textsf{T} b + t^\\textsf{T} \\textbf{1} \\\\<br \/>\n\\text{subject to:}&#038;\\quad \\textbf{0} \\le p \\le \\textbf{1} \\\\<br \/>\n&#038; t \\le \\textbf{0},\\quad t \\le A^\\textsf{T} p<br \/>\n\\end{aligned}\\right.<br \/>\n\\end{align}Likewise, Player B is trying to minimize $W$ under the assumption that Player A is trying to maximize $W$. 
Proceeding in a similar manner, we obtain:<br \/>\n\\begin{align}<br \/>\n&#038;\\phantom{=}\\underset{\\textbf{0} \\le q \\le \\textbf{1}}{\\min}\\quad\\underset{\\textbf{0} \\le p \\le \\textbf{1}}{\\max}\\quad  p^\\textsf{T} b + p^\\textsf{T} A q\\\\<br \/>\n&#038;= \\underset{\\textbf{0} \\le q \\le \\textbf{1}}{\\min} \\left( \\sum_{i=1}^6  \\max_{0\\le p_i \\le 1} p_i(Aq+b)_i \\right) \\\\<br \/>\n&#038;= \\underset{\\textbf{0} \\le q \\le \\textbf{1}}{\\min} \\left( \\sum_{i=1}^6  \\max\\left( 0, (Aq+b)_i \\right) \\right) \\\\<br \/>\n&#038;= \\left\\{\\begin{aligned}<br \/>\n\\underset{q,\\mu}{\\text{minimize}}&#038;\\quad \\mu^\\textsf{T} \\textbf{1} \\\\<br \/>\n\\text{subject to:}&#038;\\quad \\textbf{0} \\le q \\le \\textbf{1} \\\\<br \/>\n&#038; \\mu \\ge \\textbf{0},\\quad \\mu \\ge A q + b<br \/>\n\\end{aligned}\\right.<br \/>\n\\end{align}Solving these two linear programs yields the same optimal $p$ and $q$ vectors as in the previous approach, but this time our linear programs have far fewer variables! The objective values of both LPs match as well and are both equal to $\\tfrac{5}{54} \\approx 0.0926$, as before. In fact, we can check algebraically that the two LPs are <a href=\"https:\/\/en.wikipedia.org\/wiki\/Linear_programming#Duality\">dual to one another<\/a>. Since both are clearly feasible (each feasible set is compact and nonempty), strong duality holds and the two LPs <em>must have the same optimal value<\/em>. 
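Instead of running an LP solver, the claimed solution can also be verified directly: since $W$ is linear in each player's strategy, each player's best response to a fixed opponent is attained at a pure strategy, so it suffices to check all 64 of them. A self-contained sketch (my own code; I fixed $x = y = z = \tfrac{4}{9}$ as one representative of Player B's optimal set):

```python
from fractions import Fraction

E = [[(1 if i > j else -1 if i < j else 0) for j in range(6)] for i in range(6)]

def winnings(p, q):
    # Expected winnings for Player A (raise probabilities p, fold probabilities q).
    total = Fraction(0)
    for i in range(6):
        for j in range(6):
            total += (1 - p[i]) * E[i][j] + p[i] * (2 * (1 - q[j]) * E[i][j] + q[j])
    return total / 36

def pure(n):
    # Strategy number n in {0,...,63} decoded as 6 binary digits, most significant first.
    return [(n >> (5 - k)) & 1 for k in range(6)]

p_opt = [Fraction(2, 3), 0, 0, 0, 1, 1]   # bluff-raise a 1 two-thirds of the time; raise 5 and 6
q_opt = [1, Fraction(4, 9), Fraction(4, 9), Fraction(4, 9), 0, 0]  # one point of B's optimal set

# A's strategy guarantees at least 5/54 against every B; B's strategy
# concedes at most 5/54 against every A. Together: an equilibrium of value 5/54.
worst_case_A = min(winnings(p_opt, pure(b)) for b in range(64))
worst_case_B = max(winnings(pure(a), q_opt) for a in range(64))
assert worst_case_A == worst_case_B == Fraction(5, 54)
```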
This is in essence von Neumann&#8217;s celebrated minimax theorem: for two-player zero-sum games, the minimax and maximin formulations are equivalent!<\/p>\n<\/div>\n<p>If you&#8217;d just like to know the answer along with a brief explanation, here is the tl;dr version:<br \/>\n<a href=\"javascript:Solution('soln_baby_poker2','toggle_baby_poker2')\" id=\"toggle_baby_poker2\">[Show Solution]<\/a><\/p>\n<div id=\"soln_baby_poker2\" style=\"display: none\">\n<p><strong>Player A&#8217;s optimal strategy:<\/strong><\/p>\n<ul>\n<li> If you roll $\\{5,6\\}$ then <em>always raise<\/em>.\n<li> If you roll $\\{2,3,4\\}$ then <em>always call<\/em>.\n<li> If you roll a $1$, then employ the following random strategy: call with probability $\\tfrac{1}{3}$ and raise with probability $\\tfrac{2}{3}$.\n<\/ul>\n<p>This makes sense; if you have a strong hand you should raise because you stand to win more that way. If your hand is weak you should call because you&#8217;ll likely lose and this minimizes your losses. However, if your hand is very weak, you should occasionally bluff to trick your opponent into thinking you have a strong hand!<\/p>\n<p><strong>Player B&#8217;s optimal strategy:<\/strong><\/p>\n<ul>\n<li> If you roll $\\{5,6\\}$ then <em>always call<\/em>.\n<li> If you roll a $1$ then <em>always fold<\/em>.\n<li> If you roll $\\{2,3,4\\}$ then employ a random strategy where you assign probabilities of folding of $x,y,z$ to $2,3,4$ respectively, where $x+y+z = \\tfrac{4}{3}$ and $z \\le \\tfrac{2}{3}$. One simple viable strategy is to fold with probability $\\tfrac{4}{9}$ whenever $\\{2,3,4\\}$ is rolled.\n<\/ul>\n<p>This makes sense; if your hand is strong you should call, if your hand is weak you should fold, and if you&#8217;re in no-man&#8217;s land, then you should randomize between the two. Why might we expect there to be multiple viable strategies? When Player A raises using the optimal strategy, they either have a strong hand with $\\{5,6\\}$, or a very weak hand with $\\{1\\}$. 
Given this information, if Player B rolls a $3$, it&#8217;s no better or worse than rolling a $2$! Among these infinitely many viable strategies, each has the same expected payoff so it doesn&#8217;t matter which one is chosen.<\/p>\n<p>When both players play optimally, Player A is expected to win $\\tfrac{5}{54}$ dollars, or about <strong>9.26 cents<\/strong> on average. So this is what Player A should pay to Player B beforehand in order to make the game fair.<\/p>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>Another game theory problem from the Riddler. This game is a simplified version of poker, but captures some interesting behaviors! Baby poker is played by two players, each holding a single die in a cup. The game starts with each player anteing \\$1. Then both shake their die, roll it, and look at their own &hellip; <a href=\"https:\/\/laurentlessard.com\/bookproofs\/baby-poker\/\" class=\"more-link\">Continue reading<span class=\"screen-reader-text\"> &#8220;Baby poker&#8221;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":1746,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"om_disable_all_campaigns":false,"_monsterinsights_skip_tracking":false,"_monsterinsights_sitenote_active":false,"_monsterinsights_sitenote_note":"","_monsterinsights_sitenote_category":0,"_uf_show_specific_survey":0,"_uf_disable_surveys":false,"footnotes":""},"categories":[7],"tags":[33,27,8,2],"class_list":["post-1726","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-riddler","tag-game-theory","tag-linear-programming","tag-probability","tag-riddler"],"aioseo_notices":[],"_links":{"self":[{"href":"https:\/\/laurentlessard.com\/bookproofs\/wp-json\/wp\/v2\/posts\/1726","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/laurentlessard.com\/bookproofs\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/laurentlessard.com\/bookproofs\/wp-json\/
wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/laurentlessard.com\/bookproofs\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/laurentlessard.com\/bookproofs\/wp-json\/wp\/v2\/comments?post=1726"}],"version-history":[{"count":21,"href":"https:\/\/laurentlessard.com\/bookproofs\/wp-json\/wp\/v2\/posts\/1726\/revisions"}],"predecessor-version":[{"id":2236,"href":"https:\/\/laurentlessard.com\/bookproofs\/wp-json\/wp\/v2\/posts\/1726\/revisions\/2236"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/laurentlessard.com\/bookproofs\/wp-json\/wp\/v2\/media\/1746"}],"wp:attachment":[{"href":"https:\/\/laurentlessard.com\/bookproofs\/wp-json\/wp\/v2\/media?parent=1726"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/laurentlessard.com\/bookproofs\/wp-json\/wp\/v2\/categories?post=1726"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/laurentlessard.com\/bookproofs\/wp-json\/wp\/v2\/tags?post=1726"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}