Eight of your nine batters are “pure contact” hitters. One-third of the time, each of them gets a single, advancing any runners already on base by exactly one base. (The only way to score is with a single with a runner on 3rd). The other two-thirds of the time, they record an out, and no runners advance to the next base. Your ninth batter is the slugger. One-tenth of the time, he hits a home run. But the remaining nine-tenths of the time, he strikes out. Your goal is to score as many runs as possible, on average, in the first inning. Where in your lineup (first, second, third, etc.) should you place your home run slugger?
Extra Credit: Instead of scoring as many runs as possible in the first inning, you now want to score as many runs as possible over the course of nine innings. What’s more, instead of just having one home run slugger, you now have two sluggers in your lineup. The other seven batters remain pure contact hitters. Where in the lineup should you place your two sluggers to maximize the average number of runs scored over nine innings?
We will attack this problem using dynamic programming. Rather than solving for the average number of runs for one particular configuration (of outs, runners on base, current batter, etc.), we will solve for all possible configurations! To this end, define the following:
\[
V(o,r,b,i) = \left\{
\begin{array}{l}
\text{Expected number of additional runs scored}\\
\text{by the end of the game, given that we are}\\
\text{currently in the $i^\text{th}$ inning with $o$ outs,}\\
\text{$r$ runners on base, and the $b^\text{th}$ batter up.}
\end{array}\right\}
\]
Each index can only take on certain values. Define the sets:
\begin{align}
O &= \{0,1,2,3\} && \text{(number of outs)} \\
R &= \{0,1,2,3\} && \text{(runners on base)} \\
B &= \{1,2,\dots,9\} && \text{(batter currently up)} \\
I &= \{1,2,\dots,9\} && \text{(current inning)}
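\end{align}
Therefore there are $4\times 4\times 9\times 9 = 1296$ possible configurations we must solve for. Tracking only the number of runners is enough here: singles advance every runner by exactly one base and home runs clear the bases, so $r$ runners always occupy first base through the $r^\text{th}$ base. Next, some subset of the batters are the sluggers. Let this subset be $S \subseteq B$. The remaining batters are the pure contact hitters, which we denote $\bar S$. Therefore $S\cap \bar S = \emptyset$ and $S \cup \bar S = B$.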
Finally, define $p$ to be the probability that a pure contact hitter gets a single (their on-base average), and $q$ to be the probability that a slugger hits a home run. In the problem statement, we have $p=\frac{1}{3}$ and $q=\frac{1}{10}$.
We can now write equations that determine how the various configurations are related to each other:
- When we have 3 outs, we clear the bases and advance to the next inning, or end the game if we are in the 9th inning.
\begin{align}
V(3,r,b,i) &= V(0,0,b,i+1) && \forall r\in R, b\in B, i\in\{1,\dots,8\}\\
V(3,r,b,9) &= 0 && \forall r\in R, b\in B
\end{align}
This is a total of $4\times 9\times 9 = 324$ equations.
- When a pure contact hitter is up to bat, they either hit a single, advancing every runner by one base (and scoring a run if the bases are loaded), or they record an out.
\begin{align}
V(o,r,b,i) &= p V(o,r+1,b+1,i) + (1-p) V(o+1,r,b+1,i) && \forall o\in \{0,1,2\}, r\in \{0,1,2\}, b\in \bar S, i\in I \\
V(o,3,b,i) &= p (1+V(o,3,b+1,i)) + (1-p) V(o+1,3,b+1,i) && \forall o\in \{0,1,2\}, b\in \bar S, i \in I
\end{align}
This is a total of $3\times 4\times 9\times |\bar S| = 108|\bar S|$ equations.
- When a slugger is up to bat, they either hit a home run (and clear the bases), or they strike out.
\begin{align}
V(o,r,b,i) &= q (r+1+V(o,0,b+1,i)) + (1-q) V(o+1,r,b+1,i) && \forall o\in\{0,1,2\}, r\in R, b\in S, i\in I
\end{align}
This is a total of $3\times 4\times 9\times |S| = 108|S|$ equations.
In all the equations above, when we write “$b+1$”, we mean that if we get to the end of the order, we should cycle back to the top of the order.
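Written out explicitly, the wrap-around rule looks like this (the symbol $\operatorname{next}$ is introduced here only for illustration and is not used elsewhere in this post):
\[
\operatorname{next}(b) = (b \bmod 9) + 1,
\qquad\text{so}\quad \operatorname{next}(1)=2,\ \dots,\ \operatorname{next}(8)=9,\ \operatorname{next}(9)=1.
\]
Every “$b+1$” in the equations can be read as $\operatorname{next}(b)$.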
This adds up to a total of $324+108|\bar S|+108|S| = 1296$ equations. We have the same number of equations as variables, which tells us that we accounted for everything!
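The recurrences above can also be solved numerically. Here is a minimal Python sketch (not the Mathematica code used for this post) that starts from $V=0$ and repeatedly applies the equations; after $k$ sweeps, each entry holds the expected number of runs over the next $k$ plate appearances, which converges quickly because outs pile up. The names `expected_runs` and `sweeps` are illustrative choices, and the sketch assumes batter 1 always leads off, so the slugger positions are passed in directly.

```python
from itertools import product

def expected_runs(sluggers, p=1/3, q=1/10, innings=1, sweeps=200):
    """Approximate V(0, 0, 1, 1) by sweeping the recurrences toward a fixed point."""
    def nxt(b):
        return b % 9 + 1  # batter 9 is followed by batter 1

    # Only o in {0, 1, 2} is stored; states with 3 outs are resolved in look().
    states = list(product(range(3), range(4), range(1, 10), range(1, innings + 1)))
    V = dict.fromkeys(states, 0.0)

    def look(o, r, b, i):
        # Three outs: clear the bases and start the next inning, or end the game.
        if o == 3:
            return V[0, 0, b, i + 1] if i < innings else 0.0
        return V[o, r, b, i]

    for _ in range(sweeps):
        new = {}
        for o, r, b, i in states:
            out = look(o + 1, r, nxt(b), i)
            if b in sluggers:
                # Home run: the slugger and all r runners score, bases are cleared.
                hit = r + 1 + look(o, 0, nxt(b), i)
                new[o, r, b, i] = q * hit + (1 - q) * out
            elif r < 3:
                # Single with an open base: nobody scores, one more runner.
                new[o, r, b, i] = p * look(o, r + 1, nxt(b), i) + (1 - p) * out
            else:
                # Single with the bases loaded: one run scores.
                new[o, r, b, i] = p * (1 + look(o, 3, nxt(b), i)) + (1 - p) * out
        V = new
    return V[0, 0, 1, 1]

# One inning, one slugger: try each of the nine positions.
table = {k: expected_runs({k}) for k in range(1, 10)}
for k, v in table.items():
    print(k, round(v, 6))
```

Since this iterates rather than solving the linear system exactly, the output is a floating-point approximation; the default of 200 sweeps is an assumption that is generous for one inning and should also be adequate for nine.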
I used Mathematica to input all the equations programmatically and solve them, and here is what I found.
First part: one inning, one slugger
When looking at a single inning with a single slugger, we pick $I = \{1\}$ and choose, for example, $S = \{9\}$. We then ask which batter should bat first to maximize the expected score (if batter $b$ leads off, the slugger bats in position $10-b$). This amounts to seeing which of
\[
V(0,0,b,1),\qquad b=1,2,\dots,9
\]
is largest. Using the hitting percentages from the problem ($p=\frac{1}{3}$ and $q=\frac{1}{10}$), we obtain the following bar chart:

So it is best to put the slugger fourth in the batting order. This leads to an expected $\frac{3051922249868633}{12708748019768805} \approx 0.240143$ runs in the first inning. This is only narrowly better than putting the slugger third, which leads to an average of $0.23937$ runs. Overall, here is what we find:
\[
\begin{array}{r|ccccccccc}
\text{slugger position} & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 \\ \hline
\text{expected runs} & 0.184128 & 0.210732 & 0.23937 & 0.240143 & 0.202726 & 0.156235 & 0.168336 & 0.174246 & 0.176961
\end{array}
\]
By changing $p$ and $q$, we get different solutions. For example, if we make the pure contact hitters better, say $p=\frac{2}{3}$, but keep the sluggers' home run probability at $q=\frac{1}{10}$, we obtain:

So in this case, the slugger is best placed last, presumably because they get out so much more frequently than the pure contact hitters.
Second part: nine innings, two sluggers
With nine innings and two sluggers, I iterated through all possible two-element subsets $S \subseteq B$, and then computed $V(0,0,1,1)$ for each case. This led to the following picture:

This plot shows the expected number of runs after 9 innings as a function of the two sluggers’ positions in the batting order. The best outcome occurs when the sluggers are 3rd and 4th in the batting order. This is the yellow bar in the (3,4) coordinate, which corresponds to an average of 1.99771 runs per game. Note that the y-axis has a relatively small range; it only ranges from 1.85 to 2.0. This goes to show that over the course of 9 innings, the batting order does not make that much difference. Here is a table showing all the results:

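For readers who want to reproduce the two-slugger search without Mathematica, here is a rough Python sketch that uses a forward pass instead of solving for $V$: it pushes the probability of each (inning, outs, runners) state through one plate appearance at a time and adds up the expected runs as they score. This is a swapped-in technique, not the method used for the plot above, and the names `runs_forward` and `best_pair` are hypothetical. It assumes batter 1 leads off the game and that each inning picks up where the previous one left off.

```python
from collections import defaultdict
from itertools import combinations

def runs_forward(sluggers, p=1/3, q=1/10, innings=9, tol=1e-15):
    """Expected runs over `innings` innings with batter 1 leading off the game."""
    dist = {(1, 0, 0): 1.0}  # (inning, outs, runners) -> probability
    batter, total = 1, 0.0
    while sum(dist.values()) > tol:  # stop once the game is almost surely over
        slug = batter in sluggers
        hit = q if slug else p
        new = defaultdict(float)
        for (i, o, r), mass in dist.items():
            if slug:
                # Home run: r + 1 runs score and the bases are cleared.
                total += mass * hit * (r + 1)
                new[i, o, 0] += mass * hit
            else:
                # Single: one run scores only if the bases were loaded.
                if r == 3:
                    total += mass * hit
                new[i, o, min(r + 1, 3)] += mass * hit
            # Out: the third out clears the bases and starts the next inning;
            # after the last inning the probability mass simply drops out.
            if o < 2:
                new[i, o + 1, r] += mass * (1 - hit)
            elif i < innings:
                new[i + 1, 0, 0] += mass * (1 - hit)
        dist, batter = new, batter % 9 + 1
    return total

def best_pair(**kwargs):
    """Score every two-element subset of lineup positions 1..9."""
    scores = {pair: runs_forward(set(pair), **kwargs)
              for pair in combinations(range(1, 10), 2)}
    return max(scores, key=scores.get), scores

pair, scores = best_pair()
print(pair, round(scores[pair], 5))
```

The result can be compared against the (3,4) optimum of about 1.99771 runs reported above. Because the batter is the same for every state at a given plate appearance, the state only needs the inning, outs, and runners, which keeps the pass cheap enough to run for all 36 pairs.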
Code
To solve this problem analytically, I used Mathematica to specify each of the equations above, and then called the “Solve” function on that system of equations. Each of the equations can be collected into a list using the “Table” function, which makes it easy to enumerate the equations without having to write each one down manually. You can view my Mathematica code here, or in the window below.
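If you do not have Mathematica, the same idea (enumerate the equations in a loop, then hand them to an exact solver) can be sketched in Python with the standard `fractions` module. This is only an illustration under my reading of the equations, not a translation of the Mathematica notebook: the helper names `solve` and `exact_runs` are made up, it handles a single inning only, and it solves one out-count at a time (36 unknowns each) to keep the exact arithmetic fast.

```python
from fractions import Fraction

def solve(A, rhs):
    """Gauss-Jordan elimination over exact rationals; returns x with A x = rhs."""
    n = len(A)
    M = [row[:] + [c] for row, c in zip(A, rhs)]
    for col in range(n):
        piv = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[piv] = M[piv], M[col]
        M[col] = [x / M[col][col] for x in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                f = M[r][col]
                M[r] = [x - f * y for x, y in zip(M[r], M[col])]
    return [M[r][n] for r in range(n)]

def exact_runs(sluggers, p=Fraction(1, 3), q=Fraction(1, 10)):
    """Exact V(0, 0, 1, 1) for one inning, with batter 1 leading off."""
    keys = [(r, b) for r in range(4) for b in range(1, 10)]
    idx = {k: n for n, k in enumerate(keys)}
    nxt = {k: Fraction(0) for k in keys}  # V(3, r, b, 1) = 0: the inning is over
    for o in (2, 1, 0):
        A = [[Fraction(0)] * len(keys) for _ in keys]
        rhs = []
        for (r, b), row in zip(keys, A):
            b1 = b % 9 + 1  # next batter, wrapping from 9 back to 1
            row[idx[r, b]] += 1
            if b in sluggers:
                # V(o,r,b) - q V(o,0,b+1) = q (r+1) + (1-q) V(o+1,r,b+1)
                row[idx[0, b1]] -= q
                rhs.append(q * (r + 1) + (1 - q) * nxt[r, b1])
            else:
                # V(o,r,b) - p V(o,r',b+1) = p [r == 3] + (1-p) V(o+1,r,b+1)
                row[idx[min(r + 1, 3), b1]] -= p
                rhs.append(p * (1 if r == 3 else 0) + (1 - p) * nxt[r, b1])
        nxt = dict(zip(keys, solve(A, rhs)))
    return nxt[0, 1]

value = exact_runs({4})
print(value, float(value))
```

With the slugger batting fourth, the printed fraction can be checked against the exact value quoted in the first part of this post.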
Are you sure you did not put in p=2/3 instead of 1/3? The problem stated, “One-third of the time, each of them gets a single, advancing any runners already on base by exactly one base.”
I could not come up with an analytic solution but wrote up a Monte Carlo simulation of an inning.
[Code](https://abiswas3.github.io/notes/html_files/2023-08-21-Bazball.html)
The answer seems to be to put the slugger last when the singles happen with probability 2/3 but put the slugger fourth if the singles happen with probability 1/3.
I could be wrong as this code was written on a train back home from Warwick.
I think your code is correct. I found the bug in my code (a small off-by-one error which dramatically changed the final result…). I fixed it now, and my analytical results match your numerical results. Thanks!
Hi Laurent, can you share your Mathematica code? I would like to understand how to program equations like this.
Thanks!
I was about to ask the same thing — I get how to do the Markov stuff in Mathematica somewhat, but I don’t know how to keep track of the number of runs scored without adding it as a separate parameter of V.
Also: thanks for all these writeups, Laurent!
I just posted the code — can’t find a way to embed it nicely because of the constraints of my WordPress theme, but I added a link to Mathematica Cloud where you should be able to view it properly!
The code is now posted, thanks Laurent!