footballcommentary.com

 August 13, 2004

# A Dynamic Programming Model For Baseball

- Brief Description of the Model
- Backward Induction Within A Half Inning
- The Sequence of Calculations
- Estimation of Probabilities
- Numerical Example: Identical Players
- Numerical Example: Non-Identical Players
- Other Applications of the Model

In this article we present a dynamic programming model for analyzing strategy in a baseball game. In contrast to the more prevalent approach based on expected runs, we assume that teams try to maximize their probability of winning the game. Teams choose optimally whether to attempt to steal, attempt to sacrifice bunt, or intentionally walk an opposing batter. They also choose optimally, when the situation arises, whether to take a force out at home rather than a double play that allows a run to score. The strategy prescriptions of the model are different from those that emerge from an expected-runs analysis. In particular, optimal strategy depends on the inning and the score.

## Brief Description of the Model

At any plate appearance, the opposing team can choose to walk the batter intentionally. If they don't, and there is a runner at first base, he has the option of attempting to steal second. Each runner has a particular probability of success if he attempts to steal. Alternatively, with a runner at first or runners at first and second, and fewer than two outs, the batter has the option of attempting a sacrifice bunt. Each batter has a particular probability of successfully advancing the runner (or runners) if he attempts to sacrifice.

Otherwise, the batter hits away. In this case there are eleven possible outcomes in the model:

1. Strike out or short fly out. Base runners cannot advance.
2. Long fly out. A runner on third can tag up and score.
3. Hard ground out. This permits a double play if there is a man on first.
4. Soft ground out. The batter is out at first (or there is a fielder's choice), but there can be no double play.
5. Unintentional walk or hit by pitch.
6. Short single. A runner on first advances to second, other runners score.
7. Long single. A runner on first advances to third.
8. Short double. A runner on first advances to third.
9. Long double. A runner on first scores.
10. Triple.
11. Home run.

For each player, we specify the probabilities for these eleven outcomes. Thus, in the model, a player is completely characterized by thirteen numbers: his probabilities for the eleven outcomes if he hits away, his probability of successfully executing a sacrifice bunt, and his probability of successfully stealing second base.
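As a rough illustration of this thirteen-number description, a player could be represented as follows. This is only a sketch: the class name `Player`, the field names, and the rounding tolerance are assumptions made for the example and are not taken from the article's source code. The sample values are the identical-player probabilities used later in the article (Table 4, with a 0.9 bunt and 0.5 steal success probability).

```python
from dataclasses import dataclass
from typing import Tuple

# The eleven hit-away outcomes, in the order listed above.
OUTCOMES = (
    "strike out or short fly out", "long fly out", "hard ground out",
    "soft ground out", "unintentional walk or hit by pitch",
    "short single", "long single", "short double", "long double",
    "triple", "home run",
)

@dataclass(frozen=True)
class Player:
    """Hypothetical container for the thirteen numbers describing a player."""
    hit_away: Tuple[float, ...]   # eleven outcome probabilities
    bunt_success: float           # P(sacrifice bunt advances the runner)
    steal_success: float          # P(steal of second succeeds)

    def __post_init__(self):
        if len(self.hit_away) != len(OUTCOMES):
            raise ValueError("expected eleven hit-away probabilities")
        # Assumed tolerance: published tables are rounded to three decimals.
        if abs(sum(self.hit_away) - 1.0) > 5e-3:
            raise ValueError("hit-away probabilities should sum to 1")
        for p in (*self.hit_away, self.bunt_success, self.steal_success):
            if not 0.0 <= p <= 1.0:
                raise ValueError("probabilities must lie in [0, 1]")

# The identical-player parameters from the article's first numerical example.
average_player = Player(
    hit_away=(0.293, 0.125, 0.12, 0.12, 0.087, 0.0875, 0.0875,
              0.0225, 0.0225, 0.007, 0.028),
    bunt_success=0.9,
    steal_success=0.5,
)
```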

There is no need to have separate outcomes for errors. For example, a ground ball that the shortstop throws into the dugout, allowing the batter to reach second base, can be regarded as a short double.

In the model, the various probabilities associated with each player are constant throughout the game. Certainly it would be trivial to allow different probabilities in different innings. Such a change in probabilities could be regarded as a player substitution, but as the model is currently written, it still wouldn't provide for substitutions to be made optimally, i.e., as a function of the situation. This deficiency is less of a problem in the American League, with its Designated Hitter rule, than in the National League.

Hirotsu and Wright have built a model that allows for a limited amount of player substitution to be made optimally. They assume that each batter is characterized by two sets of probabilities, one applicable against right-handed pitchers and one applicable against left-handed pitchers. A team can choose to bring in a right-handed reliever to replace a left-handed pitcher, or vice-versa, and the batting team can bring in a right- or left-handed pinch hitter. Hirotsu and Wright do not model intentional walks, steals, or sacrifice bunts. In addition, they assume that when a batter hits away, there are only six possible outcomes rather than eleven; using our terminology these are strike out, walk, short single, short double, triple, and home run. Of course, either our model or theirs could be expanded to incorporate the features of the other, at the cost of considerably increased complexity and computational time.

## Backward Induction Within A Half Inning

In this section we describe the key component of the model, the computation of the probability of winning the game, and the optimal strategy, at each state that can arise in a particular half inning. Those who want to examine all the details and assumptions of the model should consult the MATLAB® source code inning.m, which should be understandable even to many people with little programming experience.

Suppose our team is at bat. When we make our third out to end the half inning, our probability of winning the game will depend on the inning, the score differential, which batter leads off for the opponents, and who will lead off for us at our next at bat. Let W(d,i,j) denote our probability of winning at the end of the particular half inning in question, where d is the score differential (our runs minus the opponents' runs), batter i will be leading off for the opponents in the next half inning, and batter j will be leading off for us when we next come to bat. (Generally batter j will be the man who was on deck when the third out was made. However, if the final out came when a runner was caught stealing, batter j will be the same man who was batting when the inning ended.)

Table 1

| Situation | Notation |
|---|---|
| Bases empty | b0 |
| Man on first | b1 |
| Man on second | b2 |
| Man on third | b3 |
| First and second | b12 |
| First and third | b13 |
| Second and third | b23 |
| Bases loaded | b123 |

Table 2

| Situation | Notation |
|---|---|
| No outs | outs0 |
| One out | outs1 |
| Two outs | outs2 |

The win probabilities W(d,i,j) comprise the boundary conditions for the half inning in question. We will defer until the next section the question of how those boundary conditions are determined. In this section, taking the boundary conditions as given, we will describe how to calculate our probability of winning the game at each state that can occur during our half inning at bat. The state is characterized by six variables. The first is the player who is at bat, denoted h (for "hitter"). The second is the score differential d. The third variable is the number of outs; we will use the notation shown in Table 2. The fourth is the occupation of the bases, denoted b. This variable has eight possible values, as shown in Table 1. (In fact, Table 1 orders the possibilities from "least advanced" to "most advanced." By this we mean that if the transition from one state to the next leaves the score and the number of outs constant, it must result in a more advanced occupation of the bases.)

The fifth variable required to characterize a state is the player who is on first base, denoted f. (This variable is needed because the probability of successfully stealing second base varies across players.) The final state variable is the player L for the opponents who will be leading off to start the next half inning. We will denote by V(h,d,outs,b,f,L) our probability of winning the game at state (h,d,outs,b,f,L).

Our intention is to use backward induction to compute our probability of winning the game at every state that can arise during our half inning at bat, using the values W(d,i,j) as boundary conditions. However, an additional condition is needed, because a half inning can in principle go on forever. We will therefore impose the condition that if our lead ever reaches dmax, we win the game for certain. Formally, we assume that

V(h,dmax,outs,b,f,L) = 1

for all values of h, outs, b, f, and L. For the results presented in this article we set dmax = 25. This truncation has no effect on the results of the model. (Nor would it have any effect if it were an actual rule: no major league team has ever won after trailing by more than 12 runs.)

Allowing leads only as high as 25 runs, there are 51 possible score differentials in the model. It follows that within a half inning, there are 892,296 possible states: 9 batters, times 51 score differentials, times 3 outs, times 8 occupations of the bases, times 9 possible men on first, times 9 possible leadoff batters for the other team. (Actually, some of these "states" can't really arise: for a given batter, there are only three players who could possibly be at first base.)
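The state count can be checked directly. The snippet below is a small sketch; the variable names are chosen for this example, and the differential is assumed to run from −25 to 25, which gives the 51 values mentioned above.

```python
D_MAX = 25  # truncation of the lead, as in the text

batters = range(1, 10)                   # 9 possible hitters
differentials = range(-D_MAX, D_MAX + 1) # 51 score differentials (assumed -25..25)
outs = range(3)                          # 0, 1, or 2 outs
bases = ("b0", "b1", "b2", "b3", "b12", "b13", "b23", "b123")  # Table 1
runners_on_first = range(1, 10)          # 9 possible men on first
opposing_leadoff = range(1, 10)          # 9 possible leadoff batters

n_states = (len(batters) * len(differentials) * len(outs) * len(bases)
            * len(runners_on_first) * len(opposing_leadoff))
print(n_states)  # 892296
```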

The principle of backward induction is that, given the state and the strategies adopted by the teams, our probability of winning the game is the weighted average of our probabilities of winning the game at the various states that can occur next, weighted by their probabilities of occurrence. Within each half inning, we have to compute the win probability and optimal strategy at each of the 892,296 states. As an illustration, we will lay out the calculations for the 37,179 states in which there is one out and a man on first. Specifically, we will show how to calculate

V(h,d,outs1,b1,f,L),

our probability of winning the game if player h of our team is at bat, we lead by d runs, there is one out, there is a man on first, player f is the runner at first, and player L will be leading off for the other team to start the next half inning. We assume we have already computed our probability of winning at every state that comes "later" than this state. To be precise, we assume that we have already calculated our probability of winning the game at all states in which our lead exceeds d; and at all states in which our lead is d but there is more than one out; and at all states in which our lead is d and there is one out, but in which the occupation of the bases is more advanced than just a man on first.
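One way to picture this ordering is as three nested loops over the score differential, the number of outs, and the occupation of the bases. The sketch below is an assumption about how such a sweep could be arranged, not a transcription of inning.m; outs are encoded as the integers 0 to 2 and the bases by the labels of Table 1.

```python
# Table 1 order, from "least advanced" to "most advanced".
BASES = ("b0", "b1", "b2", "b3", "b12", "b13", "b23", "b123")

def evaluation_order(d_max=25):
    """Yield (d, outs, bases) triples so that "later" states come first.

    Larger leads are visited before smaller ones, more outs before fewer,
    and more advanced base occupations before less advanced ones.
    """
    for d in range(d_max, -d_max - 1, -1):
        for outs in (2, 1, 0):
            for b in reversed(BASES):
                yield d, outs, b

order = list(evaluation_order())
position = {state: idx for idx, state in enumerate(order)}

# Every state that is "later" than (d=0, one out, man on first)
# appears earlier in the sweep.
target = position[(0, 1, "b1")]
print(position[(1, 0, "b0")] < target)   # larger lead
print(position[(0, 2, "b0")] < target)   # same lead, more outs
print(position[(0, 1, "b12")] < target)  # same lead and outs, bases more advanced
```

For each triple visited, the remaining state variables (batter, runner on first, and opposing leadoff batter) can be looped over in any order, since the description above places no ordering requirement on them.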

Let k denote the on-deck hitter; this is batter 1 if h=9, and batter h+1 otherwise. Let n (for "nobody") be any number from 1 to 9. This is an arbitrary input for the fifth argument of the function V, for situations in which first base is unoccupied. (In the source code we set n=1.)

If batter h is intentionally walked, the on-deck hitter comes to the plate with no change in the score, one out, and men on first and second. In addition, the runner at first is now player h. So if the opponents choose an intentional walk, the next state will be (k,d,outs1,b12,h,L) and our probability of winning the game is

walk = V(k,d,outs1,b12,h,L).

Otherwise, we have three options: the runner at first can attempt to steal, or the batter can attempt a sacrifice bunt, or the batter can hit away. Let βf denote the probability of a successful steal by player f. Then if he attempts to steal, our probability of winning the game is

steal = βf V(h,d,outs1,b2,n,L) + (1 − βf) V(h,d,outs2,b0,n,L).

(Notice that the number of outs and the occupation of the bases depend on the outcome of the attempt, but player h remains at bat regardless.) If batter h attempts a sacrifice bunt, we assume it succeeds with probability αh, and hence our probability of winning the game is

bunt = αh V(k,d,outs2,b2,n,L) + (1 − αh) V(k,d,outs2,b1,h,L).

(A successful sacrifice moves the runner to second. Otherwise player f is forced out at second and player h ends up on first.)

Table 3

| | Outcome | P( Win \| Outcome ) |
|---|---|---|
| 1 | Strike out or short fly out | V(k,d,outs2,b1,f,L) |
| 2 | Long fly out | V(k,d,outs2,b1,f,L) |
| 3 | Hard ground out | W(d,L,k) |
| 4 | Soft ground out | V(k,d,outs2,b1,h,L) |
| 5 | Unintentional walk or hit by pitch | V(k,d,outs1,b12,h,L) |
| 6 | Short single | V(k,d,outs1,b12,h,L) |
| 7 | Long single | V(k,d,outs1,b13,h,L) |
| 8 | Short double | V(k,d,outs1,b23,n,L) |
| 9 | Long double | V(k,d+1,outs1,b2,n,L) |
| 10 | Triple | V(k,d+1,outs1,b3,n,L) |
| 11 | Home run | V(k,d+2,outs1,b0,n,L) |

If batter h hits away, there are 11 possible outcomes, which we described earlier, and which are listed in Table 3. Let $p_{h1},\dots,p_{h11}$ denote the probabilities of these 11 outcomes, for player h. The conditional probability that we win the game, conditional on one of these outcomes, depends on the resulting state, and is shown in the right-hand column of Table 3. For example, a long double scores a run and leaves a man on second, and a hard-ground-ball out results in an inning-ending double play.

If batter h hits away, our probability of winning the game is the weighted average of the conditional win probabilities corresponding to the 11 possible outcomes, weighted by their probabilities of occurrence:

$$\text{hitaway} = \sum_{m=1}^{11} p_{hm}\, P(\text{Win} \mid \text{Outcome}_m).$$

We will of course choose the largest of "steal," "bunt," and "hitaway." However, if the largest of these exceeds "walk," the opponents will intentionally walk batter h. Hence our probability of winning the game at state (h,d,outs1,b1,f,L) is

V(h,d,outs1,b1,f,L) = min( walk, max( steal, bunt, hitaway ) ).
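The calculation for this one family of states can be sketched in a few lines. The function below is an illustration only and is not the article's inning.m: the names `on_deck` and `value_one_out_man_on_first`, the tuple encoding of a state, and the use of integers 0 to 2 for the number of outs are assumptions made here. `V` and `W` are supplied by the caller as functions returning already-computed win probabilities, and `V` is assumed to return 1 whenever the lead reaches dmax.

```python
def on_deck(h):
    """Batter who follows h in the order (batter 1 follows batter 9)."""
    return 1 if h == 9 else h + 1

def value_one_out_man_on_first(V, W, h, d, f, L, p, alpha, beta, n=1):
    """Win probability and chosen action at state (h, d, outs1, b1, f, L).

    V(state) -> win probability at a later state (h, d, outs, bases, f, L)
    W(d, i, j) -> boundary condition at the end of the half inning
    p          -> batter h's eleven hit-away probabilities
    alpha      -> batter h's sacrifice-bunt success probability
    beta       -> runner f's steal success probability
    n          -> arbitrary placeholder when first base is empty
    """
    k = on_deck(h)
    walk = V((k, d, 1, "b12", h, L))
    steal = beta * V((h, d, 1, "b2", n, L)) + (1 - beta) * V((h, d, 2, "b0", n, L))
    bunt = alpha * V((k, d, 2, "b2", n, L)) + (1 - alpha) * V((k, d, 2, "b1", h, L))
    # Conditional win probabilities, in the order of Table 3.
    conditional = [
        V((k, d, 2, "b1", f, L)),       # 1 strike out or short fly out
        V((k, d, 2, "b1", f, L)),       # 2 long fly out
        W(d, L, k),                     # 3 hard ground out (double play)
        V((k, d, 2, "b1", h, L)),       # 4 soft ground out
        V((k, d, 1, "b12", h, L)),      # 5 unintentional walk or hit by pitch
        V((k, d, 1, "b12", h, L)),      # 6 short single
        V((k, d, 1, "b13", h, L)),      # 7 long single
        V((k, d, 1, "b23", n, L)),      # 8 short double
        V((k, d + 1, 1, "b2", n, L)),   # 9 long double
        V((k, d + 1, 1, "b3", n, L)),   # 10 triple
        V((k, d + 2, 1, "b0", n, L)),   # 11 home run
    ]
    hitaway = sum(pm * v for pm, v in zip(p, conditional))
    options = {"steal": steal, "bunt": bunt, "hitaway": hitaway}
    best = max(options, key=options.get)
    # The defense walks the batter whenever that lowers our win probability.
    if walk < options[best]:
        return walk, "intentional walk"
    return options[best], best
```

Returning the chosen action alongside the value is a convenience of this sketch; it makes it easy to tabulate the situations in which each strategy is optimal.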

## The Sequence of Calculations

We now move from a single half inning to consideration of the game as a whole. Let g index the half innings of the game. (For example, g = 9 refers to the top of the 5th inning.) Let Vg and Wg be the win probabilities and boundary conditions for half inning g.

We work backwards from the end of the game. As described below, once we have computed the win probabilities for all states in half inning g, we can determine the boundary conditions for half inning g − 1. We then compute the win probabilities for all states in half inning g − 1, and so on. This process continues until we have computed the win probabilities in the top of the 1st inning. The details of the calculations are contained in the MATLAB® source code bb.m, which calls the programs inning.m and probsetup.m.

Because the model doesn't take into account factors like player fatigue and substitutions, any extra inning is formally identical to the 9th inning. Specifically, the win probability and optimal strategy at a state (h,d,outs,b,f,L) in a particular half of an extra inning are identical to the win probability and optimal strategy in the corresponding state in the same half of the 9th inning. Consequently, the bottom of the 9th inning is the chronologically latest half inning for which we require win probabilities.

Nevertheless, the easiest way to get the boundary conditions for the bottom of the 9th inning is to model extra innings. The approach we adopted is to assume that if the game is still tied after Imax innings, it is decided by a coin flip. This assumption provides a starting point for the backward induction, and creates no loss of accuracy provided Imax is large enough that the computed win probabilities for the 9th inning closely approximate their limits as Imax → ∞.

Suppose g = 2 × Imax, so we are dealing with the bottom of inning Imax. At the end of that half inning, the home team wins if it's ahead, loses if it's behind, and faces a coin flip if the game is still tied. The boundary conditions are therefore that Wg(d,i,j) equals 1, 0, or 0.5 according to whether d is greater than, less than, or equal to zero. Using these boundary conditions, we apply the backward induction described in the previous section to compute the home team's probability of winning the game at each state in the bottom half of inning Imax. These probabilities are the values of the function Vg(h,d,outs,b,f,L). In particular,

Vg(h,d,outs0,b0,n,L)

is the probability that the home team wins the game, at the start of the bottom of inning Imax, if batter h is leading off and the score differential is d. (The variable L doesn't matter in the bottom of inning Imax.) It follows that

(Boundary Conditions Equation)

$$W_{g-1}(d,i,j) = 1 - V_g(i,\,-d,\,\text{outs0},\,\text{b0},\,n,\,j)$$

is the probability that the visitors win the game, if they lead by d runs at the end of the top of inning Imax. The function Wg-1(d,i,j) is therefore the boundary condition for the top of inning Imax.

In fact, the Boundary Conditions Equation shown above is the correct form for the boundary conditions if g − 1 is the top half of any inning, or even if g − 1 is the bottom half of any inning chronologically prior to the 9th. For the bottom half of inning Imax, we described the boundary conditions earlier. For the bottom half of innings 9 through Imax − 1, we have to take into account that the game ends if the score is not still tied. Therefore, for those half innings, Wg-1(d,i,j) is given by the Boundary Conditions Equation when d=0, but is 0 when d < 0 and 1 when d > 0. Details can be found in the source code bb.m.
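The three cases can be gathered into one function. The following is a sketch under stated assumptions rather than a reproduction of bb.m: the function name and argument order are hypothetical, half innings are numbered from 1 (odd for the top, even for the bottom), outs are encoded as integers, and `V_next` stands for the already-computed win probabilities of the following half inning. The value of Imax is left as a parameter because the text does not state it.

```python
def boundary_condition(g, d, i, j, V_next, i_max, n=1):
    """Batting team's win probability at the end of half inning g.

    d       -> batting team's lead when the half inning ends
    i       -> opponents' leadoff batter in the next half inning
    j       -> batting team's leadoff batter when it next comes to bat
    V_next  -> V_next(h, d, outs, bases, f, L) for half inning g + 1
    """
    is_bottom = (g % 2 == 0)
    inning = (g + 1) // 2
    if g == 2 * i_max:
        # End of the last modeled inning: a tie is settled by a coin flip.
        return 1.0 if d > 0 else 0.0 if d < 0 else 0.5
    if is_bottom and inning >= 9 and d != 0:
        # From the 9th inning on, the game ends unless it is still tied.
        return 1.0 if d > 0 else 0.0
    # Boundary Conditions Equation: the opponents bat next, leading by -d.
    return 1.0 - V_next(i, -d, 0, "b0", n, j)
```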

## Estimation of Probabilities

One should never rely exclusively on historical data when estimating the various probabilities associated with the players. However, historical data must surely play a role, particularly for the probabilities of the eleven outcomes when a batter hits away. The Official Web Site for Major League Baseball provides considerable relevant data, including breaking out intentional walks and (since 1999) recording ground-ball outs and fly-ball outs. Those data are not helpful, though, for the more subtle distinctions we require, such as short-versus-long singles, or hard-versus-soft ground ball outs, although complete play-by-play records would be. In addition, even the data that are available don't always paint an accurate picture. For example, a team will sometimes "pitch around" a batter; the almost inevitable walk doesn't show up in the statistics as intentional, but it should be regarded as such for our purposes.

Strictly speaking, the various probabilities in the model should depend on the particular opposing pitcher.

Table 4

| | Outcome | P(Outcome) |
|---|---|---|
| 1 | Strike out or short fly out | 0.293 |
| 2 | Long fly out | 0.125 |
| 3 | Hard ground out | 0.12 |
| 4 | Soft ground out | 0.12 |
| 5 | Unintentional walk or hit by pitch | 0.087 |
| 6 | Short single | 0.0875 |
| 7 | Long single | 0.0875 |
| 8 | Short double | 0.0225 |
| 9 | Long double | 0.0225 |
| 10 | Triple | 0.007 |
| 11 | Home run | 0.028 |

## Numerical Example: Identical Players

In this section we will illustrate some results of the model for a case in which all the players on both teams are identical. Suppose that for every player, the probabilities of the various outcomes of a plate appearance (if he doesn't sacrifice bunt and isn't intentionally walked) are as shown in Table 4. These are of course the probabilities $p_{h1},\dots,p_{h11}$ we referred to earlier; here we assume they are independent of h. It's easy to check that a batter with these probabilities will, in expectation, have a batting average of .279, a slugging percentage of .436, and (if he is never intentionally walked) an on-base percentage of .342.
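The check is a few lines of arithmetic. The sketch below assumes the usual definitions: walks and hit-by-pitches do not count as at bats, and on-base percentage is taken per plate appearance. The function name `batting_line` is chosen for this example.

```python
# Table 4 probabilities, in outcome order 1..11.
TABLE_4 = [0.293, 0.125, 0.12, 0.12, 0.087, 0.0875, 0.0875,
           0.0225, 0.0225, 0.007, 0.028]

def batting_line(p):
    """Return (AVG, OBP, SLG) implied by eleven hit-away probabilities."""
    walk = p[4]
    singles = p[5] + p[6]
    doubles = p[7] + p[8]
    triples, homers = p[9], p[10]
    hits = singles + doubles + triples + homers
    at_bats = 1.0 - walk                  # walks and HBP are not at bats
    total_bases = singles + 2 * doubles + 3 * triples + 4 * homers
    return hits / at_bats, hits + walk, total_bases / at_bats

avg, obp, slg = batting_line(TABLE_4)
print(f"AVG {avg:.3f}  OBP {obp:.3f}  SLG {slg:.3f}")
# AVG 0.279  OBP 0.342  SLG 0.436
```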

We assume in addition that each player has a 0.5 probability of success if he tries to steal second base, and that if he attempts a sacrifice bunt, he has a 0.9 probability of successfully advancing the runner.

Using the incorrect objective function of expected runs, and using these parameters, it turns out that teams never attempt to steal or sacrifice bunt, or intentionally walk a batter; and they never forgo a double play in order to prevent a run from scoring. But in our model, in which the objective function is the probability of winning the game, each of the actions just mentioned is optimal under certain circumstances. Some of these situations are so obvious that they serve mainly as a check of the computer code. For example, if the game is tied in the bottom of the 9th, with the bases loaded and no one out, the visitors prefer a force out at home to a double play that allows a run to score.

To understand the deficiencies of an analysis based on expected runs, consider how a team's probability of winning the game varies as a function of the score differential. Obviously, the probability of winning is an increasing function of the score differential; and in fact an expected-runs analysis implicitly assumes the relationship is linear. But since the probability of winning must lie between zero and one, the function can't be linear. The probability of winning, as a function of the score differential, must become concave for positive differentials (leads) and convex for negative differentials (deficits).

When all the players on both teams are identical, the probability of winning (at the end of a particular half inning) depends only on the score differential. The accompanying figure plots this function for the end of the 6th inning. The vertical axis is the home team's probability of winning, and the horizontal axis is their lead. The shape of this function shows why a sacrifice bunt, which lowers a team's expected runs by reducing the chances of a big inning, might nevertheless be optimal in some cases if the team is already ahead: a big inning doesn't give nearly a proportional increase in the probability of winning the game. It becomes relatively more important to reduce the probability of not scoring any runs at all. Of course, if the team is behind, the argument is reversed. A big inning becomes even more valuable than the expected-runs criterion suggests, so the trailing team should eschew the sacrifice and hit away.
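A toy calculation makes the effect of this curvature concrete. Everything in the sketch below is hypothetical: the logistic curve is only a stand-in for an S-shaped win-probability function (it is not the curve produced by the model), and the two strategies are invented lotteries rather than actual bunt or hit-away outcomes. The "gamble" has the higher expected runs (1.1 versus 1.0), yet it is preferred only by the trailing team.

```python
import math

def win_prob(lead, steepness=0.5):
    """Hypothetical S-shaped win probability as a function of the lead."""
    return 1.0 / (1.0 + math.exp(-steepness * lead))

def compare(lead):
    """Win probability after a sure run versus a riskier two-run gamble."""
    safe = win_prob(lead + 1)                                    # 1 run for certain
    gamble = 0.55 * win_prob(lead + 2) + 0.45 * win_prob(lead)   # 2 runs or none
    return safe, gamble

safe_ahead, gamble_ahead = compare(2)      # leading by 2: concave region
safe_behind, gamble_behind = compare(-2)   # trailing by 2: convex region

print(safe_ahead > gamble_ahead)     # the leader prefers the sure run
print(gamble_behind > safe_behind)   # the trailer prefers the gamble
```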

The model allows us to list all the situations in which it is optimal to attempt to steal or sacrifice bunt, or to intentionally walk a batter. We will do so for a few selected half innings, using the parameters in Table 4. These results are for illustration only. They demonstrate that the optimal strategy in our model depends on the inning and the score, in contrast to the results of an expected-runs analysis. However, in an application one would have to use input probabilities that correspond to the actual players on the two teams.

First consider the bottom of the 9th inning. Suppose the game is tied and there is a man on first. With no one out it's optimal to sacrifice bunt; with one out the batter should hit away; and with two outs it's optimal for the runner to attempt to steal. With the input probabilities we are using, there is no other situation in the bottom of the 9th in which a sacrifice bunt or attempted steal is optimal.

Intentional walks are optimal in more situations. In the bottom of the 9th the visitors should intentionally walk the batter if the game is tied, there is 0 or 1 out, and there is either a man on 2nd, or a man on 3rd, or men on 1st and 3rd, or men on 2nd and 3rd. In addition, if the visitors lead by a run in the bottom of the 9th, they should intentionally walk the batter if there is 0 or 1 out and runners are on 2nd and 3rd. These are the only situations in which an intentional walk is optimal in the bottom of the 9th inning.

Next consider the top of the 8th inning. It turns out that with our input parameters, neither a sacrifice bunt nor an attempted steal is ever optimal. However, with 1 out and men on 2nd and 3rd, the home team should issue an intentional walk if they trail in the game. In addition, with none out and men on 2nd and 3rd, the home team should intentionally walk the batter if they trail by four or more runs.

Finally, consider the top of the 1st inning. With our input parameters, neither a sacrifice bunt nor an attempted steal is ever optimal. Moreover, the only situation in which the home team should intentionally walk a batter is if there is 1 out with men on 2nd and 3rd, and they trail by 12 or more runs. Of course, it's hard to see how any team can get that far behind so early in the game, and having managed to do so, they're almost certain to lose no matter what strategy they adopt.

## Numerical Example: Non-Identical Players

As a further illustration, in this section we will present some results for a case in which all the players on a team have different characteristics. For simplicity we continue to assume that the two teams are identical, although this is not a requirement in the model. Table 5 shows for each player the probabilities of the various outcomes if he hits away.

Table 5: Probabilities of Outcomes, by Player

| | Outcome | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Strike out or short fly out | 0.272 | 0.291 | 0.292 | 0.195 | 0.257 | 0.276 | 0.333 | 0.342 | 0.363 |
| 2 | Long fly out | 0.125 | 0.143 | 0.107 | 0.120 | 0.106 | 0.123 | 0.159 | 0.099 | 0.133 |
| 3 | Hard ground out | 0.141 | 0.103 | 0.117 | 0.078 | 0.132 | 0.132 | 0.107 | 0.147 | 0.124 |
| 4 | Soft ground out | 0.141 | 0.103 | 0.117 | 0.078 | 0.132 | 0.132 | 0.107 | 0.147 | 0.124 |
| 5 | Unintentional walk or hit by pitch | 0.073 | 0.083 | 0.066 | 0.255 | 0.093 | 0.066 | 0.087 | 0.029 | 0.062 |
| 6 | Short single | 0.102 | 0.097 | 0.111 | 0.067 | 0.090 | 0.100 | 0.075 | 0.081 | 0.057 |
| 7 | Long single | 0.102 | 0.097 | 0.111 | 0.067 | 0.090 | 0.100 | 0.075 | 0.081 | 0.057 |
| 8 | Short double | 0.020 | 0.018 | 0.019 | 0.024 | 0.023 | 0.031 | 0.020 | 0.022 | 0.022 |
| 9 | Long double | 0.020 | 0.018 | 0.019 | 0.024 | 0.023 | 0.031 | 0.020 | 0.022 | 0.022 |
| 10 | Triple | 0.003 | 0.040 | 0.003 | 0.002 | 0.009 | 0.002 | 0.002 | 0.015 | 0.001 |
| 11 | Home run | 0.002 | 0.006 | 0.038 | 0.092 | 0.044 | 0.009 | 0.016 | 0.015 | 0.035 |

In addition, we assume that the probability of successfully stealing second base is 0.55 for batters 1 and 2, 0.45 for batters 8 and 9, and 0.5 for the others. Finally, we assume each batter has a 0.9 probability of successfully advancing the runner if he attempts to sacrifice bunt. Table 6 shows the expected batting average, on-base percentage (aside from intentional walks), and slugging percentage that are implied by the probabilities in Table 5.

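Table 6: Batting Statistics

| Player | AVE | OBP | SLG |
|---|---|---|---|
| 1 | .267 | .321 | .322 |
| 2 | .302 | .360 | .450 |
| 3 | .322 | .366 | .491 |
| 4 | .369 | .529 | .808 |
| 5 | .308 | .372 | .522 |
| 6 | .291 | .338 | .390 |
| 7 | .226 | .294 | .325 |
| 8 | .242 | .265 | .364 |
| 9 | .208 | .257 | .370 |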

In an expected-runs analysis using these parameters, it is never optimal for a team to attempt to steal or sacrifice bunt. Actually, even in our model, situations in which it is optimal to attempt to steal remain relatively rare (although they would of course be more common if we assigned the players larger probabilities of success). In fact, in games that are remotely close, the earliest point at which an attempted steal can be optimal is the bottom of the 6th inning. In that half inning, if player 6 is at bat with 2 outs and a man on 1st, and the home team leads by at least 5 runs, then the runner at first should attempt to steal.

Sacrifice bunts, on the other hand, can be optimal in any inning. For example, suppose it's the top of the 1st inning, there is no one out, runners are at first and second, and batter 9 is at the plate. Then with our parameters, it is optimal to sacrifice. Situations in which sacrifice bunts are optimal become more numerous in later innings. Moreover, the larger a team's lead, the more situations there are in which that team should attempt to steal or sacrifice bunt.

The expected batting statistics for player 4 (see Table 6) are reminiscent of Barry Bonds's. So, as a final illustration of the use of the model, it's interesting to determine the situations in which it is optimal to walk player 4 intentionally.

Using expected runs as the objective function, there is exactly one situation in which intentionally walking player 4 is optimal; namely, 1 out and runners at 2nd and 3rd. This is true regardless of the inning or the score. However, using the probability of winning the game as the objective function, there are many more such situations. We will list all of them that arise in the top of the 1st inning, with the game tied or the visitors leading by fewer than 8 runs. The set of situations in which it is optimal to walk player 4 intentionally is larger in later innings. We must repeat that these results apply only for the particular set of parameters we have chosen, and so are for illustration only.

In the top of the 1st inning, the home team should intentionally walk player 4 if there are 1 or 2 outs with men on 2nd and 3rd; or if there are 2 outs, a man on 2nd or a man on 3rd, and the home team trails; or if there is no one out, runners are on 2nd and 3rd, and the home team trails by at least 3 runs. There are many additional situations in which the home team should intentionally walk player 4 in the top of the 1st inning, but only when the visitors lead by at least 8 runs.

## Other Applications of the Model

Using a model that does not include strategic decisions, Bukiet, Harold, and Palacios showed how to derive the probability distribution for runs scored in a game, given the characteristics of the nine players in the lineup. We could use our model to analyze some of the same questions they addressed, including the optimal batting order, player evaluation, and trades.

For example, suppose a team is considering trading Player A for Player B. By replacing the thirteen numbers that characterize Player A by the thirteen numbers that characterize Player B (and possibly changing the batting order), the team can use the model to see how much the probability of winning changes. Notice that the value of a player doesn't depend solely on his own characteristics. It also depends on the characteristics of his teammates (and his opponents, for that matter).