Does Pitching and Defense Really Win Championships?
*Editor's note: This was a paper written for my Econometrics class in college.*
Introduction:
The game of baseball has never been more divided. Traditionalists who praise batting average and location of pitches have been disregarded by current major league front offices in favor of “Sabermetricians,” a nickname for modern decision makers who use in-depth analytics to evaluate and compare baseball players. The divide between the two schools of thought is so dramatic that a movie was centered entirely on their debate, and subsequently the Moneyball era was born.
Sabermetricians, hereafter called SABR’s for short, set out to break the game of baseball down to a fundamental level. The offense tries to score as many runs as possible, and the defense tries to prevent as many runs as possible. The name of the game is to score more runs than the other team. This thought process has led to the game of baseball that you see before you today. Home runs, strikeouts, and walks have never occurred at this rate before in the history of Major League Baseball[1].
Traditionalists do not like this. Actively berating the modern game of baseball, they favor “small-ball,” with bunting, stolen bases, location, and defense. While the differences between the two ideologies are interesting and worth exploring if you do not know them, that is not the purpose of this paper. This paper focuses on one key argument echoed time and time again by traditionalists: good pitching beats good hitting. Anybody who has played baseball at any level has heard this argument, but may never have seen any empirical proof that it is true.
This paper sets out to find that proof using econometric analysis. The ultimate goal of any given season of Major League Baseball is to win a championship via the World Series. This paper tests whether various pitching and defensive measures correlate more highly with World Series wins than hitting measures, and how well the various stats can predict World Series wins.
Literature Review:
This topic has been studied before in the baseball community, but in a different way. The difference between this study and existing research is that I am setting out to find whether a playoff team’s regular season statistics predict World Series wins, as opposed to current research that mostly covers whether offensive/defensive statistics predict wins in the regular season.
For example, the article Is it Better to be an Elite Run Producing or Run Preventing Team?[2] discusses the importance of run differential as the main indicator of team record. The study uses ten years’ worth of data from 2001-2010 to determine whether run prevention (defense) or run production (offense) leads to a greater regular season run differential on average. The study concludes that run prevention is a better indicator of run differential than run production, but the article never addresses how that data predicts playoff/World Series performance.
A more in-depth article, Pitching (almost) Always Wins Championships[3], looks at the same question I pose, but again approaches it in a different manner. McDaniel utilizes regular season data from every World Series champion from 1903-2011 (when the article was written), giving 106 results because no World Series was played in 1904 or 1994. A little background is needed on OPS+ and ERA+ to understand the results. OPS+ and ERA+ are league-, park-, and era-adjusted versions of OPS (on-base percentage plus slugging percentage) and ERA (earned run average), respectively, designed to allow accurate comparison of teams’ and players’ statistics across different seasons and eras. The league average for both OPS+ and ERA+ is scaled to 100, which means that an OPS+ of 113 represents an OPS 13% better than league average. With that in mind, McDaniel uses a scatterplot with OPS+ on the x-axis to represent team offensive production, and ERA+ on the y-axis to represent team run prevention (the scatterplot does not appear in the article, and thus is not shown here). The results showed that the majority of the time, the champion has a combination of good hitting and pitching, which makes sense. Why pick which one you could be better at when you could be good at both? However, the results also showed that below-average offensive teams (below 100 OPS+) won roughly one-third of the World Series in that time period, a fairly large number. McDaniel argues that this fact, along with similar points, proves that run prevention is more advantageous to winning championships than offense.
However, the fundamental problem with this argument is selection bias. McDaniel only uses data from teams that won the World Series from 1903-2011. This omits every single team the champion had to face to get there, including the team it beat to win the championship. The data in this article provide an interesting study of World Series winners, but they do not necessarily provide any evidence that ERA+ predicts championships better than OPS+. Instead, they show that the 106 champions from that time period happened to be pitching-savvy teams more often than not. Correlation does not necessarily imply causation.
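The selection problem can be illustrated with a small simulation. This is a minimal sketch under assumptions that are mine, not McDaniel’s or this paper’s: 30 hypothetical teams per season with independent, normally distributed ERA+ and OPS+ values (mean 100, standard deviation 10), the ten teams with the highest combined rating make the playoffs, and the champion is then drawn completely at random from that field. Even though pitching has no effect on who wins the title in this toy world, most champions still show an above-average ERA+, simply because only good teams are eligible.

```python
import random

def simulate_champions(seasons=2000, teams=30, playoff_spots=10, seed=0):
    """Return one (ERA+, OPS+) pair per season for a randomly chosen champion."""
    rng = random.Random(seed)
    champions = []
    for _ in range(seasons):
        # Hypothetical league: independent pitching and hitting ratings.
        league = [(rng.gauss(100, 10), rng.gauss(100, 10)) for _ in range(teams)]
        # Playoff field: the teams with the best combined rating.
        playoff = sorted(league, key=sum, reverse=True)[:playoff_spots]
        # Assumption: the champion is a pure random draw among playoff teams.
        champions.append(rng.choice(playoff))
    return champions

champs = simulate_champions()
share_good_pitching = sum(era > 100 for era, _ in champs) / len(champs)
share_good_hitting = sum(ops > 100 for _, ops in champs) / len(champs)
print(f"champions with ERA+ > 100: {share_good_pitching:.0%}")
print(f"champions with OPS+ > 100: {share_good_hitting:.0%}")
```

Looking only at the champions, both shares come out well above one half, yet neither statistic helps pick the winner among the playoff teams. That is why a sample made up only of champions cannot separate "good teams reach the playoffs" from "good pitching wins the World Series."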
Theoretical Considerations:
There is little to no economic theory relating regular season baseball statistics to playoff performance, as one would imagine. The existing economic theory with regard to baseball deals with player salaries, team payrolls, the free-agent market, game theory, and the incentive structure of the Major League Baseball collective bargaining agreement. While these topics are interesting and important in the context of the baseball industry, none presents any theory relevant to this paper.
Empirical Strategy:
This paper intends to examine the relationship between regular season pitching, defensive, and offensive statistics and World Series wins, to determine whether any of the three aspects of baseball has a higher correlation with World Series wins than the others.
Because the playoff format changed in 2012, it would not be advantageous to use data from prior years, as we are interested in how teams’ regular season data predicts World Series wins now. For this reason, the data includes regular season data for all eighty playoff teams from 2012-2019 (eight seasons, five teams per league per year). Excluding teams that did not make the playoffs also keeps bad teams’ possible outlier data out of the sample.
Due to the divisiveness of traditional versus modern statistics, I will utilize regressions in all three facets of the game with stats from both categories. All variables will be tested against World Series wins.
Starting with pitching, ERA is the main statistic that almost everyone defaults to. Even though it has value as a SABR stat, I will only use ERA in traditional statistical regressions because true SABR’s usually point to more advanced measures that I will introduce later. WHIP, BB/9, HR/9, H/9, and K/9 are all also useful statistics that I will only use for traditional regressions.
ERA+ is the first statistic I will use for SABR’s regressions. As mentioned earlier, ERA+ is an adjusted ERA measure that assists greatly in comparing numbers over different time periods, so this should be a useful tool in the analysis. FIP and xFIP (adjusted FIP) are similar in appearance to ERA in that a lower number is better, but FIP attempts to account only for events that the pitcher can control (meaning the defense is not involved in the play: home runs, walks, and strikeouts). BABIP is an acronym for batting average on balls in play, which essentially measures how lucky or unlucky a team is getting on balls that are put in play. These are some of the best statistics that modern decision makers can use when predicting future performance, so they should be plenty useful for this study.
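For readers unfamiliar with these measures, the commonly published formulas for FIP and BABIP can be sketched as below. The function names, the season totals, and the league constant of 3.10 are hypothetical and chosen only for illustration (the real FIP constant is recomputed every season so that league FIP matches league ERA). Note that the standard formula also counts hit batters alongside walks.

```python
def fip(hr, bb, hbp, k, ip, constant=3.10):
    """Fielding independent pitching from home runs, walks, hit batters, strikeouts."""
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / ip + constant

def babip(h, hr, ab, k, sf):
    """Batting average on balls in play: non-home-run hits per ball put in play."""
    return (h - hr) / (ab - k - hr + sf)

# Hypothetical season totals allowed by one pitching staff (not real data).
staff = {"h": 1350, "hr": 200, "bb": 500, "hbp": 60, "k": 1400,
         "ip": 1450.0, "ab": 5500, "sf": 45}

team_fip = fip(staff["hr"], staff["bb"], staff["hbp"], staff["k"], staff["ip"])
team_babip = babip(staff["h"], staff["hr"], staff["ab"], staff["k"], staff["sf"])
print(f"FIP:   {team_fip:.2f}")
print(f"BABIP: {team_babip:.3f}")
```

Nothing the fielders do enters the FIP calculation, while BABIP looks only at the balls the fielders had a chance to handle, which is why the two are often read together.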
For modern hitting, I will use OPS+, BABIP (same as above but for hitters), wOBA (weighted on-base average), wRC+, BB%, K%, and ISO (isolated power). For traditional, I will use batting average, on-base percentage, slugging percentage, and on-base plus slugging percentage. For more on these statistics, please see the Data section.
For fielding, I will use errors and fielding percentage for traditional, and UZR and UZR/150 for modern stats. See Data section for more on these metrics.
Data:
All data was pulled from Baseball Reference’s Play Index[4] and from Fangraphs[5]. Again, this is regular season data for all playoff teams from 2012-2019, for a total of 80 observations.
Below is a brief description of all variables in the data followed by the summary statistics for all the variables.
Definitions:
year - respective playoff team’s year
Tm - name of team
ERA - earned run average
WHIP - walks plus hits per inning pitched
H9 - hits allowed per 9 innings
HR9 - home runs allowed per 9 innings
BB9 - base on balls allowed per 9 innings
SO9 - strikeouts per 9 innings
AI - ERA+ (adjusted ERA)
WS - binary World Series variable ; 1 for World Series won, 0 if not
BA - team batting average
OBP - team on-base percentage
SLG - team slugging percentage
OPS - team on-base plus slugging percentage
ISO - isolated-power ; measure of a player’s raw power to convey how often a player hits for extra bases ; found by the formula ISO = SLG-AVG
AG - OPS+ ; adjusted OPS
RG - offensive runs/game
E - errors committed
FP - fielding percentage
UZR - ultimate zone rating ; defensive measure courtesy of Fangraphs ; description can be found here.
UZR/150 - ultimate zone rating per 150 games of defense
h_BABIP - team’s offensive batting average on balls in play
h_wOBA - team’s offensive weighted on-base average ; more can be found here
h_wRC - team’s offensive wRC+ ; more can be found here
h_BB_pct - team’s offensive walk percentage
h_K_pct - team’s offensive strikeout percentage
p_BABIP - team’s defensive batting average on balls in play
HRFB - team’s defensive home runs per fly ball
FIP - fielding independent pitching ; more can be found here
xFIP - adjusted FIP ; courtesy of Fangraphs
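Several of the variables above are simple functions of others. As a quick illustration, the sketch below derives ISO and OPS from a single hypothetical team row, using the column names from the definitions list; the numbers are invented and do not come from the dataset.

```python
# One hypothetical team-season (values invented for illustration).
row = {"BA": 0.255, "OBP": 0.330, "SLG": 0.430}

# ISO = SLG - AVG: extra bases per at-bat, with singles stripped out.
row["ISO"] = round(row["SLG"] - row["BA"], 3)

# OPS = OBP + SLG.
row["OPS"] = round(row["OBP"] + row["SLG"], 3)

print(row)
```

Because ISO and OPS are built directly from BA, OBP, and SLG, these columns move together closely, which is worth keeping in mind when several of them appear in the same regression.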
Summary table:


Regression Results:
Regression 1: First, I will estimate a regression on World Series wins using OLS estimators from the offensive side of the ball. For this regression, I will utilize modern statistics, descriptions of which can be found in the Data section. The result is shown below.

With a p-value of 0.0381, this model is significant at the 5% level. We can say with fairly high confidence that the model has some explanatory power: 18.07% of the variance in World Series wins can be explained by OPS+, BABIP, wOBA, wRC+, BB%, K%, and ISO.
However, only OPS+, BABIP, and K% are individually statistically significant in this model, each at the 10% level. Interpreting this, everything else held constant, a one unit increase in OPS+ increases World Series win probability by .033, a one unit increase in BABIP increases win probability by .14, and a one unit increase in K% decreases World Series win probability by .06 (a negative relationship).
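Because WS is a 0/1 variable, an OLS regression of this kind is a linear probability model, which is why each coefficient reads as a change in win probability. The sketch below fits one on four made-up observations with a single regressor. The data, the `ols` helper, and the one-variable setup are illustrative assumptions, not the regression or software actually used for this paper.

```python
def ols(X, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    n, k = len(X), len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(X[r][i] * X[r][j] for r in range(n)) for j in range(k)]
         + [sum(X[r][i] * y[r] for r in range(n))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        p = A[col][col]
        A[col] = [v / p for v in A[col]]
        for r in range(k):
            if r != col:
                f = A[r][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][k] for i in range(k)]

# Hypothetical playoff teams: OPS+ and whether they won the World Series.
ops_plus = [90, 100, 110, 120]
ws = [0, 0, 0, 1]

X = [[1.0, x] for x in ops_plus]  # leading column of ones is the intercept
intercept, slope = ols(X, ws)

# R^2: share of the variance in WS explained by the fitted line.
fitted = [intercept + slope * x for x in ops_plus]
mean_ws = sum(ws) / len(ws)
ss_res = sum((y - f) ** 2 for y, f in zip(ws, fitted))
ss_tot = sum((y - mean_ws) ** 2 for y in ws)
r_squared = 1 - ss_res / ss_tot

print(f"slope: {slope:.3f}  (change in win probability per OPS+ point)")
print(f"R^2:   {r_squared:.2f}")
```

In this toy fit the slope is the estimated change in World Series win probability per one-point increase in OPS+, and R² is the share of variance in WS explained. These are the same two quantities interpreted in the text, here computed on invented numbers.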
Regression 2: Let’s use the same methodology from the above regression, but use traditional hitting statistics instead.

Interestingly, neither the model nor any of its variables is statistically significant at any level, which suggests that since the advent of the Wild Card play-in game, traditional hitting statistics have had no detectable predictive power over World Series wins.
Regression 3: Time for pitching statistics. I will start off with modern statistics using OLS estimators; descriptions of the statistics used can be found in the Data section.

None of the combinations of modern pitching statistics in this dataset proved to be statistically significant, and this model proved no different. Modern pitching statistics have not proved to have statistically significant predictive power over World Series wins in the past eight years.
Regression 4: Transitioning to traditional statistics.

Just as discouraging as Regression 3, the fourth regression proved fruitless. Traditional and modern pitching statistics alike show no predictive power with regard to World Series wins in this sample.
Regression 5: Fielding statistics are next, and as a caveat, defensive statistics have largely been stagnant in terms of SABR-type breakthroughs through the years, so some of the most reliable defensive statistics we have are still not very reliable, as they vary a bunch year-over-year. However, one of the more trusted advanced defensive measures has been the Ultimate Zone Rating (UZR) metric found on Fangraphs. While this metric’s reliability/effectiveness is up for debate, it is good enough to depict modern team defense in terms of this study. More can be found regarding this statistic, as well as UZR/150, in the Data section.

Yet another disappointment, this model has no statistical significance, and therefore has no explanatory power.
Regression 6: Traditional fielding statistics have to be taken with a grain of salt, as there is a lot of subjectivity to the statistics because of the judgement calls that scorekeepers are required to make. Let’s see what errors and fielding percentage do to a team’s World Series probability.

Unfortunately, no conclusive evidence can be gathered. The model has no statistical significance or explanatory power over World Series wins.
Conclusion:
There is a saying in the baseball community that the playoff system is a “crapshoot.” This study suggests that the saying holds some truth. After repeatedly coming up with unsatisfactory models and insignificant results, one can conclude that when it comes to the playoffs, you may as well disregard the regular season and take a random guess. That’s not to say that regular season statistics don’t matter at all; teams still have to perform in order to get into the playoffs. But once you are in, you have as good a chance as anyone.
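As a benchmark for the “crapshoot” reading, the sketch below computes each playoff team’s title odds if every round were a fair coin flip. That 50/50 assumption is mine and is used only for illustration. The bracket shape (three division winners and two Wild Card teams per league, with the Wild Card teams meeting in a one-game play-in) is the 2012-2019 format covered by the data.

```python
from fractions import Fraction

half = Fraction(1, 2)

# Division winners must win three series: Division Series, LCS, World Series.
division_winner = half ** 3
# Wild Card teams must first win the one-game play-in, then the same three series.
wild_card = half ** 4

# Per league: 3 division winners and 2 Wild Card teams; there are two leagues.
total = 2 * (3 * division_winner + 2 * wild_card)
average = total / 10  # ten playoff teams per season

print(f"division winner: {division_winner}")
print(f"wild card team:  {wild_card}")
print(f"average team:    {average}")
```

The average of 1/10 matches the share of champions in the sample (8 of 80 observations), which is the base rate any model here has to beat. Wild Card teams face one extra round, so in this benchmark their odds are half those of a division winner.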
This paper set out to determine whether pitching, defensive, or offensive statistics correlated more highly with World Series wins than one another, and the result was inconclusive.
What I found, however, is that the modern hitting model is the only one significant at the 5% level, which makes modern hitting statistics the best available indicators of playoff performance, although they are not very highly correlated with it.
Likely due to the small sample size of the data (only eight years of the new playoff format), there was not enough evidence to say that any of the other models were statistically significant. The fix for this study would be to accumulate many more years of data, which is incredibly unlikely to happen because of the rumored playoff format changes looming in the next Collective Bargaining Agreement[6]. If by some miracle the current structure were to last for another hundred years, that would probably be enough data to draw clearer conclusions about the World Series, as it would take care of the small sample size issue.
My advice following the conclusion of this study is, whenever the regular season ends and the playoffs’ first pitch is about to be thrown, just sit back, relax, and enjoy the randomness of the game of baseball!
[1] Diamond, Jared. “Baseball Has a Home Run Crisis.” The Wall Street Journal, Dow Jones & Company, 8 July 2019, www.wsj.com/articles/baseball-has-a-home-run-crisis-11562603173.
[2] Petti, Bill. “Is It Better to Be an Elite Run Producing or Run Preventing Team?” Beyond the Box Score, 22 Feb. 2011, www.beyondtheboxscore.com/2011/2/22/1994723/is-it-better-to-be-an-elite-run-producing-or-run-preventing-team.
[3] McDaniel, Rachael. “Pitching (Almost) Always Wins Championships.” The Hardball Times, tht.fangraphs.com/pitching-almost-always-wins-championships/.
[4] https://www.baseball-reference.com/play-index/
[5] https://www.fangraphs.com/
[6] https://www.mlb.com/news/mlb-considering-new-postseason-format