Statistical applications in sports have long centered on how to best separate signal (e.g., team talent) from random noise. However, most of this work has concentrated on a single sport, and the development of meaningful cross-sport comparisons has been impeded by the difficulty of translating luck from one sport to another. In this manuscript we develop Bayesian state-space models using betting market data that can be uniformly applied across sporting organizations to better understand the role of randomness in game outcomes. These models can be used to extract estimates of team strength, the between-season, within-season, and game-to-game variability of team strengths, as well as each team’s home advantage. We implement our approach across a decade of play in each of the National Football League (NFL), National Hockey League (NHL), National Basketball Association (NBA), and Major League Baseball (MLB), finding that the NBA demonstrates both the largest dispersion in talent and the largest home advantage, while the NHL and MLB stand out for their relative randomness in game outcomes. We conclude by proposing new metrics for judging competitiveness across sports leagues, both within the regular season and using traditional postseason tournament formats. Although we focus on sports, we discuss a number of other situations in which our generalizable models might be usefully applied.
Gregory J. Matthews, Ph.D., is an assistant professor of statistics and director of the data science program at Loyola University Chicago. He graduated from the University of Connecticut with his Ph.D. in statistics in 2011 and completed a postdoctoral research fellowship in the School of Public Health and Health Sciences at the University of Massachusetts-Amherst in 2014. His current research interests include statistical shape analysis, incomplete data analysis, statistical analysis of sports, and statistical consulting. He is one of the authors of openWAR and a recipient of the 2016 Society for American Baseball Research (SABR) Contemporary Baseball Analysis Award for this work. He maintains an active blog and has written articles in popular media for FiveThirtyEight, Baseball Prospectus, and Deadspin.
Science is everywhere, even inside a baseball. Since 2017, Major League Baseball has experienced two Home Run Surges, breaking any number of records in the process. In both cases, the culprit has been the ball; specifically, each home run spike was the result of a baseball with less drag. While MLB has focused largely on the aerodynamics, I looked at the ball itself. Over the course of three studies (and more than 100 disassembled baseballs), I found that both Home Run Surges could be accounted for by small (and perhaps unintentional) changes to the ball’s construction. Because this was applied (i.e., practical) research, it was hands-on in a way people rarely associate with science. Rather than wind tunnels or computer simulations, my data-gathering tools included a T-pin, a box cutter, a kitchen scale, an old bookshelf, and a chopstick. In addition, the analysis and conclusions drew on more than my formal training in physics. Without a detailed knowledge of baseball and extensive experience in fiber arts (knitting, sewing, etc.), these problems might never have been solved, or if so, would have taken years rather than weeks. While these studies were deceptively straightforward, they have also proved groundbreaking. Despite numerous smaller home run surges over the last century, never before has one been traced back to a physical change to the ball. Finding the source of not one but two Home Run Surges demonstrates that good science depends on much more than education. It requires creativity, the ability to reformulate a question, and a willingness to occasionally get your hands dirty (or at least develop calluses).
Meredith J. Wills, Ph.D., is a Data Scientist for SportsMEDIA Technology (SMT). After graduating from Harvard University (1996) with a degree in Astronomy & Astrophysics, she went on to garner a Master's (1999) and Ph.D. (2003) in Physics from Montana State University-Bozeman. Her Master's work involved Physics Education Research, while her Ph.D. focused on Solar Astrophysics. After completing her graduate studies, Dr. Wills did NASA-funded research, first as a Research Scientist at Southwest Research Institute and then as a Senior Research Scientist at the Harvard-Smithsonian Center for Astrophysics. Her work on the coronal origins of solar storms ultimately led to a new subfield in solar astrophysics. In 2012, Dr. Wills transitioned to sports data science, with an interest in player- and ball-tracking. She joined SMT in 2018, where she works primarily with FIELDf/x, a ball- and play-tracking system used by Minor League affiliates of Major League ballclubs, international leagues, and the NCAA. She also writes for The Athletic, and her best-known publicly available research involves MLB baseball construction and its effect on the game. In addition to her technical interests, Dr. Wills is a knitting designer, and some of her knitted creations can be seen at the Baseball Hall of Fame and Museum in Cooperstown. At present, she is working in partnership with both the Hall of Fame and the Negro Leagues Baseball Museum to create reproductions of vintage baseball sweaters. Ongoing documentation of the project can be found at hofknitter.blogspot.com.
We give an overview of how data visualization and analysis can be used in the sports industry in a variety of contexts. We discuss how analytics can help a team’s front office, coaching staff, and scouting department make better and faster decisions. We also discuss the kinds of data and optimization problems that are encountered on the business side of an organization in departments like sales and marketing. Examples include determining shooting tendencies of players, forecasting performance of teams, predicting attendance for games, optimizing realignment for leagues, and evaluating regular season schedules. Particular attention will be paid to the value of using visualizations throughout the data analysis process, both for the initial steps of data exploration and for the later steps of communicating the results of the analysis to key stakeholders.
Brian Macdonald, Ph.D., is the Director of Sports Analytics in the Stats & Information Group at ESPN. He was previously the Director of Hockey Analytics with the Florida Panthers Hockey Club, an Adjunct Professor in the Department of Management Science at the University of Miami, an Adjunct Professor in Sports Analytics in the College of Business at Florida Atlantic University, and an Associate Professor in the Department of Mathematical Sciences at West Point. He received a Bachelor of Science in Electrical Engineering in 2000 from Lafayette College, Easton, PA, and a Master of Arts in 2003 and a Ph.D. in Mathematics in 2008 from Johns Hopkins University, Baltimore, MD.