CSAS 2026 : Sessions

Sessions

Session A: From Data to Decisions in Sports Performance and Injury Risk
Session B: Modern Soccer Analytics: Tracking Data, Movement Modeling, and Real-World Constraints
Poster Session

Session A: Modern Soccer Analytics: Tracking Data, Movement Modeling, and Real-World Constraints

The Messy Middle: Why Life is Easier at the Top of the English Football Pyramid

Sarah Rudd, src | ftbl.

Abstract: Football analytics has waited close to a decade for success stories like Liverpool, Arsenal, Brentford and Brighton. While it's great to celebrate such successes, life outside of the top of the footbally pyramid is often more complex and messier, in ways that aren't frequently discussed or considered. This talk explores the compounding challenges facing football analytics outside the Premier League and why muddy pitches and small stadiums don't just impact the performance on the pitch, but the data behind it as well.

Off-Ball Run Value: Evaluating Off-Ball Runs Using Tracking Data

Meredith Shea, Vassar College; Gaku Aihara, Vassar College; Miles Sondergaard Jensen, Vassar College.

Abstract: In this work we develop a multi-component metric to evaluate the quality of off-ball runs in the game of soccer. The metric evaluates runs on three criteria: the potential pass created, the space creation for teammates upfield, and the support created for the ball carrier. Importantly, to measure the spatial components of the run we develop an algorithm that produces a counterfactual state, referred to the no-run state. We believe this methodology can be applied to evaluate and measure other spatial manipulations to the game.

Movement Dynamics in Elite Female Soccer Athletes: The Quantile Cube Approach

Kendall L. Thomas, University of North Carolina at Chapel Hill; Jan Hannig, University of North Carolina at Chapel Hill.

Abstract: This presentation introduces the quantile cube, a novel three-dimensional summary representation designed to analyze external load using GPS-derived movement data. By segmenting athlete movements into discrete quantiles of velocity, acceleration, and angle, this framework captures complex dynamics often missed by aggregate metrics. We first detail the statistical validation of the method using data from elite female soccer athletes across 23 matches. Analysis using Principal Component Analysis and Dirichlet-multinomial regression reveals significant half-to-half variations and distinct position-specific movement profiles. We then extend this methodology beyond retrospective analysis to demonstrate its deployment in a high-performance setting. Specifically, we showcase the application of the quantile cube by a professional women’s soccer team for in-season monitoring. We detail how this probabilistic approach integrates into daily workflows, allowing practitioners to assess workload accumulation over time. This presentation bridges the gap between rigorous statistical modeling and practical, on-field performance optimization in women’s soccer.

Operationalizing Tracking Data

Akshay Easwaran, United States Soccer Federation.

Abstract: The latest chapter in the soccer data arms race features the rapid hiring of data scientists to squeeze insights out of x/y tracking data. However, cutting-edge tracking data research often relies on a small sample of matches (usually up to ~500), while modern data providers, especially with the introduction of broadcast tracking data in recent years, deliver tens-of-thousands of matches to their clients. As a result, analytics staff must make complex technical decisions about how to efficiently store and transform this vast universe of raw files within their existing systems. How can analytics teams overcome scale and compute challenges to foster their own competitive edges from tracking data? Find out in our session.

Session B: From Data to Decisions in Sports Performance and Injury Risk

From Paper to Pipeline: Deploying Tracking Data Solutions in Production

Kevin O'Donnell, U.S. Soccer Federation.

Abstract: There is no shortage of problems to solve for a soccer analytics department. Conferences and research papers present a host of novel ideas to improve decision making for a club or federation. The real challenge for a data science team is transforming this research into a productionized solution. In this presentation, we will cover how to progress from research to production through a real-life tracking data example from the U.S. Soccer Federation. Join our session to gain insight into our process, our learnings, and the technical and conceptual challenges we face.

Acceleration Exposure and Asymmetric Loading Profiles in Ice Hockey: Implications for Neuromuscular Demand

Julie P. Burland, University of Connecticut; Neal R. Glaviano, University of Connecticut; Ainsley Svetek, University of Connecticut.

Abstract: Introduction: Accelerations are a primary contributor to mechanical load in ice hockey, characterized by frequent, high-intensity changes in speed and direction. However, total acceleration volume may not reflect how load is distributed between limbs during skating and transitional movements. Asymmetric loading, arising from injury, compensatory strategies, or skating mechanics, may alter force production and absorption, increasing tissue stress. This is particularly relevant in hockey athletes with prior musculoskeletal injury, where asymmetries may persist despite return-to-play clearance. Understanding the relationship between acceleration frequency and asymmetric loading is critical for improving workload monitoring and guiding rehabilitation and future injury prevention in hockey.

Methods: Twenty-one female and 24 male collegiate ice hockey players were monitored over a 1-week period during both practices and games. Data were collected using bilateral, tri-axial insole-embedded inertial measurement units (Plantiga) placed inside each skate. The total number of accelerations per session was quantified alongside load asymmetry, which was calculated as the percent difference between limbs. Negative values indicated greater left limb loading, while positive values indicated greater right limb loading. Spearman’s correlation analyses were performed between the number of accelerations and asymmetry values across sessions. Statistical significance was set at p > .05. All statistics were performed using R.

Results: A total of 203 observations were included in the analysis. Descriptive statistics indicated that mean total accelerations per session were 44.85 ± 15.69, while mean asymmetry values were −1.03 ± 6.16%, reflecting a slight bias toward left limb loading at the group level. Shapiro–Wilk tests demonstrated that both asymmetry and total accelerations were not normally distributed (p < .001), supporting the use of non-parametric analyses. Spearman’s rank-order correlation revealed no significant relationship between total accelerations and asymmetric loading (ρ = −0.077, p = .272). This suggests that the number of accelerations performed during a session was not associated with the magnitude or direction of interlimb loading asymmetry in this cohort.

Discussion: Acceleration frequency was not associated with asymmetric loading in collegiate ice hockey athletes. This suggests that acceleration volume alone may not reflect how load is distributed between limbs during skating. The lack of association likely reflects the complex nature of skating mechanics, where technique, prior injury, and individual movement strategies influence limb loading independent of acceleration count. Clinically, these findings highlight limitations of relying solely on external workload metrics. Direct measures of asymmetry may provide greater insight for rehabilitation and injury risk, warranting further investigation into individual responses and more sensitive workload measures.

A Self-Organized Criticality Framework for Understanding Exertional Heat Stroke and Other Injuries

Mary Salvana, University of Connecticut.

Abstract: Catastrophic events — from infrastructure failures to athlete collapses — share a common statistical fingerprint: power-law distributions signaling proximity to a critical threshold. This presentation introduces Self-Organized Criticality (SOC) as a unifying framework for modeling and anticipating extreme, cascading failures across complex systems. Unlike traditional models that assume independent failures and Gaussian distributions, SOC reframes rare disasters as statistically inevitable outcomes of accumulated stress in systems that naturally evolve toward instability. We demonstrate SOC's predictive power across power grids, air traffic networks, and athlete safety, showing that the critical exponent α can serve as an early warning signal — detected months before collapse — enabling anticipatory rather than reactive risk management.

The Next Performance Enhancing Substance: Integrating Physiological Load and Functional Biomarker Monitoring in Sport. An Athlete Centered Approach.

Robert Huggins, University of Connecticut.

Abstract: The integration of training load monitoring through wearable technology and functional physiological biomarker monitoring in sport is leading to advancements in athlete health and performance. Recent advances in wearable technology, player load and performance dashboards, and at-home testing of fluid-based nutritional biomarkers has enabled sports performance, nutrition and exercise science staff to target both team and individualized needs of athletes. These advancements and the data that they produce allow for the optimization of load during sport and the recovery tactics and strategies outside of sport. The integration of data analytics in this setting will lead the next wave of performance enhancement in sport.

Poster Session (in submission order)

1. GHOST: A Novel Deep Learning Framework for Quantifying NFL Receiver Gravity

Gordan Tao, University of North Carolina at Chapel Hill, SAIL; Aneesh Sallaram, University of North Carolina at Chapel Hill, SAIL

Abstract: Existing NFL receiver evaluation metrics emphasize outcomes such as yards, receptions, or touchdowns, but overlook the defensive behaviors that shape such results. We introduce GHOST (Generative Heliocentricity for Offensive Separation Tracking), a deep learning framework for quantifying defensive attention and receiver gravity in NFL passing plays. GHOST fills the gap between simple receiver metrics and deeper tactical insights by measuring heliocentricity, defined as the divergence between expected defensive coverage and actual defensive positioning relative to a target receiver. To model baseline defensive expectations, we employ a Transformer-based Conditional Variational Autoencoder (CVAE) that generates multiple plausible defensive trajectory forecasts per play, conditioned on game context. Using the projected trajectories with observed tracking data, we yield a heliocentricity score 𝑯 which captures the degree to which a receiver draws atypical defensive attention independent of production. Applying GHOST to 2023 NFL player- tracking, we uncover insights that complement traditional metrics, highlighting receivers who consistently attract coverage and spotlighting offensive designs that systematically manufacture separation.

2. AI-Powered Natural Language Query System for Sports Analytics (Tennis)

Rami Zheman, Independent Researcher

Abstract: As a sports fan, I’ve never found a tool where I can ask a complex question about a match in natural language and get an accurate answer and explanation from real data in seconds. Most existing analytics systems rely on dashboards or text-to-SQL approaches that break down when questions depend on sequences of actions, such as how players respond to specific shots or what patterns lead to success.

I built a natural-language query system for tennis that decides how a question should be answered before computing it. The system uses AI to interpret what kind of question is being asked, then routes it to the appropriate execution path: statistical questions using a tree structure built from event logs, ordered shot-sequence analysis when sequence matters, or explanation-only responses when “how” and “why” questions are asked.

All counting and filtering are performed deterministically, while language models are used for interpretation and explanation. This avoids common cases where systems return confident but misleading results. While demonstrated on tennis match data, the approach applies to any sport with event-driven logs where the order of actions matters.

3. Sweeping Away the Competition

Connor J. Brady, Trinity Preparatory School; Ishan V. Choksey, Trinity Preparatory School; Olliver M. Polsinelli, Trinity Preparatory School

Abstract: This paper applies interpretable machine learning to mixed doubles curling to quantify how early-end geometry shapes scoring, defense, and strategy. Using data from 2,637 ends, we develop a modular XGBoost framework centered on the post–Shot 3 game state, where spatial control first constrains outcomes. Physically meaningful features—distance to the button, stone ratios, and house control—allow the model to capture nonlinear interactions while remaining interpretable. Across specialized sub-models, we measure hammer value, identify geometry-driven swings of ~40 percentage points in scoring probability, predict scoring magnitude, detect rare blank ends, evaluate steal risk, and quantify volatility from power plays. Model performance exceeds baseline approaches by 4.5–5.7 percentage points in cross-validation, demonstrating the predictive value of early-house geometry.

Beyond static prediction, we integrate these models into a simulation framework running idealized mixed doubles ends. Starting from post–Shot 3, the simulation evaluates plausible shot options by applying them to copied ice configurations and scoring each resulting position using trained sub-models, with adjustments for score differential, end number, and power play. Looking one to several shots ahead, the simulation identifies high-value decision paths, illustrating how early geometry compounds into end-level outcomes rather than isolated shot effects.

Together, the modeling and simulation results translate complex spatial data into actionable strategy, providing coaches and players probabilistic guidance for shot selection, power play timing, and defensive planning. This work demonstrates how interpretable machine learning and simulation can inform real-time decision making in a strategically rich sport.

4. Identifying NFL Quarterback Archetypes via K-Means Clustering

Shubhan Tamhane, University of Connecticut

Abstract: Quarterback classification remains central to NFL evaluation and team-building strategies, yet most archetype frameworks rely on subjective media narratives rather than quantitative analysis. This study applies unsupervised machine learning to identify quarterback playing styles using six seasons of performance data (2019-2024) from qualifying NFL quarterbacks. Principal Component Analysis combined with K-means clustering identified four distinct archetypes: Pocket Passer, Game Manager, Gunslinger, and Dual Threat, which align with established football schema while revealing significant boundary overlap between categories. One-way ANOVA tests confirmed statistically significant differences across all examined metrics (all p < 0.001), validating cluster meaningfulness despite a moderate silhouette score of 0.208. Analysis revealed that high-performing quarterbacks of different archetypes often display similar efficiency ratings while differing in other elements such as rushing involvement and pass volume. These findings suggest that while quarterback archetypes represent legitimate performance patterns, they exist along a spectrum or a continuum rather than as rigid discrete categories. This work provides a data-driven framework for player evaluation and challenges the oversimplified archetype narratives prevalent in sports media and scouting.

5. Predicting the Perfect Team: and how much they have improved

Emily Chorbajian, University of Connecticut

Abstract: The purpose of this study was to create the composition of a winning swimming and diving team as well as calculate their percent improvement. To create the winning team composition, we looked into what events are the most critical during high stakes meets such as Big East Championships and what made the most sense to have athletes swim to get the highest score. We looked at other teams in the Big East and saw where they scored the most points and used that information to influence where we want to put our athletes to have the best chance at winning. To create the improvement percentages, we took swimmers best times out of high school and compared them to their best time from college in their top three events. With this information we calculated average percent improvements for each event. This can be used as a recruiting tool to show incoming athletes what current swimmers’ average improvement is as well as average improvement for each stroke and event.

6. BayesBall : A comprehensive framework for predicting UCL injury.

Brady M. Pinter, Belmont University; Dr. Will J. Best, Ph.D., Belmont University; Dr. Christina Davis, Ph.D., NA

Abstract: Ulnar Collateral Ligament (UCL) reconstruction, commonly referred to as Tommy John Surgery, has seen a significant rise among Major League Baseball (MLB) pitchers, prompting growing interest in identifying the mechanical and performance-based factors that contribute to injury risk. While previous studies have examined these relationships using traditional frequentist approaches separately, this study combines multiple different model techniques to present a broad framework for finding significant predictors of UCL Surgery. These models include Lasso and Ridge Regression, Principal Component Regression (PCR) , Partial Least Squares Regression (PLS) , Random Forest, Multiple Linear Regression, and a Bayesian Statistical Model. Using these models, our findings aim to refine injury prediction models and provide a more comprehensive statistical framework for understanding the mechanics underlying UCL injuries in professional baseball.

7. See Every Rally: Computer Vision Based Recreational Tennis Analytics Tool

Wyatt Jones, Belmont University; Dr. Christina Davis, Ph.D, Belmont University; Dr. William Best, Ph.D, Belmont University

Abstract: Detailed performance analytics in tennis have long been reserved for professional players with access to expensive tracking systems. Today, applications do exist for recreational players, but often have high costs and limitations. This project builds a tool that takes regular smartphone or camera footage of a tennis match and automatically tracks where players move, how they cover the court, and where the ball goes throughout the match. Using computer vision and AI-based detection models (DINOv2), the system processes raw video and pulls out meaningful performance data without requiring specialized equipment. That data is then displayed in a simple, visual dashboard that gives players easy-to-understand insights like court coverage maps, movement patterns, and rally trends. The goal is to give recreational tennis players access to the same kind of performance feedback that professionals rely on, without the costs.

8. Analyzing On-Court Biomechanical Asymmetries in Female Basketball Players

Jacob Schlessel, University of Connecticut

Abstract: Advances in wearable sensor technologies have given strength coaches and athletic trainers unprecedented insights into how athletes move during sport-related activities and how they respond to training. The strength and conditioning team for UConn Women’s Basketball has recently adopted Plantiga Technologies to monitor athletes over the course of the NCAAW season, using in-sole sensors to measure unilateral biomechanical asymmetries of athletes during on-court activities in real-time. By analyzing asymmetries in unilateral push-off forces, impact forces, and acceleration counts, the team can identify early risk factors for lower body injuries and can work to correct them through systematic training and rehabilitation regimens, which motivates the need for an in-depth statistical analysis of how these asymmetries vary based on the type of activity being performed and how they tend to relate to each other over time. Using a linear mixed-effects model, this project aims to quantify the variation in each of these three types of asymmetries observed during the three most important types of activities these athletes endure during the season: practices, games, and individual on-court workouts. Furthermore, this project seeks to quantify the temporal correlation between these asymmetries and external workload, which could provide the strength and conditioning staff with insights into how athletes respond to and recover from strenuous activities. This project takes an individualized approach, modeling each player separately to account for the unique movements patterns, injury histories, and tendencies of each individual on the team.

9. Professional Disc Golf Player Ratings, Predictions, and Event Probability Estimation.

Everett B. Hargrove, Iowa State University; Dan Nettleton, Iowa State University

Abstract: We develop an approach for modeling the scores of professional disc golfers across multiple tournaments in a disc golf season. We use a linear model with average score relative to par for a player and tournament as the response variable and additive player, tournament, and error effects to explain the responses. A Bayesian approach is used to estimate posterior means for player and tournament effects and to obtain posterior predictive distributions. Fitting our model to data provides a natural method for rating professional disc golfers, predicting results of future tournaments, and estimating various event probabilities of interest to fans of professional disc golf. We illustrate our approach by applying it to scores from Disc Golf Pro Tour events in each of 2023, 2024, and 2025 for both men's and women's divisions. We rate professional disc golfers in each of these seasons, identify players who improve or decline from season to season, estimate player win probabilities in the lead up to major tournaments, and show how to estimate player win probabilities in real time for tournaments in progress given current scores. We also show how model-based probability estimates of various types can be used to answer questions about whether observed outcomes should be considered unusual or as expected.

10. Quantifying Spatial Control and Uncertainty in NFL Player Tracking Data

Owen Babiec, University of Connecticut

Abstract: Recent advances in player tracking technology have enabled researchers to analyze the spatial dynamics of team sports with unprecedented detail. In the National Football League (NFL), tracking data records the location and movement of every player multiple frames per second, creating opportunities to measure how teams create and contest space during a play. Prior work by Kyle Burris (2019) introduced a probabilistic framework for estimating controlled space, which quantifies the likelihood that a player can reach different regions of the field before opponents. While this approach provides insight into spatial structure, much of the existing work remains descriptive and leaves questions regarding how spatial advantage translates to play outcomes. This project builds upon the controlled space framework by developing additional spatial metrics that quantify how space is distributed and contested during NFL plays. Using NFL player tracking data, spatial control probability surfaces were computed to estimate which players control different areas of the field at each moment in time. From these surfaces, a descriptive metric was implemented: contestedness (or spatial entropy), which measures the uncertainty or disorder in the spatial distribution of player control. Together, these measures provide a more complete characterization of how offensive opportunities and defensive pressure emerge throughout a play. The motivation for this work is that spatial advantage is central to strategy in football, yet existing metrics rarely quantify how space itself is structured or contested. By introducing entropy-based and probabilistic measures of spatial control, this project provides new tools for describing the dynamics of player positioning. Future work will extend this framework toward an Expected Space Value (ESV) model that links spatial control to play outcomes such as yards gained or expected points added. By integrating spatial probability models with predictive analytics, this next step aims to move beyond descriptive metrics and estimate the value of space in terms of offensive success. Ultimately, this work contributes a methodological foundation for connecting spatial dynamics with performance evaluation in football analytics.

11. A Run Expectancy Approach to Lead Distance Optimization in Major League Baseball

Zach Sissman, Community School of Naples, Naples, FL; Lila Dodson, San Francisco University High School, San Francisco, CA; Jack Whitney-Epstein, The Brunswick School, Greenwich, CT

Abstract: A runner’s primary lead off first base trades off steal success against pickoff risk. Using 2024 MLB pitch-level data and Baseball Savant tracking, we represent this interaction as a sequential process: the pitcher may attempt a pickoff; if attempted, it may succeed; if no pickoff is attempted, the runner may attempt a steal; and if attempted, it may succeed. We fit a nested logistic mixed-effects model in which each stage probability is a function of lead distance, pitcher/runner/catcher characteristics, and game context, with runner random effects to capture persistent skill.

We convert estimated stage probabilities to expected runs using fixed linear weights (+0.20 for a successful steal; −0.45 for caught stealing or a pickoff) and optimize over lead distance to obtain an optimal lead L*. When runners deviate from optimal, conservative leads (shorter than optimal) average 1.4 ft too short, while aggressive leads (longer than optimal) average 1.41 ft too long, indicating roughly symmetric error magnitudes in both directions. The method puts the steal–pickoff trade-off on a common expected-runs scale and yields context-aware lead targets for coaches and players.

12. Power Play, Shot Selection, and Scoring Outcomes in Elite Mixed Doubles Curling

Mark Zhang, University of Connecticut

Abstract: The Power Play rule in mixed doubles curling allows the hammer team to reposition the pre-placed stones at the start of an end to create a more aggressive scoring configuration, typically intended to produce a multi-point end rather than a single point. Despite its strategic importance in elite competition, empirical evidence on how Power Play is associated with tactical shot selection and end outcomes remains limited. Using end-level data from the World Mixed Doubles Curling Championships (2016–2025) and the Olympic Winter Games (2018, 2022), we analyze the association between Power Play use, hammer-team shot selection, and scoring outcomes from the hammer-team perspective. End outcomes are classified as 0 points, 1 point, or 2 or more points, and hammer-team shot selection is measured as the share of setup shots among classified hammer shots within an end. We first examine how Power Play is associated with hammer-team shot selection across score states and end timing. We then estimate its association with end outcomes using a multinomial logit model adjusting for score state, end group, and relative team strength, supplemented by descriptive comparisons of actual hammer-team points scored. Power Play is associated with differences in hammer-team shot selection across score states. Across game situations, it is associated with a lower probability of scoring zero points, a higher probability of scoring two or more points, and higher expected hammer-team scoring, with the largest differences observed in late-end comeback situations. Changes in scoring dispersion vary by game state.

13. Bayesian Semi-parametric Inference for Predicting Time Between Fouls for WNBA Players

Rohit Kanrar, Iowa State University; Zack Swayne, Fabick Cat; Dan Nettleton, Iowa State University

Abstract: Using play-by-play data from the Women’s National Basketball Association (WNBA), we determined – for each player and game in the 2024 season – the amount of time played until committing the first foul, the amount of time played between fouls for each subsequent foul, and the amount of time played after committing the last foul. We used the resulting data to estimate, for each WNBA player, the distribution of time played between foul k-1 and foul k for k=1,…,6, where the time of foul 0 is defined as the time the player entered the game. In our approach, we consider the time played between fouls k-1 and k as a survival time and consider the time a player plays after committing their last foul as a censored survival time. We model these observed and censored survival times using a Cox proportional hazards model. We use a Bayesian semi-parametric approach to estimate a baseline hazard function as well as posterior distributions of foul effect parameters for k=1,…,6 and player effect parameters for each WNBA player. Given the estimated foul time distributions, we can predict the amount of time any player with k fouls will play before being disqualified for committing a sixth foul, and we can estimate the probability that a player with k fouls will be able to play through the end of the game without disqualification. Such information may be useful for WNBA coaches as they consider how to best manage playing time for players with k fouls.

14. Team Ratings, Win Probabilities, and Margin-of-Victory Predictions for High School Football via Linear Models

Blake Behrens, Iowa State University; Dan Nettleton, Iowa State University

Abstract: Since 2018, the Iowa High School Athletic Association has used a version of the Ratings Percentage Index (RPI) to select which football teams qualify for the playoffs and compete for the state title. This method uses the team's win percentage, the win percentage of their opponents, and their opponents' opponents' win percentage. Each team is assigned a rating based on a weighted average of these three percentages. Using data from Iowa High School football games played between 2018 and 2025, we compute RPI ratings from regular-season games and examine their efficacy for predicting the outcomes of playoff games. We compare the accuracy of RPI predictions to those derived from a linear model analysis of game outcomes. The linear model we consider uses margin of victory for each regular-season game as the response variable. The mean response is assumed to be a linear combination of a home-field-advantage parameter (for games played at non-neutral sites) and a difference of team-strength parameters. We estimate the home-field-advantage parameter and the strength parameter for each team using both classical and Bayesian strategies. We find that our linear-model-based predictions (using either estimation strategy) perform similar to RPI-based predictions when it comes to picking winners of playoff games. The linear model analysis, however, provides additional information beyond RPI. For example, the linear model analysis provides win probabilities, prediction intervals for margins of victory, and tests for differences in team strengths. We illustrate the capabilities of the linear model approach with a variety of examples.

15. Predicting NFL Game Outcomes with Elo Ratings

Nicholas Pfeifer, University of Connecticut

Abstract: Predicting NFL games on a week-to-week basis can be very challenging. Fans and commentators alike enjoy sharing their picks and participating in competitions as well as gambling markets. The question I sought to answer was whether an Elo Ratings System could accurately pick the winning team. Elo Ratings are commonly used to rate chess players, and the basic idea is that a team's rating increases when they win and decreases when they lose. An Elo Ratings System includes an important hyperparameter called the K-factor which determines how much a team's rating fluctuates from game to game. Other Elo Ratings Systems for NFL games have been created in the past; however, the methodology for choosing the value for the K-factor is usually just selecting a number out of thin air. In this project I pinpointed the optimal K-factor for NFL games using a grid search approach with the Brier score acting as the loss function. The process involved gathering data from over 14,000 NFL games, creating the Elo Ratings System, and tuning for the optimal K-factor. The culmination of this work can be seen on the website footballelo.com.

16. Performance of Elo and Glicko under Rating-Based Matchmaking

Max Wang, Hopkins School; Bohan Yan, Glastonbury High School

Abstract: Rating systems such as Elo and Glicko are widely used to estimate competitor strength from head-to-head game outcomes. Most empirical evaluations of these systems rely on simulated tournaments with random pairings and assess performance primarily through predictive accuracy. However, many real competitive environments match players with opponents of similar estimated strength, creating a matchmaking mechanism that alters the information available for rating updates. The goal of this study is to quantify how such matchmaking structures affect the behavior and evaluation of rating systems. We construct a simulation framework in which match outcomes are generated from latent player strengths while opponents are selected according to current estimated ratings within a matchmaking window. Two complementary simulation studies are conducted. The first tracks the estimation trajectory of a focal player with fixed latent strength under skill-based matchmaking. The second evaluates population-level performance by examining recovery of the global ranking and predictive accuracy across the full player pool. Monte Carlo experiments comparing Elo and Glicko updates illustrate how conclusions about rating-system performance depend critically on the pairing mechanism and the chosen evaluation criterion.

17. Strategic Effects of the Power Play Rule in Mixed Doubles Curling: A Risk–Impact Analysis

Sungmin Park, Yonsei University; Yein Son, Yonsei University; Minjae Lee, Yonsei University

Abstract: The Power Play rule in mixed doubles curling was introduced to promote offensive play, yet its strategic effects remain largely unexplored from a data-driven perspective. This study examines how Power Play reshapes shot selection and outcomes using a unified risk–impact framework that decomposes strategy into intrinsic shot potential and realized execution. Analyzing 26,370 shots from international competitions between 2022 and 2025, we quantify shot impact through scoring and spatial components and define task-level risk using entropy-based outcome uncertainty. Our results show that Power Play systematically shifts shot selection toward higher-impact tasks, particularly in non-hammer and favorable score contexts. However, this increase in expected impact is accompanied by elevated risk and does not consistently translate intohigher realized impact. These findings suggest that Power Play primarily influences strategic intent rather than guaranteeing improved execution outcomes, underscoring the importance of considering contextual risk–return trade-offs in curling strategy.

18. Do Collegiate Athletic Programs Follow Scaling Laws? A Machine Learning Analysis of Spending and Men’s Basketball Success in NCAA Division I

Xiaohan Ye, University of Michigan

Abstract: Despite billions of dollars invested annually in college athletics, it remains unclear whether increased spending produces proportional improvements in competitive success. This study investigates whether NCAA Division I athletic programs exhibit scaling-law-like relationships between financial investment and on-court performance.

We construct a multi-year panel dataset linking institutional financial records from the Knight-Newhouse College Athletics Database with men’s basketball performance statistics from Sports Reference College Basketball (SRCBB), covering NCAA Division I programs from 2016–2024 seasons.

Using log–log regression models alongside cross-validated machine learning methods, we evaluate whether increases in athletic spending translate into proportional gains in season wins and win percentage.

Descriptive analysis reveals substantial financial stratification across collegiate athletics, with Football Bowl Subdivision (FBS) institutions operating at dramatically higher spending levels than Football Championship Subdivision and non-football programs. While higher spending is positively associated with competitive success, the relationship is highly heterogeneous. Machine learning results indicate diminishing marginal returns: predicted win percentages increase rapidly at lower spending levels but plateau among the highest-spending programs. Permutation importance analysis further shows that competitive context, particularly strength of schedule, is the dominant predictor of performance.

Overall, the findings suggest a sub-linear scaling relationship between athletic spending and competitive success, where additional financial investment yields progressively smaller competitive gains, providing empirical evidence of diminishing returns in collegiate athletic spending.

19. A Win Probability Strategy for the Power Play in Mixed Doubles Curling

Trey Elder, University of Pennsylvania; Samen Hossain, University of Pennsylvania; Jonathan Pipping, University of Pennsylvania

Abstract: We were selected as a data challenge finalist:

Mixed doubles curling includes a once-per-game Power Play option for the team with last-stone advantage (the hammer). When a team calls a Power Play, the two pre-positioned stones are moved to the side, opening up one half of the sheet. Because the Power Play is scarce and context-dependent, its value is mostly determined by when it is used rather than its average point-scoring increase. We build a win-probability-aware deployment policy by modeling a match as a finite-horizon Markov decision process (MDP) with a decision at the start of each end. Transition probabilities are estimated by a distributional expected-points model that predicts the full conditional distribution of end score differentials from decision-time features (end, score differential, hammer possession, Power Play usage, and relative team strength). Solving the MDP by backward induction yields an optimal stopping-style policy: when holding the hammer with Power Play available, use it if and only if the win-probability gain from using now exceeds the value of saving it for later. The resulting heatmaps provide practical guidance by end and score differential, with separate recommendations depending on whether the opponent has already used their Power Play. The largest gains appear in late-game, small-deficit states, where Power Play turns low-probability comeback paths into higher-leverage multi-point opportunities.

20. Optimal Timing of the Power Play in Mixed Doubles Curling

Aaron Lin, Ladue Horton Watkins High School

Abstract: In mixed doubles curling, the Power Play can be used once per game when a team has hammer, making its timing a key strategic choice. Using end-level data from 344 matches, we model Power Play timing as a sequential decision problem with stochastic scoring outcomes. A dynamic programming framework combined with Bayesian estimation is used to compute win probabilities and optimal Power Play policies across game states. We find that teams systematically delay Power Play usage relative to model recommendations, especially when trailing early, despite many recommendations being robust to data uncertainty.

21. Analyzing Optimal Performance Metrics in Senior Pickleball Duos Using Machine Learning

Shaurya Madiraju, Northern Valley Regional High School at Demarest

Abstract: Pickleball is one of the fastest growing sports in America, yet research on one of its largest demographics, seniors, remains limited. This study identifies performance metrics that differentiate winning and losing senior pickleball duos in the 2022, 2023, and 2024 US Senior Pickleball Tournaments. Exploratory Data Analysis and feature engineering were used to create new metrics such as points allowed per game to analyze and visualize the data. These efficiency and performance metrics were compared alongside win outcomes to reveal trends. After the data were finalized, they were passed into two different Machine Learning models for training: Logistic Regression and Random Forest, which would then reveal the relative importance of each metric. Results showed the importance of defensive efficiency, specifically, points allowed per game, as a strong indicator of success in senior pickleball duos. Future studies should focus on testing additional Machine Learning models to see if results vary, and use other features such as partnership history to further understand factors that impact winning chances among senior pickleball duos.

22. A Two-Stage Approach to Quantifying Scoring Talent in the NBA

Alex Susi, Georgia Institute of Technology

Abstract: A cliché said about the National Basketball Association (NBA) is that it is a “make-or-miss” league. But not all makes (or misses) are created equally. When discussing individual player scoring, common metrics include points per game, field goal percentage, effective field goal percentage, or true shooting percentage. Accurately evaluating NBA scoring requires separating what a player can control (shooting ability) from context (shot difficulty). Leveraging publicly available NBA data, we develop a two-stage modeling pipeline that decomposes field goal outcomes into context-driven difficulty and player/defender-specific effects, applied to over 200,000 field goal attempts from the 2024–25 NBA regular season.

In Stage 1, we train 3 distinct XGBoost classifiers (rim, mid-range, and three-point shots) on context-only features to estimate expected field goal percentage (xFG). Modeling each shot family separately captures the distinct difficulty profiles of each shot type. Model evaluation demonstrates that context explains most of the variance at the rim, while leaving more player-to-player variation in mid-range and three-point jumpshots.

In Stage 2, a Bayesian hierarchical logistic regression model with random effects for shooters, individual defenders, and defensive teams is fit using Stan, with the XGBoost-derived xFG as a baseline. This model yields posterior distributions on player shooting ability, individual defender impact, and team defensive scheme effects with full uncertainty quantification. These estimates are then aggregated into Expected Points Above Average (EPAA), a per-shot value metric designed to isolate individual contribution from shot context.

23. Power Play or Power Mistake: Strategic Decision-Making in Mixed Doubles Curling

Thomas Reedy, Indiana University Indianapolis; Nathan Ensley, Indiana University Indianapolis

Abstract: Motivation: Mixed doubles curling includes a unique strategic element known as the power play, which allows the team with the hammer to reposition the pre-positioned stones to one side of the sheet once per game, creating a more favorable scoring setup. Teams commonly treat the power play as a late-game “secret weapon,” often delaying its use until the final ends of a match. Because power plays can significantly affect scoring outcomes and win probability, understanding when and how they should be used can be critical at international competitions with a gold medal on the line. We challenged whether conventional strategic wisdom surrounding power play timing and hammer retention is supported by historical data and developed simple, evidence-based rules of thumb that players can apply during games.

Innovations: To conduct this analysis, we used historical data from international competitions and applied several statistical methods and models. Bootstrapped simulations of 10,000 games under different conditions tested whether intentionally conceding the seventh end to retain the hammer for an eighth-end power play improved win probability. A state-by-state model was created to evaluate whether using the power play given a specified end and margin of victory improved win probability. Additionally, logistic regression models and descriptive analytics analyzed what early-end shot selection improved the likelihood of scoring three or more points.

Results: The results show that although power plays modestly increase scoring potential, teams frequently delay using them until it is too late to meaningfully improve their chances of winning. Historically, approximately two-thirds of power plays occur in the sixth or seventh end. However, analysis of game states suggests that teams trailing by multiple points would benefit from deploying the power play earlier, before their win probability drops dramatically. Simulation results also found no convincing evidence that intentionally losing the seventh end to retain the hammer provides a strategic advantage. Finally, shot-level analysis revealed that higher scoring power plays occur when more stones remain in play, particularly when hammer teams prioritize draw shots rather than takeouts in key early shots.

Contributions and Future Research: This study successfully identifies patterns in power play usage and demonstrates that several common strategic assumptions may not maximize competitive outcomes. From these findings, we propose three practical strategies for mixed doubles curling teams: do not intentionally sacrifice an end to retain the hammer, use the power play earlier when trailing to increase comeback potential, and prioritize draw shots to load the house and improve the likelihood of scoring three or more points. A key limitation and future opportunity of this study is the scarcity of early-end power plays in historical data. This restricted our ability to precisely determine the optimal timing for deploying power plays, but can be discovered if more varied data can be procured. This will also open up future analysis in how shot selection can be influenced by game situation and play tendencies.

24. Forecasting UFC Fight Outcomes with Machine Learning

Sophia Manodori, Boston University

Abstract: Mixed Martial Arts (MMA) is a sport full of knock-outs, surprise submissions, and controversial judges decisions. In my project, I used various machine learning methods to predict the win/loss outcome of UFC fights using a dataset of all main event fights between 1999-2024. I compared the performance of discriminant analysis, trees and random forests, boosted trees, artificial neural networks, and logistic regression. My final model is used to create bootstrap confidence intervals for predicted win probabilities for upcoming UFC fights.

25. Arsenal+: Quantifying the Effects of Pitch Arsenal Diversity

Theo Au-Yeung, Rice University

Abstract: This study introduces a two-part framework to quantify pitching strategy beyond isolated pitch design. Our Arsenal+ metric measures how a repertoire spans velocity and movement space using normalized Shannon entropy to quantify volume and unpredictability. Building on this, we developed ArsenalFX, a pitch-level metric isolating the marginal contribution of arsenal diversity to effectiveness by comparing performance predictions from XGBoost models with and without repertoire-specific features.

Analysis of MLB data demonstrates that arsenal effects vary widely; certain offerings gain over 30 points of value solely through strategic contrast. Results reveal that raw diversity does not guarantee success. We identify pitchers with elite Arsenal+ scores but negative ArsenalFX, indicating suboptimal separation. Conversely, pitchers with modest diversity achieve high ArsenalFX via strategic pitch-plane separation. Feature importance confirms that pitch-specific distance metrics contribute more to accuracy than global diversity measures. We conclude that effectiveness requires constructing arsenals where offerings occupy distinct perceptual spaces to inhibit hitter decision-making.

26. Valuing NBA Draft Picks Using a Markov Model

Costas Dhimogjini, Bentley University; Yi Wen Huang, UT Austin; Jackson Lautier, Bentley University Professor

Abstract: National Basketball Association (NBA) draft picks are among the most important and uncertain assets in professional basketball. The draft is the primary path to acquiring long-term franchise players, making correct valuation critical. Yet draft picks are difficult to value because outcomes depend on uncertain factors such as lottery results and the productivity of the selected player. As trades increasingly include protected picks, swaps, and multi-year considerations, teams lack a consistent framework to evaluate the expected value and risk of these assets. To address this, we develop an empirical framework that models NBA team performance as a 30-state Markov process, where each state represents a team’s end-of-season ranking. Using historical standings, draft outcomes, and salary data from 2004–2025, we estimate transition probabilities and link draft positions to realized player value. Combined with NBA lottery rules, the model generates distributions of future draft outcomes. We illustrate the approach using the Desmond Bane trade, which included a 2029 first-round pick swap (top-two protected).

27. Upset Probability Index (UPI): A Context-Aware Model for Predicting Tennis Match Outcomes

Jerry Cheng, Central Bucks East High School

Abstract: Predicting tennis match upsets is challenging when players have similar Universal Tennis Ratings (UTR). While UTR reflects long-term skill, it does not capture short-term factors that often influence close matches, such as momentum, confidence, and match workload. This study investigates whether incorporating short-term performance indicators can improve upset prediction beyond rating difference alone.

Using 101 closely matched high school and USTA junior tournament matches from March–October 2025 (|ΔUTR| < 1.0), I developed the Upset Probability Index (UPI), a multivariable logistic regression model combining UTR difference with three contextual features: win-streak difference, a 14-day match-load metric representing fatigue, and a confidence indicator derived from recent match results.

UPI achieved substantially better predictive performance than both a UTR-only baseline and a dynamic Elo rating model (AUC: UPI 0.796 vs. Elo 0.643 vs. UTR 0.603; Brier: 0.181 vs. 0.235 vs. 0.236). Time-based validation on later matches showed similar performance, suggesting the model generalizes to unseen data. These findings indicate that short-term performance dynamics meaningfully influence outcomes in closely matched competitions.

28. The Association between Team Salary and Success across MLB, NBA, NFL, and NHL

Benjamin Mayer, Wesleyan University

Abstract: Team Salary in sports is debated to have a major impact on success. This study examines the relationship between Team Salary and Winning Percentage from 2021-2024. The analysis uses data collected from Spotrac and Sports Reference. This study also uses fixed effects in teams, years, and leagues to evaluate the relationship across the four Major Sports across the last four years. The results show a significant positive relationship between log salary and winning percentage and decreases in winning percentage across the years. Further research could look at more years and consider other factors that might contribute to winning percentage.

29. Imagining the Intended Shot: Separating Decision-making from Execution with Physics-based Monte Carlo Simulation for Curling

Andrew Kang, Carnegie Mellon University; Priya Narasimhan, Carnegie Mellon University

Abstract: Modern curling strategy is built around marginal advantages; shot evaluation is how we detect and attribute those margins. However, curling shot evaluation is difficult because what we observe is a mixture of the shot decision (intention) and the shot execution. Standard “task” labels do not fully capture the shot decision, and subjective quality “points” labels entangle decision quality with execution quality. We present an evaluation framework that disentangles and measures these components specific to each scenario, rather than opaque holistic ratings. First, we learn xScore, a state-value model that maps a static context (e.g. stone positions, throw order) to its expected end score differential, providing a curling analogue of xG or expected points from other sports. Second, we estimate each shot’s release parameters with an inverse solver on a physics simulator, augmenting the dataset of observed states with the inferred shots that must have occurred in between. Third, we generate counterfactual (what-if) outcomes via Monte Carlo simulation in two modes: local exploration around the actual shot and global exploration across all possible shots. These counterfactual distributions yield simple, interpretable per-shot metrics (decision value, decision risk, and execution value) that match coaching language and generalize to newer curling formats and rule changes.

30. Return Yards Positively Correlate with Offensive Efficiency in NFL Running Backs

Lewis Milun, University of Connecticut; Dr. Jun Yan, University of Connecticut

Abstract: Evaluating running back performance remains a critical challenge in NFL roster construction, as teams seek any advantage in identifying the players who maximize their contributions to their team. This study examines the relationship between a player’s overall efficiency, measured by yards per touch, and their output as a returner, measured by total return yards. Using a sample of all NFL running back seasons from the 21st century, linear regression analyses revealed a statistically significant positive relationship between a player’s involvement in the return game and their productivity on the offensive end. Descriptive analysis further supported this conclusion, finding that 58.7% of players who returned kicks throughout their career experienced their career highs in return yards and yards per touch in the same season. These results suggest that return involvement is associated with offensive performance and can serve as a useful signal in evaluating running backs. Incorporating return data into evaluation may help teams identify players who are more likely to be efficient, informing decision making during free agency and the draft.

31. Quantifying Conferenece Playing Identity and Player Deviation in NCAA Division 1 Hockey

Ethan Long, University of Conecticut

Abstract: Conference affiliation in NCAA Division I hockey is often associated with distinct playing styles, yet these differences are rarely quantified in a unified statistical framework. This study develops a set of metrics to characterize conference-level playing identity using team performance summaries and event-based features. Building on these aggregates, we define player-level deviation measures that capture how individual players align with or diverge from their conference’s typical style. The methodology integrates multivariate summaries with visualization tools, including radar-based representations, to facilitate comparison across conferences and teams. Results indicate clear clustering of conferences by stylistic profiles, reflecting systematic differences in pace, offensive emphasis, and defensive structure. At the player level, substantial heterogeneity is observed, with some players closely conforming to conference norms while others exhibit distinct roles that deviate from these patterns. These findings provide a quantitative basis for understanding stylistic identity in collegiate hockey and offer practical insights for scouting, player evaluation, and strategic decision-making.