Data Challenge

Overview

Curling is a sport where two teams take turns sliding stones down a sheet of ice towards a target. The most common form of the game features teams with four players, with each player getting two shots per round, known in curling as an “end”. When one player shoots, two of their teammates follow the stone down the ice, sweeping its path with brooms to direct it to the desired position. The team with the stone closest to the center of the target at the completion of the end scores one point for each of its stones that is closer to the center than the opponent’s closest stone. After a set number of ends (usually 8 or 10), the team with more points is the winner.

A primer on curling rules (both 4-person and mixed doubles) can be found here.

Mixed Doubles Curling presents a unique analytics opportunity described as the "Wild West of Curling" due to its distinctive format and strategic complexity. Unlike traditional curling, Mixed Doubles features:

  • 8 ends total.
  • 2 pre-placed stones.
  • Power play. Each team is allowed to opt for a “power play” once per game when they have the hammer. When a power play is being used, the pre-placed stones are moved out to one of the sides. See the primer for an image depicting the pre-placed stones during most ends, and the pre-placed stones when a power play is used.
  • Limited existing analytics - most available data comes from the Curlit website
  • High strategic variance - teams show great disparity in how they use power plays
  • Real-world application - coaches actively need these insights for upcoming competitions

The full rules of all forms of curling can be found here.

This Data Challenge focuses on power play optimization - when to use it, how to execute it, and how to defend against it - using data from international competition results that show final stone positions.

Questions to Consider

Since mixed-doubles curling is fairly new, there are a variety of topics that have not previously been studied. Below are a few example topics. However, we note that this list is far from exhaustive. Participants should feel free to study any aspect of mixed-doubles curling that interests them, provided that the analysis would be useful to a coach, curler, or some other member of the curling community.

  • Strategic Timing
    • When is the optimal time to deploy the power play? At what score differential and in which end should teams use this advantage?
    • How does power play effectiveness vary by opponent? Can strategies be customized based on the opposing team's tendencies?
  • Shot Selection & Execution
    • What are the most effective opening sequences? Should the first shot come around the guard to the open side, or are there better approaches?
    • How do the first 3 shots correlate with end outcomes?
    • What's the relationship between shot quality and scoring? How often does drawing under a guard result in zero points scored?
  • Comparative Analysis
    • Why do some teams score significantly more on power plays? For example, what makes Great Britain (Bruce Mowat) more successful than Italy on power plays?
    • How often can teams realistically expect to score 3+ points? What conditions make big scores possible?
  • Defensive Considerations
    • How should teams defend against power play attacks? What defensive setups minimize opponents' scoring opportunities?
    • Risk vs. reward analysis. When should teams "punt" an end versus fighting for a steal?

Student Deliverables Could Include:

  • Power play timing optimization model
  • Shot selection probability calculator (i.e., likelihood of scoring or giving up 0, 1, 2, 3, etc., points in the end based on the current stone positions and probability of success of upcoming shot options)
  • Team-specific strategy recommendations (based on considering the teams who will be competing at the Olympics)
  • Defensive positioning guidelines
  • Performance benchmarking across international teams

Available Data:

Shot-by-shot stone positions from Curlit are available on Github for the following competitions:

  • Beijing 2022 Olympic Winter Games
  • World Mixed Doubles Curling Championship 2023
  • World Mixed Doubles Curling Championship 2024
  • World Mixed Doubles Curling Championship 2025

These data correspond to the shot-by-shot images that are found e.g. in this 2025 World Mixed Doubles Championship Results Book. See Curlit for results from other competitions.

There are 26,000+ observations in the data, each of which corresponds to one throw during one end of a game in one of the competitions listed above. The data are stored in six CSV files, whose columns contain the following information:

Competition.csv:

  • CompetitionName – Name of the competition.
  • Place – City/Country where the competition was held.
  • Venue – Venue where the competition was held.
  • CompetitionID – Unique ID for the competition.

Competitors.csv:

  • CompetitionID – ID for the competition (from Competitions.csv).
  • TeamID – ID for the team. Note: Team IDs across different competitions are associated with nations, not necessarily the individuals on that team.
  • NOC – National Olympic Committee, the nation of the competing team.
  • Reportingname – Name of the competing athlete (LAST NAME First Name).

Ends.csv:

  • CompetitionID – ID for the competition (from Competitions.csv).
  • SessionID – ID of the draw/round for the match.
  • GameID - ID for the match. Note: A unique ID for a match can be made from a combination of Competition, Session and Game IDs.
  • TeamID – ID for the team (from Teams.csv).
  • EndID – ID for an end of the match.
  • Result – The points scored by that team in the end.
  • PowerPlay – A flag for when a team uses their power play in an end. A value of 1 means the pre-placed stones are moved to the right side, and a value of 2 means they are moved to the left side. An empty value means the team is not using their power play.
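
As a minimal sketch of how the Ends.csv columns above fit together, the following Python snippet builds the composite match key from the Competition, Session and Game IDs and decodes the PowerPlay flag. The sample rows are invented for illustration, and the helper names (game_key, decode_power_play, power_play_ends) are hypothetical. It also assumes the column headers in the file match the names listed above exactly, which should be checked against the actual CSV.

```python
import csv
import io

# Hypothetical rows in the Ends.csv layout described above; all values are invented.
SAMPLE_ENDS = """CompetitionID,SessionID,GameID,TeamID,EndID,Result,PowerPlay
1,3,2,10,1,2,
1,3,2,11,1,0,
1,3,2,10,2,0,
1,3,2,11,2,3,1
"""


def game_key(row):
    """Combine the three IDs that together identify one match."""
    return (row["CompetitionID"], row["SessionID"], row["GameID"])


def decode_power_play(value):
    """Map the PowerPlay flag to a side; an empty value means no power play."""
    value = (value or "").strip()
    if not value:
        return None
    # int(float(...)) also accepts "1.0", in case the flag was saved as a float.
    return {1: "right", 2: "left"}.get(int(float(value)))


def power_play_ends(rows):
    """Return one record per team-end in which a power play was used."""
    records = []
    for row in rows:
        side = decode_power_play(row["PowerPlay"])
        if side is not None:
            records.append({
                "game": game_key(row),
                "team": row["TeamID"],
                "end": row["EndID"],
                "side": side,
                "points": int(row["Result"]),
            })
    return records


rows = list(csv.DictReader(io.StringIO(SAMPLE_ENDS)))
power_plays = power_play_ends(rows)
print(power_plays)
```

To run this on the real data, the StringIO sample could be replaced with an open file handle for Ends.csv. Note that the IDs are kept as strings here; converting them to integers is a design choice left to the reader.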

Games.csv:

  • CompetitionID - ID for the competition (from Competitions.csv).
  • SessionID - ID of the draw/round for the match.
  • GameID - ID for the match. A unique ID for a match can be made from a combination of Competition, Session and Game IDs.
  • GroupID – In competitions where teams are split into groups for play, the ID of the group for the teams in this match.
  • Sheet – The name of the sheet on which the match is played.
  • NOC1 – National Olympic Committee of the first competing team.
  • NOC2 - National Olympic Committee of the second competing team.
  • ResultStr1 – Final Score of Team 1.
  • ResultStr2 – Final Score of Team 2.
  • LSFE – Last Stone First End. This column indicates which team threw the last stone in the first end of the match. In curling parlance, this is called starting with “the hammer”. A 0 value means that NOC2 threw the last stone in the first end, a 1 means that NOC1 threw last.
  • Winner – Indicates the winning team. 0 indicates that NOC2 won the match, 1 indicates that NOC1 won the match.
  • TeamID1 – ID for the team corresponding to NOC1 and ResultStr1 (from Teams.csv).
  • TeamID2 - ID for the team corresponding to NOC2 and ResultStr2 (from Teams.csv).
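
The 0/1 coding of LSFE and Winner is easy to invert by accident, since 1 refers to NOC1 and 0 refers to NOC2. The sketch below shows one way to decode both flags. The sample row and its NOC codes are invented, and the helper names are hypothetical.

```python
# Hypothetical Games.csv row; the NOC codes and flag values are invented.
sample_game = {"NOC1": "AAA", "NOC2": "BBB", "LSFE": "0", "Winner": "1"}


def _pick_team(game, flag_column):
    """Return NOC1 when the flag is 1 and NOC2 when the flag is 0."""
    flag = int(game[flag_column])
    if flag not in (0, 1):
        raise ValueError(f"unexpected {flag_column} value: {flag}")
    return game["NOC1"] if flag == 1 else game["NOC2"]


def first_end_hammer(game):
    """Team that threw the last stone in the first end (started with the hammer)."""
    return _pick_team(game, "LSFE")


def match_winner(game):
    """Team that won the match."""
    return _pick_team(game, "Winner")


print(first_end_hammer(sample_game), match_winner(sample_game))
```

Raising an error on any value other than 0 or 1 is a deliberate choice in this sketch, so that unexpected or missing codes surface early rather than being silently assigned to NOC2.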

Stones.csv:

  • CompetitionID – ID of the competition in which the match was played (from Competitions.csv).
  • SessionID – ID of the draw/round for the match.
  • GameID – ID for the match. A unique ID for a match can be made from a combination of Competition, Session and Game IDs.
  • EndID – The end in which the shot is taking place.
  • ShotID – ID for the shot. Within a particular match and end, shots occur in ascending order of the ShotID values (i.e. lower ShotIDs are thrown first).
  • TeamID – ID for the Team (from Teams.csv).
  • PlayerID – ID for the player taking the shot (1 or 2).
  • Task – The type of shot that the player is throwing, or the objective of the shot. Details to come.
  • Handle – The turn of the stone as it is thrown. A 0 value means the shot is turning clockwise, a 1 indicates counterclockwise.
  • Points – An assessment of the execution of the shot, ranging from 0 to 4. A 4-point shot is one that has been ascertained to have been perfectly executed to the player’s intention, while a 0 is a shot that totally failed in its intended result. Note that these points are not the same as points awarded to teams after ends, but simply an evaluation of a shot’s effectiveness.
  • TimeOut – Binary variable for whether a time out was called before the shot.
  • Stone_{i}_x – The x position of stone i on the sheet, after the stone for the row has been thrown. Stones can take x values in (0,1500). The value 4095 is a sentinel value indicating that the stone has been knocked off the sheet and is no longer in play. The value 0 indicates that the stone has not yet been thrown in this end.
  • Stone_{i}_y – The y position of stone i on the sheet, after the stone for the row has been thrown. Stones can take y values in (0,3000). Again, the value 4095 indicates the stones have been knocked off the sheet and aren’t in play, and the value 0 indicates the stones have not yet been thrown.

A Note on stones: Stone i does not necessarily correspond to ShotID i. The first six stones are the stones of the team that is throwing first in the end, while stones 7-12 are for the other team in that end. Stone_1 and Stone_7 are the pre-placed stones, and teams throw the rest of their stones in order. For example, a team going first in an end would have stones 2-6 and throw them in ascending order.
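
The sentinel values and the stone numbering described above can be handled with a small decoding step before any positional analysis. The following is a sketch under stated assumptions: the column names are taken to follow the Stone_{i}_x / Stone_{i}_y pattern literally (e.g. Stone_1_x), a stone is treated as out of play if either coordinate equals 4095, and as not yet thrown only if both coordinates equal 0. The function name and the example row are hypothetical.

```python
OUT_OF_PLAY = 4095  # sentinel: stone knocked off the sheet
NOT_THROWN = 0      # sentinel: stone not yet thrown in this end


def decode_stones(row):
    """Classify the 12 stones of one Stones.csv row and attach their positions."""
    stones = []
    for i in range(1, 13):
        x = int(float(row[f"Stone_{i}_x"]))
        y = int(float(row[f"Stone_{i}_y"]))
        if x == OUT_OF_PLAY or y == OUT_OF_PLAY:
            status = "out_of_play"
        elif x == NOT_THROWN and y == NOT_THROWN:
            status = "not_thrown"
        else:
            status = "in_play"
        stones.append({
            "stone": i,
            # Stones 1-6 belong to the team throwing first in the end, 7-12 to the other team.
            "side": "throws_first" if i <= 6 else "throws_second",
            # Stone_1 and Stone_7 are the pre-placed stones.
            "pre_placed": i in (1, 7),
            "status": status,
            "position": (x, y) if status == "in_play" else None,
        })
    return stones


# Invented example: both pre-placed stones in play, stone 2 knocked off the sheet.
example_row = {f"Stone_{i}_{axis}": "0" for i in range(1, 13) for axis in "xy"}
example_row.update({
    "Stone_1_x": "750", "Stone_1_y": "800",
    "Stone_7_x": "760", "Stone_7_y": "1200",
    "Stone_2_x": "4095", "Stone_2_y": "4095",
})
decoded = decode_stones(example_row)
in_play = [s["stone"] for s in decoded if s["status"] == "in_play"]
print(in_play)
```

Replacing the sentinel coordinates with None keeps values such as 4095 from leaking into distance calculations or plots.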

Teams.csv:

  • CompetitionID - ID for the competition (from Competitions.csv).
  • TeamID - ID for the team.
  • NOC - National Olympic Committee, the nation of the competing team.
  • Name – Name of the country corresponding to NOC abbreviation.

More data from additional competitions may be provided at a later time if it becomes available.

Data Challenge Registration

Given the challenge's popularity, we would like to know the number of participating teams to adequately prepare and recruit the necessary judges. Please register for the data challenge by December 15, 2025 at:

Data challenge registration form

This registration is for the data challenge only. Registration for the conference will be available at a later date.

Submission

Students must submit a zip file containing:

  • A PDF report describing their results (max 3000 words). Please do not put the author names on the first page of the report, just the title. The project will be linked to the name/email address that you used to submit your report. Only the organizers will see that information, not the judges, so the judging will be anonymized.
  • A folder with
    • Documented code files
    • A README file describing what each file does.
    • (Optional) If you used publicly available data that wasn't provided by the data challenge, please include it along with the code you used to obtain it. If those additional data files are too large to be uploaded here, you can upload them into a shared drive (e.g. Google Drive, OneDrive, Dropbox, Box, GitHub LFS), use a link for the shared file in your code, and make note of this in the README file. In other words, include whatever is necessary to make sure that your work is reproducible. Please do not include the data provided in the data challenge. We have that data already!
    • (Optional) Students can include other files (e.g. app code or other supporting documents) as well. Any code or apps included need to be self-contained and able to run on judges' computers without modification. Include these files in your .zip file submission if you have them.

Eligibility

The CSAS Data Challenge is open to students only. You must be enrolled as a high school, undergraduate, or graduate student at some point during the current academic year. Participants must register using their school email address.

Teams must enter one of the following two tracks:

  • High School / Undergraduate Track
  • Graduate Track

To be eligible for the High School / Undergraduate Track, ALL members of the team must be high school or undergraduate students. Each team can have up to 3 members. The team captain, if 18 years old or over, should fill out the registration form for the entire team using their school email addresses. If all team members are under 18, a faculty advisor needs to be the point of contact and register for the team.

Judging Criteria

A panel of judges from across academia and the sports industry will judge your submissions based on the following:

  • How original is the analysis?
  • How applicable is the analysis?
  • How appropriate were the methods used?
  • How well did you communicate your findings? This includes both written text and visualizations. How did the use of facts, data-supported narratives, anecdotes etc. enhance your storytelling?

Prizes

Finalists (six teams: three high school/undergraduate and three graduate) will be invited to present their work at the 2026 CSAS, will receive some travel support, and will have their registration fees waived. Winning teams will have the opportunity to showcase their work to data scientists and practitioners at the 2026 U.S. Olympic & Paralympic Performance Innovation Summit (location and date TBD but likely mid/late 2026). The winning teams will also receive a cash prize and a plaque.

Webinar Introduction/Office hours

An introductory workshop led by members of the CSAS organizing committee and members of the curling industry is planned for late October or early November. We will give a short introduction and spend most of the time on Q&A.

  • Date/Time: TBD
  • Webinar Recording: coming soon!

Important Dates

  • Data challenge release: October 10, 2025
  • Webinar introduction: TBD. Recording coming soon!
  • Participation registration: December 15, 2025
  • Submission deadline: January 15, 2026
  • Finalists notified: February 15, 2026
  • 2026 CSAS: April 10-11, 2026