UCSAS 2024 USOPC Data Challenge

Submissions closed on January 15, 2024.

Finalists

(In alphabetical order by team name)

  • High school/undergraduate division
    • Team Blue Devil Statistics Magicians of Duke University: Benjamin Thorpe (captain); Sean Li; Christopher Tsai
    • Team ISM of Montgomery Blair High School: Isabella Yang (captain); Sophia Zeng; Michael Wang
    • Team MaJiC of Yale University: Mofeed Nagib (captain); Carly Benson; Jason Chu
  • Graduate division
    • Team David-Siddharth-Abby of Yale University: David Metrick (captain); Siddharth Chandrappa; Abby Spears
    • Team Lightbulb of Yale University: Sahil Singh (captain); Raymond Lee
    • Team Trojans Fighting of University of Southern California: Shilong Ren (captain); Xiaolu Xu

Table of Contents

  1. Overview
  2. Data
  3. Submission
  4. Eligibility
  5. Judging Criteria
  6. Terms of Participation
  7. Prizes
  8. Webinar Introduction
  9. Important Dates
  10. Qualifiers for Paris 2024
  11. Q&A Summary

Overview

For this data challenge, your goal is to identify the group of 5 athletes who will enable the USA Olympic Men’s and Women’s Artistic Gymnastics teams to optimize success in Paris 2024. You are tasked with developing an analytics model that can be used to identify and compare the expected medal count in the 8 medal events for men (team all-around, individual all-around, floor exercise, pommel horse, still rings, vault, parallel bars, and high bar) and 6 medal events for the women (team all-around, individual all-around, vault, uneven bars, balance beam, and floor exercise).

In total, 192 artistic gymnasts will compete at Paris 2024: 96 men and 96 women. Team events will feature 12 teams of 5 athletes each in the men’s and women’s events.* For countries that do not win a full team entry for Paris 2024, a maximum of 3 individuals per country will be able to qualify. Those remaining 36 entries for each gender will be determined by results of the 2023 World Championships, the 2024 World Cup Series, and the 2024 Continental Championships.

For each gender, 3 teams qualified at the 2022 World Championships in Liverpool, England. In the men’s competition, teams from China, Japan, and Great Britain qualified. In the women’s competition, teams from the United States, Great Britain, and Canada qualified. For each gender, 9 other countries will qualify teams based on their placements at the 2023 World Championships in Antwerp, Belgium, which will take place from September 30 to October 8, 2023.

*Note that at Tokyo 2020, men’s and women’s teams were composed of 4 athletes. Countries with full teams could also qualify up to 2 additional athletes to compete in Tokyo as individuals. The Team USA men had a 4-person team of Brody Malone, Sam Mikulak, Yul Moldauer, and Shane Wiskus. The USA men had one individual athlete, Alec Yoder, who could compete in the qualifying round at the Olympics to potentially qualify for the individual all-around and the apparatus finals. Similarly, the Team USA women had a 4-person team of Simone Biles, Jordan Chiles, Sunisa Lee, and Grace McCallum. The USA women had two individual athletes, Jade Carey and MyKayla Skinner, who could compete in the qualifying round at the Olympics to potentially qualify for the individual all-around and the apparatus finals.

According to recent news, gymnasts from Russia and Belarus will be allowed to take part in sanctioned competitions as “individual neutral athletes” from the start of 2024. Based on that information, Russia and Belarus will not be able to qualify for team artistic gymnastics events in Paris, but they would each be able to qualify up to 3 athletes to compete as individuals, as described above.

The problem of choosing a team is complicated by the structure of the Olympic competition. For both the men and women, the athletes first compete in a qualifying round in which their scores are used to determine advancement of teams (i.e., countries) to the team all-around final and individuals to the individual all-around final and apparatus finals. In qualifying, 4 of the 5 athletes on each team compete on each apparatus, so not every athlete will compete on all apparatus. The athletes representing countries that did not qualify a full team may participate on all apparatus in the qualifying round.

The top 8 teams in qualifying advance to the team final based on the sum of the top 3 out of 4 scores on each apparatus (“4 up, 3 count”, for a total of 18 scores for men and 12 scores for women). Athletes must compete on all apparatus in qualifying to be eligible for the individual all-around final. The top 24 athletes qualify for the individual all-around final, with a maximum of two gymnasts per country. The top 8 athletes on each apparatus qualify for the final in that apparatus, again with a maximum of 2 gymnasts per country.
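The qualifying rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the apparatus codes, and all scores are made up, ties are broken by input order, and the real competition applies tie-break rules that are not modeled here.

```python
from collections import defaultdict

def team_qualifying_total(scores_by_apparatus):
    """Sum the best 3 of up to 4 scores on each apparatus ("4 up, 3 count")."""
    total = 0.0
    for apparatus, scores in scores_by_apparatus.items():
        if len(scores) > 4:
            raise ValueError(f"{apparatus}: at most 4 athletes may compete")
        total += sum(sorted(scores, reverse=True)[:3])
    return total

def qualifiers(results, n_spots, max_per_country=2):
    """Take the top n_spots athletes, skipping anyone beyond the country cap.

    results is a list of (athlete, country, score) tuples.
    """
    taken = defaultdict(int)
    selected = []
    for athlete, country, _score in sorted(results, key=lambda r: r[2], reverse=True):
        if len(selected) == n_spots:
            break
        if taken[country] < max_per_country:
            taken[country] += 1
            selected.append(athlete)
    return selected

# Made-up women's team scores: 4 apparatus, 4 athletes on each.
demo_team = {
    "VT": [14.5, 14.2, 13.9, 13.0],
    "UB": [14.8, 14.1, 13.5, 12.9],
    "BB": [13.9, 13.7, 13.2, 12.0],
    "FX": [14.0, 13.8, 13.6, 13.1],
}
print(round(team_qualifying_total(demo_team), 1))  # 167.2

# Made-up apparatus results: the third USA athlete is skipped by the cap.
demo_results = [
    ("A", "USA", 15.0), ("B", "USA", 14.9), ("C", "USA", 14.8),
    ("D", "CHN", 14.7), ("E", "JPN", 14.6),
]
print(qualifiers(demo_results, 3))  # ['A', 'B', 'D']
```

Calling `qualifiers(results, 24)` would correspond to the individual all-around final and `qualifiers(results, 8)` to an apparatus final.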

In the team all-around, individual all-around, and individual apparatus finals, all athletes’ scores from qualifying are thrown out. In the team all-around final, the medalists are determined by the sum of the 3 scores on each apparatus (“3 up, 3 count”, for a total of 18 scores for men and 12 scores for women). An athlete’s scores in the team all-around final have no effect on their scores in the individual all-around final, and in the individual apparatus finals, all previous scores from qualifying, the team all-around final, and the individual all-around final are thrown out.

The goal of this data challenge is to put together the best possible men’s and women’s Team USA Olympic Artistic Gymnastics teams; however, the meaning of “best” is left open to the interpretation of the entrant.
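As one possible reading of “best”, a team could be scored by its expected team final total under the “3 up, 3 count” rule and every 5-athlete combination ranked by that number. The sketch below assumes a hypothetical table of expected scores per athlete and apparatus (athletes A to F and all values are invented); a real entry would replace it with model output and would likely weigh individual medals as well.

```python
from itertools import combinations

def team_final_total(team, expected):
    """Expected team final total: the 3 best expected scores per apparatus.

    All 3 scores count in the final, and any 3 of the 5 team members may be
    chosen, so this assumes the 3 strongest athletes go up on each apparatus.
    """
    apparatus = list(next(iter(expected.values())))
    return sum(
        sum(sorted((expected[a][app] for a in team), reverse=True)[:3])
        for app in apparatus
    )

def rank_teams(expected, team_size=5):
    """Return every team of team_size athletes, best expected total first."""
    teams = combinations(sorted(expected), team_size)
    return sorted(teams, key=lambda t: team_final_total(t, expected), reverse=True)

# Hypothetical expected scores for six candidates on the women's apparatus.
expected = {
    "A": {"VT": 14.5, "UB": 14.0, "BB": 13.5, "FX": 14.0},
    "B": {"VT": 14.0, "UB": 14.5, "BB": 13.0, "FX": 13.5},
    "C": {"VT": 13.5, "UB": 13.0, "BB": 14.5, "FX": 13.0},
    "D": {"VT": 13.0, "UB": 13.5, "BB": 14.0, "FX": 14.5},
    "E": {"VT": 15.0, "UB": 12.0, "BB": 12.0, "FX": 12.5},
    "F": {"VT": 12.0, "UB": 12.5, "BB": 12.5, "FX": 12.0},
}

best = rank_teams(expected)[0]
print(best, team_final_total(best, expected))
# ('A', 'B', 'C', 'D', 'E') 169.5
```

In this toy data the vault specialist E makes the team even though E is the weakest candidate on two apparatus, which is the kind of trade-off the challenge asks entrants to explore. Exhaustive enumeration is only practical for a small candidate pool.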

For example, a few questions that students could consider for their entry include:

  • How would your recommended team differ if you are trying to maximize total medal count, gold medals, or a weighted medal count (e.g., 3 for gold, 2 for silver, 1 for bronze)?
  • How would your recommended team change if you consider a team all-around medal to be more valuable than the individual all-around medals and/or if you consider the individual all-around medals to be more valuable than the individual apparatus medals?
  • Can Team USA maximize its total medal count by selecting a team of 5 gymnasts who are all-around gymnasts, event specialists (gymnasts who focus on 1 or more apparatus but not all apparatus), or a combination of those?
  • Under what circumstances can Team USA maximize its total medal count by selecting a gymnast who only competes on 1 apparatus (e.g., Stephen Nedoroscik, 2021 pommel horse World Champion)?
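To see how the choice of objective can change the recommendation, here is a small sketch that compares two hypothetical lineups under total medal count, gold medals, and the 3-2-1 weighted count mentioned above. The lineups and their medal counts are invented for illustration; in practice the counts would be expected values produced by a model.

```python
DEFAULT_WEIGHTS = {"gold": 3, "silver": 2, "bronze": 1}

def total_medals(medals):
    """Unweighted medal count."""
    return sum(medals.values())

def weighted_medals(medals, weights=DEFAULT_WEIGHTS):
    """Weighted medal count, e.g. 3 for gold, 2 for silver, 1 for bronze."""
    return sum(weights[color] * count for color, count in medals.items())

# Invented medal outcomes for two candidate lineups.
lineup_a = {"gold": 1, "silver": 2, "bronze": 3}
lineup_b = {"gold": 3, "silver": 1, "bronze": 0}

for name, medals in [("A", lineup_a), ("B", lineup_b)]:
    print(name, total_medals(medals), medals["gold"], weighted_medals(medals))
# A 6 1 10
# B 4 3 11
```

Lineup A is ahead on total medals, while lineup B is ahead on gold medals and on the weighted count. Passing a different `weights` dictionary, or summing per event with event-specific multipliers, would express the other value judgments raised in the questions.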

Data

Data from major domestic and international gymnastics competitions from the seasons leading up to the 2020 Tokyo and 2024 Paris Olympics will be provided to entrants. Because the Code of Points scoring system is changed each Olympic cycle, data from the years (2017-2021) leading up to the 2020 Olympics, which actually took place in 2021, should be used as a separate data set from the data from competitions in the 2022 and 2023 seasons leading up to the 2024 Olympics.
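A minimal sketch of keeping the two scoring cycles apart is shown below. It assumes each record is a dictionary with a hypothetical `year` field; the provided files may name or encode the competition date differently, so the field name would need adjusting.

```python
from collections import defaultdict

# Season ranges as described above; the labels are illustrative.
CYCLES = {
    "Tokyo 2020": range(2017, 2022),  # 2017-2021 seasons
    "Paris 2024": range(2022, 2024),  # 2022-2023 seasons
}

def cycle_for(year):
    """Return the cycle label for a season year, or None if outside both."""
    for name, years in CYCLES.items():
        if year in years:
            return name
    return None

def split_by_cycle(records):
    """Group records by cycle so the two data sets are never mixed."""
    groups = defaultdict(list)
    for record in records:
        name = cycle_for(record["year"])  # "year" is an assumed field name
        if name is not None:
            groups[name].append(record)
    return dict(groups)

# Invented records for illustration.
sample = [
    {"athlete": "X", "year": 2019, "score": 14.1},
    {"athlete": "X", "year": 2021, "score": 14.3},
    {"athlete": "X", "year": 2023, "score": 14.6},
]
by_cycle = split_by_cycle(sample)
print({name: len(rows) for name, rows in by_cycle.items()})
# {'Tokyo 2020': 2, 'Paris 2024': 1}
```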

The cleaned data is on GitHub, which is being actively updated. The last update of data will be the 2023 World Artistic Gymnastics Championships in Belgium, which ends in early October 2023.

Entrants can use any additional data that they choose as long as that data is publicly available, but the additional data must be included in the final submission for reproducibility.

Submission

Please submit here: Submission Link!

Students must submit a zip file containing:

  • A pdf report describing their results (max 3000 words)
  • A folder with
    • Documented code files
    • A README file describing what each file does.

Note: Students can include other files including any app code or supporting documents. Any code or apps included need to be self-contained and able to run on reviewers’ computers without modification.

Eligibility

The UCSAS 2024 Data Challenge is open to students only. You must be enrolled as a high school, undergraduate, or graduate student at some point during the 2023-24 academic year. Participants must register using their school email address.

Teams must enter one of the following two tracks:

  • High School / Undergraduate Track
  • Graduate Track

To be eligible for the High School / Undergraduate Track, ALL members of the team must be high school and/or undergraduate students. Each team can have up to 3 members. The team captain, if 18 years old or over, should fill out the registration form for the entire team using their school email addresses. If all team members are under 18, a faculty advisor needs to be the point of contact and register for the team.

Judging Criteria

A panel of judges from across academia and the sports industry will judge your submissions based on the following:

  • How original is the analysis?
  • How applicable is the analysis?
  • How appropriate were the methods used?
  • How well did you communicate your findings? This includes both written text and visualizations. How did the use of facts, data-supported narratives, anecdotes etc. buttress your storytelling?

Terms of Participation

United States Olympic & Paralympic Committee IP (logos, terminology, etc.) may not be used in any manner, without prior written consent of the USOPC. USOPC terminology examples include but are not limited to: Olympic, Paralympic, Olympian, Paralympian, United States Olympic & Paralympic Committee (USOPC), Team USA.

UCSAS 2024 Data Challenge participants may not imply or publicize any relationship or association between themselves and the USOPC, Team USA and/or the Olympic and Paralympic Movements, without prior written consent of the USOPC.

For subsequent use of the project in personal portfolios and presentations other than those related to the UCSAS 2024 Data Challenge and the U.S. Olympic & Paralympic Data, Analytics & Technology Summit (for the Data Challenge winners), participants may anonymize the work product (for example, replace “USOPC” with “U.S. Sport Organization”). Federal law gives the USOPC extensive rights to control the use of USOPC IP in the United States and allows the USOPC to file a lawsuit against any entity using such IP without consent.

Note: The USOPC does not control the trademarks or logos owned by the National Governing Body (NGB) for a specific sport. For instruction on the proper use of those marks, please contact the relevant NGB directly.

Prizes

Finalists (Six teams: three high school / undergraduate and three graduate) will be invited to present their work at UCSAS 2024 in Storrs, CT, and will receive some travel support and have their registration fees waived. The winning teams will receive a cash prize (UCONN) and a plaque (UCONN). Additionally, one member of each winning team will have the opportunity to showcase their team’s work, with all travel, registrations, and expenses paid, at the 2024 U.S. Olympic & Paralympic Data, Analytics & Technology Summit in Colorado Springs, CO, providing them with increased exposure and potential opportunities for future collaborations. The runners-up will receive a certificate of achievement (UCONN) and recognition for their outstanding performance in the Data Challenge.

Webinar Introduction

Another webinar may be arranged depending on demand.

Important Dates

  • Data challenge release: August 4, 2023
  • Webinar introduction: September 14, 2023
  • Participation registration: November 1, 2023; given the challenge's popularity, we need to know the number of participating teams to adequately prepare and recruit the necessary judges.
  • Submission deadline: January 15, 2024
  • Finalists notified: February 15, 2024
  • UCSAS 2024: April 12-13, 2024

Qualifiers for Paris 2024

The countries and individual gymnasts who have qualified for Paris 2024 are listed here: Men's Qualifiers, Women's Qualifiers. Quota spots can be qualified by country (NOC – National Olympic Committee) or by athlete name, as explained below:

Criteria 1 and 2: After the World Championships in early October 2023, all full teams of 5 gymnasts in Men’s (12) and Women’s (12) Artistic Gymnastics qualifying to the Paris 2024 Olympics were determined. The gymnasts who will compete for each qualified country will be selected by their countries’ gymnastics federations. Each country’s gymnastics federation has its own process for selecting its athletes.

Criteria 3: At the World Championships in early October 2023, for the Men’s and Women’s events, 3 countries who did not qualify full teams for Paris 2024 earned one quota place each. Each of those countries’ gymnastics federations will select the athletes for those quota places.

Criteria 4 & 5: At the World Championships in early October 2023, for the Men’s and Women’s events, several athletes from countries that did not qualify full teams for Paris 2024 earned a quota place for themselves by name.

Criteria 6: 12 Men and 8 Women from countries who did not qualify full teams for Paris 2024 and who did not earn a quota place for themselves through Criteria 4 or 5 will earn quota places for themselves by name through the 2024 Apparatus World Cup Series, which concludes in April 2024.

Criteria 7: The highest ranked Men’s and Women’s athlete in the All-Around Qualifications results or the All-Around Final results at each Continental Championships from a country that did not qualify a full team for Paris 2024 and who did not earn a place for themselves by name through Criteria 4, 5, or 6 will earn a quota place for themselves by name.

At the Pan Am Games in October 2023, Audrys Nin Reyes (DOM) earned the Men’s quota place and Luisa Blanco (COL) earned the Women’s quota place for the Americas. These names are not yet indicated under Criteria 7, #1 PAGU on the attached lists. The other Continental Championships will take place in April and May 2024.

Host Country Place: If France does not qualify a Man to Paris 2024 through Criteria 6 or 7, the Host Country quota place will be allocated by name to Leo SALADINO, the highest ranked Men’s French athlete based on the All-Around Qualifications results of the 2023 World Championships. Because France qualified a full Women’s team to Paris 2024, the Host Country Place for Women has already been assigned to Rifda IRFANALUTHFI (INA).

NOTE: Any country that did not qualify a full Men’s team for Paris 2024 can qualify a maximum of 3 individual athletes. The same rule applies to the Women. For example, the German (GER) Women have already qualified the maximum of 3 quota places: a quota place by country (Criteria 3), Pauline Schaefer-Betz (Criteria 4), and Sarah Voss (Criteria 5).

Q&A Summary

  1. Question: Where can we find and refer back to this PowerPoint after tonight's seminar finishes?
    Answer: The slides and video recording will be made available on the website.

  2. Question: What does the "all around" mean?
    Answer: In the team all around qualification, 4 of the 5 members of a team compete on each apparatus and the top 3 scores on each apparatus count. In the team all around final, 3 of the 5 members compete on each apparatus and all 3 scores on each apparatus count. In the individual all around, each of the 24 qualified athletes performs on each apparatus once.

  3. Question: Is the qualification to qualify for the Olympics or to qualify for the finals of the Olympics?
    Answer: The qualification is to qualify for the team all around final, individual all around, and event finals within the Olympic competition itself, not to qualify for the Olympics.

  4. Question: Will all data be posted on the website?
    Answer: All data for the competitions provided, with the last ones being the 2023 World Championships and the 2023 Asian Games in early October, will be posted on GitHub. Additional publicly available data can also be used, but it must be included with the project submission.

  5. Question: If athletes have similar execution scores, difficulty scores, and performance history, does Team USA consider defending Olympians?
    Answer: For this analysis, probably not, but it’s up to participants if they want to try to incorporate it. There is nothing mentioned in the USA Olympic Selection Procedures for Artistic Gymnastics about consideration based on participation in previous Olympic Games.

  6. Question: In events with two vaults, how much time is allowed between competitors' vaults?
    Answer: There may be rules around timing, but it's generally dictated by judges being ready. Athletes are typically ready before judges give the signal to compete.

  7. Question: Should we develop a model to predict medal counts for combinations, or pick 5 specific athletes?
    Answer: Develop a model for combinations first, but also provide a rank order of combinations and the best team of 5.

  8. Question: Can the final submission content just be the model, or also the 5 athletes the model selects?
    Answer: Include all components - report, code, data. Highlight the optimal teams of 5 in the report.

  9. Question: Can teammates be from different schools?
    Answer: Yes.

  10. Question: Can there be multiple teams from the same university?
    Answer: Yes, no limit.

  11. Question: For missing scores, is it appropriate to replace with zeros?
    Answer: Missing scores likely mean the gymnast didn’t compete. If the gymnast didn’t compete, delete those records rather than averaging in zeros. However, there are rare cases in which the gymnast competed and earned a score of zero, such as with Donnell Whittenburg on his Vault during the team all around qualification at the 2022 World Championships.

  12. Question: How are collegiate and elite scoring related?
    Answer: Very little relationship. Men's college uses elite scoring, but women's college is very different.

  13. Question: Should there be separate models for men and women?
    Answer: Same modeling approach, but analyze men and women separately.

  14. Question: Do all countries submit team members at the same time?
    Answer: No, each country determines their team at different times after their trials/nationals.

  15. Question: Should results include alternates?
    Answer: An element of originality if you can do it and it provides additional insight to your analysis.

  16. Question: Will the scores of the Asian Games be included in the dataset?
    Answer: Yes, the Asian Games taking place Sept 23 - Oct 8 will also be included.

  17. Question: What is D score and E score in the dataset?
    Answer: D score is the difficulty score, determined by the skills performed and their difficulty value. E score is the execution score, starting from 10 and with deductions for errors in form, landing, etc.

  18. Question: How is the originality of the analysis judged?
    Answer: What you have done that is different from the norm and demonstrates additional insight.

  19. Question: How do we communicate our results?
    Answer: Visualizations in code can be included; select key graphs for the written report.

  20. Question: I observed missing values in the dataset - how to handle?
    Answer: Missing scores likely mean no competition on that apparatus. Delete records rather than imputing zeros.

  21. Question: Do the extra 36 gymnasts qualify for specific apparatus and/or the all-around, and then only compete in those events for which they qualified?
    Answer: The extra 36 Men and the extra 36 Women, who will qualify for Paris through a variety of competitions and both all-around and individual apparatus events, may participate on all apparatus in the Qualifications in Paris. All 96 Men (60 on full teams + 36 extra) in Paris will compete in Qualifications on the first day of competition for Men’s Artistic Gymnastics (July 27, 2024). Likewise, all 96 Women (60 on full teams + 36 extra) in Paris will compete in Qualifications on the first day of competition for Women’s Artistic Gymnastics (July 28, 2024).

  22. Question: Do the 3 gymnasts in the team final for an apparatus have to be chosen from among the 4 that did the quals, or can they be any 3 of the 5 gymnasts on the team?
    Answer: Any 3 of the 5 gymnasts on the team can be chosen for an apparatus in the team final.

  23. Question: Can a country have all 5 gymnasts compete in the quals for an apparatus, even though only 4 count for the team event (maybe they are all good at bars and have a chance to compete for the apparatus finals)?
    Answer: No, only 4 of the 5 gymnasts from a country may compete on an apparatus in the quals. If an athlete is thought to have a chance to compete for the apparatus finals, the country would typically have that athlete compete in the quals to contribute to the team’s score and try to qualify for the apparatus finals.

  24. Question: So theoretically, a team could have one athlete sit the entire qualifiers and then that athlete could do all of the events in the finals provided the team qualified?
    Answer: Correct. The situation that came up in our discussions was that a country might have three gymnasts do all events so they're eligible for the all-around. Those three gymnasts might be third, fourth, and fifth on their team in, say, floor. Presumably the fourth person to do floor would be the best on the team in floor, which means the second best in floor doesn't do floor in the quals.
    The situation described could actually occur. For example, at the London 2012 Olympics the USA Women had 3 gymnasts (Aly Raisman, Gabby Douglas, Jordyn Wieber) do all events in qualifying, and they placed 2nd, 3rd, and 4th, respectively, in qualification for the Individual All-Around. Because of the 2 per country rule, Jordyn Wieber, the defending Individual All-Around World Champion, could not compete in the Olympic Individual All-Around.
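For the guidance on missing scores in answers 11 and 20 above, the following sketch shows one way to drop records with no score while keeping a genuine score of zero. It assumes records are dictionaries with a hypothetical `score` field in which a missing value is `None` or NaN; the provided files may mark missing values differently.

```python
import math

def has_score(record):
    """True when the record holds a real score, including a real 0.0."""
    score = record["score"]  # "score" is an assumed field name
    return score is not None and not math.isnan(score)

def mean_score(records):
    """Average over competed routines only; None if there are none."""
    scores = [r["score"] for r in records if has_score(r)]
    return sum(scores) / len(scores) if scores else None

# Invented vault records for one athlete.
records = [
    {"athlete": "X", "apparatus": "VT", "score": 14.0},
    {"athlete": "X", "apparatus": "VT", "score": None},  # did not compete
    {"athlete": "X", "apparatus": "VT", "score": 0.0},   # competed, scored zero
    {"athlete": "X", "apparatus": "VT", "score": 13.0},
]
print(mean_score(records))  # 9.0
```

Replacing the missing value with zero instead would give an average of 6.75 over four routines, which understates the athlete's level. Whether a real zero should stay in an average or be treated as an outlier is a modeling decision left to the entrant.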