Training Workshops

Six 50-minute workshops are scheduled in three parallel tracks.

Session A (Introductory Level)
Session B (Intermediate Level)
Session C (Advanced Level)

Introduction to R

Outline

R is an open-source programming language for statistical computing and graphics. It offers a wide range of graphical and statistical tools, including time-series analysis, classification, clustering, and linear and nonlinear modeling. This workshop introduces R to those with little or no prior experience. Topics include: 1) an overview of basic R; 2) data structures in R; 3) data management in R; and 4) some useful R packages. A real-life sports dataset will be used for illustration.

Instructor

Fusheng Yang is a fourth-year Ph.D. student in Statistics at UConn. Her research interests include time series analysis and extreme value analysis. She is a research assistant at the Computational Climate Change Lab, UConn, working on detecting past heatwaves and predicting possible future heatwave events. She is also a teaching assistant for various undergraduate statistics courses in the Department of Statistics.

Prerequisites

A laptop with R/RStudio installed; previous experience using R is NOT required; basic programming knowledge would be helpful but NOT required.

Training Materials

On GitHub.


Introduction to Python

Outline

Python is a popular high-level language with many features that data scientists like: it is easy to learn, object-oriented, cross-platform, and open source, and it has many extensions for machine learning. It is widely used in data science work from the front end to the back end. Good Python programming makes it easier to analyze sports data. The workshop will cover the following topics in class: Python data types, methods for moving data, and functions.

Instructor

Charitarth Chugh is a second-year undergraduate student majoring in Computer Science at the University of Connecticut. This year, he is leading deep learning projects as the Secretary of the UConn Artificial Intelligence Club and creating open-source utilities.

Prerequisites

A laptop with Anaconda installed. Instructions to install Anaconda can be found here.

Training Materials

On GitHub.


Hockey Analytics

Outline

The hockey industry has become increasingly focused on analytics over the past twelve years, but it has not yet reached the level of sophistication and support that other sports, like baseball and basketball, have attained. In this workshop, you will learn to help close that gap by collecting and processing data and by analyzing and visualizing findings with the data science tools available in Python.

Instructor

Venkata Patchigolla graduated from UConn with a BS in Molecular and Cell Biology and is currently pursuing biomedical research. During the previous academic year, he served as president of the UConn Data Science Club, where he and his team created a series of workshops on data science fundamentals for their members.

Prerequisites

Familiarity with Python and access to Jupyter Notebooks. Some familiarity with hockey is recommended.

Training Materials

On GitHub.


Baseball Analytics with Python

Outline

Over the past decade, baseball has adapted to the digital age. The influx of data in the 21st century has given fans of America's pastime the ability to analyze and understand the game in ways previously thought impossible. Yet understanding how to obtain and use baseball data can be as complex as the sport itself. During this workshop, you will use Python to learn essential baseball analytics tools, the concept of sabermetrics, data visualization for baseball analytics, and more.
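As a taste of the kind of calculation sabermetrics involves, the sketch below computes four widely used batting rates (AVG, OBP, SLG, OPS) from a single season line. The outline does not say which metrics the workshop covers, so the choice of metrics, the BattingLine class, and the numbers in the example are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class BattingLine:
    """Hypothetical season totals for one hitter (illustrative values only)."""
    at_bats: int
    hits: int
    doubles: int
    triples: int
    home_runs: int
    walks: int
    hit_by_pitch: int
    sacrifice_flies: int

    def avg(self) -> float:
        # Batting average: hits per at-bat.
        return self.hits / self.at_bats

    def obp(self) -> float:
        # On-base percentage: times on base per plate appearance counted by OBP.
        reached = self.hits + self.walks + self.hit_by_pitch
        chances = (self.at_bats + self.walks
                   + self.hit_by_pitch + self.sacrifice_flies)
        return reached / chances

    def slg(self) -> float:
        # Slugging percentage: total bases per at-bat.
        singles = self.hits - self.doubles - self.triples - self.home_runs
        total_bases = (singles + 2 * self.doubles
                       + 3 * self.triples + 4 * self.home_runs)
        return total_bases / self.at_bats

    def ops(self) -> float:
        # On-base plus slugging.
        return self.obp() + self.slg()


# Made-up stat line, not a real player.
line = BattingLine(at_bats=500, hits=150, doubles=30, triples=5,
                   home_runs=20, walks=50, hit_by_pitch=5, sacrifice_flies=5)
print(f"AVG {line.avg():.3f}  OBP {line.obp():.3f}  "
      f"SLG {line.slg():.3f}  OPS {line.ops():.3f}")
```

Keeping each rate as its own method makes it easy to check one formula at a time against a published stat line.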

Instructor

Patrick Cummins is a third-year undergraduate student at UConn studying data science with a concentration in statistics. This past summer, he was a data analytics intern for Major League Baseball, where he worked with multiple analytics teams to enhance the fan experience with MLB products.

Prerequisites

Familiarity with Python, a general understanding of baseball, and access to Jupyter Notebooks or Google Colab.

Training Materials

On GitHub.


Web Scraping for Sports Data

Outline

If you want to start analyzing sports, you need data. Nowadays there are many sources of pre-built datasets, but would it not be wonderful if you could create your own? Web scraping makes this possible. It allows you to write automated scripts that quickly and efficiently gather data from webpages, so you can create datasets specific to the questions you want to answer. This workshop covers 1) a general web scraping pipeline; 2) static web scraping using the Python packages pandas, requests, and BeautifulSoup; 3) dynamic web scraping using the Python packages Selenium and sqlite3; and 4) the legality of web scraping.
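To give a rough idea of what the parse and store steps of a scraping pipeline can look like, here is a minimal sketch. It is not the workshop's material: the standard library's html.parser stands in for BeautifulSoup and an inline HTML string stands in for a page fetched with requests, so that the example runs offline. The table contents, team names, and the helper names TableParser, scrape_table, and store are hypothetical.

```python
from html.parser import HTMLParser
import sqlite3

# Hypothetical page content; in practice this would come from an HTTP request.
SAMPLE_HTML = """
<table id="scores">
  <tr><th>team</th><th>goals</th></tr>
  <tr><td>Huskies</td><td>3</td></tr>
  <tr><td>Eagles</td><td>1</td></tr>
</table>
"""


class TableParser(HTMLParser):
    """Collect the text of every <th>/<td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None       # cells of the row currently being read
        self._in_cell = False  # True while inside a <td> or <th>

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._in_cell = True
            self._row.append("")

    def handle_data(self, data):
        if self._in_cell:
            self._row[-1] += data.strip()

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._in_cell = False
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def scrape_table(html):
    """Parse step: turn the HTML table into a list of dicts keyed by header."""
    parser = TableParser()
    parser.feed(html)
    header, *body = parser.rows
    return [dict(zip(header, row)) for row in body]


def store(records, conn):
    """Store step: write the parsed records into a SQLite table."""
    conn.execute("CREATE TABLE IF NOT EXISTS scores (team TEXT, goals INTEGER)")
    conn.executemany(
        "INSERT INTO scores (team, goals) VALUES (?, ?)",
        [(r["team"], int(r["goals"])) for r in records],
    )
    conn.commit()


records = scrape_table(SAMPLE_HTML)
conn = sqlite3.connect(":memory:")  # in-memory database, nothing written to disk
store(records, conn)
total = conn.execute("SELECT SUM(goals) FROM scores").fetchone()[0]
print(records, total)
```

Separating parsing from storage means the same store function can be reused whether the HTML came from a static request or from a browser-driven session.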

Instructor

Hari Patchigolla is a third-year undergraduate student majoring in Computer Science with a concentration in Computational Data Analytics at UConn. He previously worked as a Data Science intern at Optum, where he mined large datasets to identify different types of leads. Hari is also the current Vice President of UConn’s Data Science Club, where he hosts various workshops.

Prerequisites

An understanding of or previous experience with Python; a laptop with access to Jupyter Notebooks or Google Colab.

Training Materials

On GitHub.


TensorFlow in Sports Analytics

Outline

This workshop demonstrates how to create a classification model using TensorFlow to make better predictions for NFL Fantasy Football. TensorFlow is an open-source machine learning framework used to build and run machine learning models, including deep neural networks, with high-level Python code. In this workshop we will 1) introduce TensorFlow and how it compares to other ML libraries; 2) introduce the conceptual foundations behind TensorFlow; 3) provide a brief introduction to fantasy football; 4) provide a game plan for the creation of the model; and 5) show and break down the code for the model.

Instructor

Pranav Tavildar is a fourth-year undergraduate at UConn pursuing a dual degree in Computer Science and Data Science. This year, he is leading a student-organized software development project as the President of the UConn Data Science Club.

Prerequisites

An intermediate understanding of Python and a general understanding of linear algebra are recommended to make the most of this workshop. Being a New England Patriots fan is not required but strongly recommended for this workshop and life in general.

Training Materials

On GitHub.