About the Datasets

All datasets have been collected by professional scientists or research agencies who have kindly shared their data with Hudson Data Jam. Each dataset contains a link to a Google Sheet with the data and a brief introductory section describing the research.

You can view and sort the 30+ datasets at the bottom of this page.

For a Google Drive folder of sample graphs to guide student exploration, advisors can contact caryeducation@caryinstitute.org.

New This Year

  1. Want to use a dataset that is not listed in our collection? Email us at caryeducation@caryinstitute.org with a link to the proposed dataset or an attached file. Cary educators will evaluate whether the dataset may be used in the competition.
  2. Some datasets were removed from Data Jam. To improve our collection and the student experience, we carefully selected and updated many of our datasets for the 2023 competition. This included removing datasets that were rarely used or needed substantial updates. However, if you notice that a dataset you wanted to work with is missing, please email us! Your feedback is important.

Tips

How to work with a dataset: When you open a Google Sheet, click "File" in the upper left corner. From the drop-down menu, either 1) click "Download" to save it as a Microsoft Excel file, or 2) click "Make a Copy" to save a copy of the Google Sheet to a personal Google Drive folder. Please do not "Request Access" to the dataset's Google Sheet. Thank you!

You can also work with select datasets in Tuva; see details below.

Tuva & the Hudson Valley Data Portal
We’ve partnered with TuvaLabs, Inc. to host many* of our Data Jam datasets on its interactive graphing platform. Students can drag and drop variables directly onto the axes and build graphs in seconds, without the complexity of manipulating a spreadsheet.

*Note: Many, but not all, of our datasets are on the Tuva platform. More of the Level 1 and Level 2 datasets are available on Tuva than Level 3 datasets. If a Data Jam dataset is available in Tuva, there will be a link to it on its associated webpage.

Hudson Valley Data Portal

Explanation of Levels
Dataset levels are based on the number of factors in the dataset and the amount of data collected. We suggest that elementary students begin with Level 1 datasets, especially if it's their first competition. Most middle schoolers will be successful with a Level 1 or 2 dataset, and the appropriate level for high schoolers depends on their data experience and determination. Drop us a line if you need help selecting an appropriate dataset for your student.

Level 1 = Easy
Level 2 = Moderate
Level 3 = Challenging

Enhanced
Includes an additional PDF with background information and extra resources. These topics are a good starting place for students who are new to data analysis.
