
Objectives
Students will create a scatter plot by hand from data and will understand the importance of replication and the intrinsic link between variability and the conclusions that can be drawn from data.
Overview
- Students complete the activity using the provided worksheet, doing several rounds of ‘sampling’ (drawing from cups) of Hudson River fecal indicator data collected by Riverkeeper.
- Students answer reflective questions at the end of the worksheet.
- Optional: Students create a bar graph of the same data and compare and contrast these two different types of data displays.
Materials
- Black-and-white printed resources (see below)
- Pencil
- Colored pencils (yellow, orange, red, purple)
- Cups or envelopes
Procedure
Prepare: Print the provided ‘Data for printing’ sheet. You will need one per group; each page is one full set of data.
For each set (page) of data: 1) Cut out the data (one number/site per slip of paper) and group all the slips for each site. Thin vertical gray boxes separate each site name from its Enterococcus count; broad vertical gray boxes separate the columns of data. 2) Place the slips for each site into a separate cup labeled for that site (e.g. “Battery – mid Hudson”).
Engage: Ask students: Is the Hudson River healthier in Manhattan or in Albany? Why? How do we decide whether water is healthy or not?
Explore: Students should form, or be assigned to, groups of three. Pass out worksheets (one per student), graphs (one per group), and colored pencils.
Students then complete the sampling activity, as described in the worksheet. They will complete four rounds of sampling. In the first round, they will sample only 3 times from each site, using yellow pencils. They will then make an estimate of what they think the next sample would be (i.e. if they were to take another sample tomorrow). They decide which site they think has the most variability in the data and discuss and document their confidence in these answers.
The students then complete three more rounds of sampling, following the same steps above, but adding samples each time [Round 2 – draw 5 more samples (total 8); Round 3 – draw 7 more samples (total 15); Round 4 – draw 10 more samples (total 25)].
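For teachers who want to preview how the four rounds build on one another, the schedule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `cumulative_rounds` and the cup contents are hypothetical (the values are placeholders, not Riverkeeper data), and the sketch assumes slips are not returned to the cup between draws, which the worksheet may or may not require.

```python
import random

def cumulative_rounds(cup, round_sizes=(3, 5, 7, 10), seed=None):
    """Draw slips round by round, keeping every earlier draw.

    Returns one list per round holding all slips drawn so far.
    Assumption: slips are NOT returned to the cup between draws.
    """
    rng = random.Random(seed)
    slips = list(cup)
    if sum(round_sizes) > len(slips):
        raise ValueError("cup holds fewer slips than the rounds require")
    rng.shuffle(slips)  # shuffling, then dealing, mimics drawing without replacement
    rounds, drawn = [], 0
    for size in round_sizes:
        drawn += size
        rounds.append(slips[:drawn])
    return rounds

# Placeholder counts for one hypothetical cup, NOT real Riverkeeper data.
example_cup = [4, 12, 7, 30, 2, 15, 9, 110, 6, 21, 3, 8, 45, 5, 17,
               10, 1, 60, 14, 7, 26, 4, 9, 33, 11, 2, 19, 8]

for number, samples in enumerate(cumulative_rounds(example_cup, seed=1), start=1):
    mean = sum(samples) / len(samples)
    print(f"Round {number}: n={len(samples)}, running mean={mean:.1f}")
```

Running it with different `seed` values shows how the running mean can shift from round to round, much as students' estimates do on the worksheet.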
Finally, students answer the reflection questions on pages 3-4 of the Sampling Activity worksheet. By the end of this process, students should develop an improved understanding of how science is conducted, the importance of replication for understanding variation in a system, and how this variation affects the sorts of conclusions we can make.
Explain: Often, our estimates of ‘averages’ and ‘modes’ change as we collect more samples, and so does our confidence in them. If we consistently sample a site and find very low variability in the sample values, we may quickly become confident that our next sampled value will be similar to those previously sampled. However, if we find a lot of variation in our samples, we may not feel very confident about what the value of the next sample drawn will be. Yet, as we collect more and more samples, we may gain confidence that the next value drawn will be within a certain range.
- Low variability in data = we more quickly become confident in our estimates of ‘average’ and ‘mode’
- High variability in data = we are less confident in our estimates, and we often need to collect many more samples to achieve a similar level of confidence in our estimates of ‘average’ and ‘mode’
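The two points above can be illustrated with a small simulation. The sketch below is an assumption-laden demonstration, not part of the worksheet: the helper `spread_of_mean` is hypothetical, both cups contain placeholder values rather than Riverkeeper data, and slips are drawn with replacement to keep the code short. It repeats the same-size draw many times and reports how much the resulting averages differ from one another.

```python
import random
import statistics

def spread_of_mean(values, n, trials=2000, seed=0):
    """Estimate how much the sample mean varies across repeated n-slip draws.

    Each trial draws n slips with replacement and records their mean;
    the return value is the standard deviation of those means.
    """
    rng = random.Random(seed)
    means = [statistics.mean(rng.choices(values, k=n)) for _ in range(trials)]
    return statistics.stdev(means)

# Placeholder counts for two hypothetical cups, NOT real Riverkeeper data.
low_variability = [18, 20, 22, 19, 21, 20, 23, 17, 20, 21]
high_variability = [2, 5, 150, 8, 400, 3, 60, 12, 900, 1]

for n in (3, 8, 15, 25):
    low = spread_of_mean(low_variability, n)
    high = spread_of_mean(high_variability, n)
    print(f"n={n:2d}  low-variability spread={low:6.2f}  "
          f"high-variability spread={high:7.2f}")
```

For both cups the spread shrinks as the sample size grows, but with these placeholder values the high-variability cup is still far less settled after 25 draws than the low-variability cup is after 3.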
Understanding the scientific process and how much confidence we ‘should’ have given a certain sample size and variation in the data is important when we use data as evidence for claims we make. The students are asked to make claims about whether any of the sites differ in their Enterococcus counts, or whether they have similar counts. Students should use the sample variation as evidence for their answers.
Extend: Students can create a bar graph of their data (using data averages), and compare and contrast the information provided by each type of data representation. This is a good opportunity to introduce the concept of error bars to the students. Error bars can help us gain a better understanding of the variability in bar graph data. You can go more in-depth into this topic if desired (math courses, AP, etc.).
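For classes that go further into error bars, one possible way to compute a bar height and an error bar is sketched below. It assumes the error bar shows the standard error of the mean, which is only one common convention (standard deviation or a confidence interval are alternatives). The function `bar_and_error`, the site labels, and the counts are all hypothetical placeholders, not Riverkeeper data.

```python
import math
import statistics

def bar_and_error(counts):
    """Return (mean, standard error of the mean) for one site's samples."""
    mean = statistics.mean(counts)
    # Standard error = sample standard deviation / sqrt(number of samples).
    sem = statistics.stdev(counts) / math.sqrt(len(counts))
    return mean, sem

# Placeholder counts and site labels, NOT real Riverkeeper data.
samples_by_site = {
    "Site A": [4, 9, 6, 12, 7, 5, 10, 8],
    "Site B": [3, 40, 8, 120, 5, 60, 2, 15],
}

for site, counts in samples_by_site.items():
    mean, sem = bar_and_error(counts)
    print(f"{site}: bar height={mean:.1f}, error bar=±{sem:.1f}")
```

A site with widely scattered counts gets a longer error bar, which puts back some of the variability that a bar of averages hides but a scatter plot shows directly.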
Evaluate: Use student answers to the worksheet questions to assess student understanding.
Resources
Lesson Files
- ANOVA Sampling Activity Student Worksheet (pdf)
- ANOVA Sampling Activity Data (pdf)
- ANOVA Sampling Activity Graph (pdf)
Standards
Benchmarks for Science Literacy
1A Scientific World View, 1B Scientific Inquiry, 2A Patterns and Relationships, 6E Physical Health, 9D Uncertainty, 9E Reasoning, 10I Discovering Germs, 11D Scale, 12B Computation and Estimation, 12D Communication Skills, 12E Critical-Response Skills
NYS Standards
MST 1 - Mathematical analysis, scientific inquiry, and engineering design, MST 3 - Mathematics in real-world settings, MST 4 - Physical setting, living environment and nature of science, MST 7 - Problem solving using mathematics, science, and technology (working effectively, process and analyze information, presenting results), ELA 1 - Language to collect and interpret information and understand generalizations, ELA 3 - Language for critical analysis and evaluation, ELA 4 - Language for communication and social interaction with a wide variety of people
Credits
Data Source: The data for this activity were collected by G. D. O’Mullan, A. R. Juhl, and J. Lipscomb for Riverkeeper. They show fecal indicator bacteria (enterococci) at three sites along the Hudson River. The data shown represent individual samples collected at each of these sites between 2007 and 2013.
Data collected by O’Mullan GD, Juhl AR, and Lipscomb J, available at www.riverkeeper.org. Funding provided by Hudson Riverkeeper, the Wallace Research Foundation, the Brinson Foundation, Lamont Doherty Earth Observatory of Columbia University, and CUNY Queens College.
To view data from additional testing sites and to use the excellent data-visualization tool provided by Riverkeeper, please visit http://www.riverkeeper.org/water-quality/hudson-river/.
These data were provided courtesy of Riverkeeper. Please note their data use policy below for further use.
Riverkeeper's Data Use Policy:
Hudson River water-quality data posted to the Riverkeeper website are made freely available to the public, and we encourage their wide use. However, if you use the data for research, policy, or educational purposes, we would appreciate that you let us know so that we can document that use for our funders (which will help us continue this service).
Please do not post any data from the Riverkeeper website directly on any other website. However, linking to the Riverkeeper website, or to individual pages on the Riverkeeper website, is encouraged. If the Riverkeeper website data are used as background or ancillary information for any presentation, publication, website, or educational product, we would appreciate proper acknowledgement (Data collected by O’Mullan GD, Juhl AR, and Lipscomb J, available at www.riverkeeper.org. Funding provided by Hudson Riverkeeper, the Wallace Research Foundation, the Brinson Foundation, Lamont Doherty Earth Observatory of Columbia University, and CUNY Queens College).
If you would like to use the Riverkeeper website data as an integral contribution to any publication or educational product, please contact us to discuss potential collaboration and appropriate determination of authorship. Please contact us if needed to inquire about additional data that may be available and about QA/QC procedures. Thank you for respecting the efforts of many individuals that have gone into collecting, processing, maintaining, and disseminating these valuable data.