For the past 15 years, pre-university students in many countries, including the United States, have encountered data analysis and probability as separate, mostly independent strands. Classroom-based research suggests, however, that some of the difficulties students have in learning basic skills in Exploratory Data Analysis stem from a lack of rudimentary ideas in probability. We describe a recent project that is developing materials to support middle-school students in coming to see the "data in chance" and the "chance in data." Instruction focuses on four main ideas: model fit, distribution, signal-noise, and the Law of Large Numbers. Central to our approach is a new modeling and simulation capability that we are building into a future version of the data-analysis software TinkerPlots. We describe three classroom-tested probability investigations that employ an iterative model-fit process in which students evaluate successive theories by collecting and analyzing data. As distribution features become a focal point of students' explorations, the signal and noise components of data become visible as variation around an "expected" distribution in repeated samples. An important part of students' learning experience, and one enhanced by the visual aspects of TinkerPlots, is learning to see things in data that they were previously unable to see.
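The idea of noise appearing as variation around an "expected" distribution in repeated samples, and of that variation shrinking as samples grow (the Law of Large Numbers), can be illustrated with a small simulation. The sketch below is written in Python rather than TinkerPlots, and everything in it is an assumption chosen for illustration: the fair six-sided die, the sample sizes, and the helper names `sample_proportions` and `max_deviation` do not come from the project's materials.

```python
import random
from collections import Counter


def sample_proportions(n_rolls, rng, faces=6):
    """Roll a fair die n_rolls times; return the observed proportion of each face."""
    counts = Counter(rng.randint(1, faces) for _ in range(n_rolls))
    return [counts[face] / n_rolls for face in range(1, faces + 1)]


def max_deviation(proportions, expected):
    """Largest gap between an observed proportion and the expected one (the 'noise')."""
    return max(abs(p - expected) for p in proportions)


if __name__ == "__main__":
    rng = random.Random(0)   # fixed seed so repeated runs give the same samples
    expected = 1 / 6         # the 'signal': the model's expected proportion per face

    # Hypothetical sample sizes; larger samples should sit closer to the model.
    for n in (10, 100, 1000, 100000):
        props = sample_proportions(n, rng)
        rounded = [round(p, 3) for p in props]
        print(n, rounded, round(max_deviation(props, expected), 4))
```

Running the script prints, for each sample size, the observed proportions and their largest departure from 1/6. Small samples typically stray far from the model, while large ones tend to settle close to it, which is the kind of evidence a student could use to judge whether a proposed model fits.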
The CAUSE Research Group is supported in part by a member initiative grant from the American Statistical Association’s Section on Statistics and Data Science Education.