Reasoning About Data Analysis

The purpose of this chapter is to describe and analyze the ways in which middle school students begin to reason about data and come to understand exploratory data analysis (EDA). The process of developing reasoning about data while learning skills, procedures, and concepts is described. In addition, the students are observed as they begin to adopt and exercise some of the habits and points of view that are associated with statistical thinking. The first case study focuses on the development of a global view of data and data representations. The second case study concentrates on the design of a meaningful EDA learning environment that promotes statistical reasoning about data analysis. In light of the analysis, a description of what it may mean to learn to reason about data analysis is proposed, and educational and curricular implications are drawn.
Learning To Reason About Distribution

The purpose of this chapter is to explore how informal reasoning about distribution can be developed in a technological learning environment. The development of reasoning about distribution in seventh-grade classes is described in three stages as students reason about different representations. It is shown how specially designed software tools, student-created graphs, and prediction tasks supported the learning of different aspects of distribution. In this process, several students came to reason about the shape of a distribution using the term "bump" along with statistical notions such as outliers and sample size.
This type of research, referred to as "design research," was inspired by that of Cobb, Gravemeijer, McClain, and colleagues (see Chapter 16). After exploratory interviews and a small field test, we conducted teaching experiments of 12 to 15 lessons in 4 seventh-grade classes in the Netherlands. The design research cycles consisted of three main phases: design of instructional materials, classroom-based teaching experiments, and retrospective analyses. For the retrospective analysis of the data, we used a constant comparative method similar to the methods of Glaser and Strauss (Strauss & Corbin, 1998) and Cobb and Whitenack (1996) to continually generate and test conjectures about students' learning processes.
Conceptualizing An Average As A Stable Feature Of A Noisy Process

The idea of data as a mixture of signal and noise is perhaps the most fundamental concept in statistics. Research suggests, however, that current instruction is not helping students to develop this idea, and that though many students know, for example, how to compute means or medians, they do not know how to apply or interpret them. Part of the problem may be that the interpretations we often use to introduce data summaries, including viewing averages as typical scores or fair shares, provide a poor conceptual basis for using them to represent the entire group for purposes such as comparing one group to another. To explore the challenges of learning to think about data as signal and noise, we examine the "signal/noise" metaphor in the context of three different statistical processes: repeated measures, measuring individuals, and dichotomous events. On the basis of this analysis, we make several recommendations about research and instruction.
Reasoning About Variation

"Variation is the reason why people have had to develop sophisticated statistical methods to filter out any messages in data from the surrounding noise" (Wild & Pfannkuch, 1999, p. 236). Both variation, as a concept, and reasoning, as a process, are central to the study of statistics and as such warrant attention from both researchers and educators. This discussion of some recent research attempts to highlight the importance of reasoning about variation. Evolving models of cognitive development in statistical reasoning have been discussed earlier in this book (Chapter 5). The focus in this chapter is on some specific aspects of reasoning about variation.
After discussing the nature of variation and its role in the study of statistics, we will introduce some relevant aspects of statistics education. The purpose of the chapter is twofold: first, a review of recent literature concerned, directly or indirectly, with variation; and second, the details of one recent study that investigates reasoning about variation in a sampling situation for students aged 9 to 18. In conclusion, implications from this research for both curriculum development and teaching practice are outlined.
Reasoning About Covariation

Covariation concerns association of variables; that is, correspondence of variation. Reasoning about covariation commonly involves translation processes among raw numerical data, graphical representations, and verbal statements about statistical covariation and causal association. Three skills of reasoning about covariation are investigated: (a) speculative data generation, demonstrated by drawing a graph to represent a verbal statement of covariation, (b) verbal graph interpretation, demonstrated by describing a scatterplot in a verbal statement and by judging a given statement, and (c) numerical graph interpretation, demonstrated by reading a value and interpolating a value. Survey responses from 167 students in grades 3, 5, 7, and 9 are described in four levels of reasoning about covariation. The discussion includes implications for teaching that assists the development of reasoning about covariation, namely helping students (a) to consider not just the correspondence of values for a single bivariate data point but the variation of points as a global trend, (b) to consider not just a single variable but the correspondence of two variables, and (c) to balance prior beliefs with data-based observations.
Students' Reasoning about the Normal Distribution

In this chapter we present results from research on students' reasoning about the normal distribution in a university-level introductory course. One hundred and seventeen students took part in a teaching experiment based on the use of computers for nine hours, as part of a 90-hour course. The teaching experiment took place during six class sessions. Three sessions were carried out in a traditional classroom, and in another three sessions students worked on the computer using activities involving the analysis of real data. At the end of the course students were asked to solve three open-ended tasks that involved the use of computers. Semiotic analysis of the students' written protocols as well as interviews with a small number of students were used to classify different aspects of correct and incorrect reasoning about the normal distribution used by students when solving the tasks. Examples of students' reasoning in the different categories are presented.
Developing Reasoning About Samples

Although reasoning about samples and sampling is fundamental to the legitimate practice of statistics, it often receives little attention in the school curriculum. This may be related to the lack of numerical calculations (predominant in the mathematics curriculum) and the descriptive nature of the material associated with the topic. This chapter will extend previous research on students' reasoning about samples by considering longitudinal interviews with 38 students 3 or 4 years after they first discussed their understanding of what a sample was, how samples should be collected, and the representing power of a sample based on its size. Of the six categories of response observed at the time of the initial interviews, all were confirmed after 3 or 4 years, and one additional preliminary level was observed.
Reasoning About Sampling Distributions

This chapter presents a series of research studies focused on the difficulties students experience when learning about sampling distributions. In particular, the chapter traces the seven-year history of an ongoing collaborative research project investigating the impact of students' interaction with computer software tools to improve their reasoning about sampling distributions. For this classroom-based research project, three researchers from two American universities collaborated to develop software, learning activities, and assessment tools to be used in introductory college-level statistics courses. The studies were conducted in five stages, and utilized quantitative assessment data as well as videotaped clinical interviews. As the studies progressed, the research team developed a more complete understanding of the complexities involved in building a deep understanding of sampling distributions, and formulated models to explain the development of students' reasoning.
Primary Teachers' Statistical Reasoning About Data

This study offers a descriptive qualitative analysis of one third-grade teacher's statistical reasoning about data and distribution in the applied context of classroom-based statistical investigation. During this study, the teacher used the process of statistical investigation as a means for teaching about topics across the elementary curriculum, including dinosaurs, animal habitats, and an author study. In this context, the teacher's statistical reasoning plays a central role in the planning and orchestration of the class investigation. The potential for surprise questions, unanticipated responses, and unintended outcomes is high, requiring the teacher to "think on her feet" statistically and react immediately to accomplish content objectives as well as to convey correct statistical principles and reasoning. This study explores the complexity of teaching and learning statistics, and offers insight into the role and interplay of statistical knowledge and context.
Secondary Teachers' Statistical Reasoning In Comparing Two Groups

The importance of distributions in understanding statistics has been well articulated in this book by other researchers (for example, Bakker & Gravemeijer, Chapter 7; Ben-Zvi, Chapter 6). The task of comparing two distributions provides further insight into this area of research, in particular into variation, and also motivates other aspects of statistical reasoning. The research study described here was conducted at the end of a 6-month professional development sequence designed to assist secondary teachers in making sense of their students' results on a state-mandated academic test. In the United States, schools are currently under tremendous pressure to increase student test scores on state-developed academic tests.
This chapter focuses on the statistical reasoning of four secondary teachers during interviews conducted at the end of the professional development sequence. The teachers conducted investigations using the software Fathom™ in addressing the research question: "How do you decide whether two groups are different?" Qualitative analysis examines the responses during these interviews, in which the teachers were asked to describe the relative performance of two groups of students in a school on their statewide mathematics test. Pre- and posttest quantitative analysis of statistical content knowledge provides triangulation (Stake, 1994), giving further insight into the teachers' understanding.
