Textbooks and websites today abound with real data. One neglected issue is that statistical investigations often require a good deal of cleaning before the data are ready for analysis. The purpose of this dataset and exercise is to teach students to use exploratory tools to identify erroneous observations. This article discusses the merits of such an exercise and provides a team project, problem data, cleaned data for instructors, and reflections on past experiences. The main goal is to give instructors a prepared project in which their students perform realistic data preparation and subsequent analysis. The data for this project include categorical and continuous variables for subjects aged 65 and over whose blood levels of calcium, inorganic phosphorus, and alkaline phosphatase were measured. The project described in this article involves summary analysis, but the cleaned data could also be used for
The CAUSE Research Group is supported in part by a member initiative grant from the American Statistical Association’s Section on Statistics and Data Science Education.