2H: Show Me the Missing Data


Juana Sanchez (UCLA)


Abstract

Most data sets provided to students in introductory statistics courses, and even in upper-division courses, contain no missing data; when they do, the software decides for students, using only the complete cases and ignoring the missing values. Marron and Wahed (JSE 2015) argued that appropriate handling of missing data should be introduced earlier in a student's education, and demonstrated the feasibility of introducing missing data concepts to trainees in a small-group, project-based setting that involves both simulation and data analysis. This breakout session demonstrates interactively how introducing methods for handling missing data, such as single imputation and multiple imputation, in introductory statistics courses could enhance the learning of concepts such as the mean, standard error, bias, precision, and regression, while preparing students for the data they are likely to encounter in capstone projects, internships, and jobs as statisticians or data scientists. Participants will speculate about the missing data, work through exercises that introduce missing data handling methods at several points in the introductory statistics curriculum, and discuss the pros and cons of handling missing data at such an early stage.