Missing data mechanisms, methods of handling missing data, and the potential impact of missing data on study results are usually not taught until graduate school. However, the appropriate handling of missing data is fundamental to biomedical research and should be introduced earlier in a student's education. The Summer Institute for Training in Biostatistics (SIBS) provides practical experience to motivate trainees to pursue graduate training and biomedical research. Since 2010, SIBS Pittsburgh has demonstrated the feasibility of introducing missing data concepts to trainees in a small-group, project-based setting that involves both simulation and data analysis. After learning about missing data mechanisms and statistical techniques, trainees apply what they have learned to an NIDDK/NIH-funded Hepatitis C treatment study to examine how various hypothesized missing data patterns can affect results. A simulation is also used to examine the bias and precision of these methods under each missing data pattern. Our experience shows that, with such project-based training, advanced topics such as missing data can be presented to trainees with limited statistical preparation and can ultimately further their statistical literacy and reasoning.
The CAUSE Research Group is supported in part by a member initiative grant from the American Statistical Association’s Section on Statistics and Data Science Education.