F08: From black box to shining spotlight: using random forest prediction intervals to illuminate the impact of assumptions in linear regression


By Andrew Sage (Lawrence University)


Information

We introduce a pair of Shiny web applications that let users visualize random forest prediction intervals alongside those produced by linear regression models. The apps are designed to help undergraduate students deepen their understanding of the role that assumptions play in statistical modeling by comparing and contrasting intervals produced by regression models with those produced by more flexible algorithmic techniques. In our accompanying paper, we argue that, contrary to their reputation as a black box, random forests can serve as a spotlight for educational purposes, illuminating the role of assumptions in regression models and their impact on the shape, width, and coverage rates of prediction intervals. The apps were used and assessed in two sections of an undergraduate statistical modeling course (40 students total). Pre- and post-survey results showed that the apps and the associated discussion helped students better distinguish between the consequences of different violations of model assumptions and develop more nuanced thinking about the role of assumptions in statistical modeling.
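As a rough illustration of the comparison described above, here is a minimal Python sketch. It is not the apps' code (the apps are built in Shiny), and every detail is an assumption made for illustration: the simulated sine-shaped data, the normal-approximation 95% interval for the linear model, and the choice to build random forest intervals from quantiles of out-of-bag (OOB) prediction errors, which is one of several ways to form such intervals. It assumes NumPy and scikit-learn are installed.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

def simulate(n):
    """Hypothetical data that violate the linearity assumption."""
    x = rng.uniform(0, 10, n)
    y = 5 * np.sin(x) + rng.normal(0, 1, n)
    return x, y

x_train, y_train = simulate(500)
x_test, y_test = simulate(500)

# --- Linear regression: approximate 95% prediction intervals ---
X = np.column_stack([np.ones_like(x_train), x_train])
beta, *_ = np.linalg.lstsq(X, y_train, rcond=None)
resid = y_train - X @ beta
s = np.sqrt(resid @ resid / (len(y_train) - 2))  # residual standard error

X_new = np.column_stack([np.ones_like(x_test), x_test])
XtX_inv = np.linalg.inv(X.T @ X)
leverage = np.einsum("ij,jk,ik->i", X_new, XtX_inv, X_new)
# 1.96 is the normal approximation to the t quantile (reasonable for n = 500)
half_width = 1.96 * s * np.sqrt(1 + leverage)
lm_pred = X_new @ beta
lm_lo, lm_hi = lm_pred - half_width, lm_pred + half_width

# --- Random forest: intervals from quantiles of OOB prediction errors ---
rf = RandomForestRegressor(
    n_estimators=300, min_samples_leaf=5, oob_score=True, random_state=0
)
rf.fit(x_train.reshape(-1, 1), y_train)
oob_err = y_train - rf.oob_prediction_
q_lo, q_hi = np.quantile(oob_err, [0.025, 0.975])
rf_pred = rf.predict(x_test.reshape(-1, 1))
rf_lo, rf_hi = rf_pred + q_lo, rf_pred + q_hi

def coverage(lo, hi, y):
    """Share of observations falling inside their interval."""
    return float(np.mean((y >= lo) & (y <= hi)))

lm_cov = coverage(lm_lo, lm_hi, y_test)
rf_cov = coverage(rf_lo, rf_hi, y_test)
lm_width = float(np.mean(lm_hi - lm_lo))
rf_width = float(np.mean(rf_hi - rf_lo))

print(f"linear model : coverage={lm_cov:.3f}, mean width={lm_width:.2f}")
print(f"random forest: coverage={rf_cov:.3f}, mean width={rf_width:.2f}")
```

In this setup the linear model's band is a straight strip that has to be wide enough to absorb the curvature it cannot fit, while the forest's band follows the curve and can be narrower. Changing the simulated violation (for example, non-constant variance instead of nonlinearity) would change which properties of the intervals differ.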

 

https://predictive-visualizations.shinyapps.io/Prediction_Intervals_Simulation/
 


USCOTS_Poster_Sage.pdf