NASCAR Winston Cup race results for 1975-2003.


Authors: Winner, L.
Editors: Stephenson, W. R.
Category: 
Volume: 14(3)
Year: 2006
Publisher: Journal of Statistics Education.
URL: http://www.amstat.org/publications/jse/v14n3/datasets.winner.html
Abstract: 

Stock car racing has seen tremendous growth in popularity in recent years. We introduce two datasets containing results from all Winston Cup races between 1975 and 2003, inclusive. Students can apply a wide range of statistical methods and basic probability to the data to answer many practical questions. Instructors and students can define many types of events and obtain their corresponding empirical probabilities, as well as gain a hands-on, computer-based understanding of conditional probabilities and probability distributions. They can model the rapid growth of the sport based on total payouts by year in real and adjusted dollars, applying linear and exponential growth models that are being taught at earlier stages in introductory statistics courses. Methods of making head-to-head comparisons among pairs of drivers are demonstrated based on their start and finish order, using a simple categorical method based on matched pairs that students can easily understand but may not encounter in traditional introductory methods courses. Spearman's and Kendall's rank correlation measures are applied to each race to describe the association between starting and finishing positions among drivers, which students can readily see are ordinal rather than interval-scale outcomes. A wide variety of other potential analyses may also be conducted and are briefly described. The dataset nascard.dat is at the driver/race level and contains variables including driver name, start and finish positions, car make, laps completed, and prize winnings. The dataset nascarr.dat is at the race level and contains variables including number of drivers, total prize money, monthly consumer price index, track length, laps completed, numbers of caution flags and lead changes, completion time, and spatial coordinates of the track. These datasets offer students and instructors many opportunities to explore diverse statistical applications.
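
As an illustration of the rank correlation analysis mentioned in the abstract, the following is a minimal Python sketch that computes Spearman's and Kendall's measures for a single race. The five start and finish positions are invented for illustration and are not taken from nascard.dat; the function names are likewise hypothetical. The sketch assumes there are no ties, which holds when each driver in a race has a distinct starting and finishing position, and it uses Kendall's tau-a (no tie correction).

```python
from itertools import combinations


def spearman_rho(start, finish):
    """Spearman's rho for two untied rankings of the same drivers."""
    n = len(start)
    # Sum of squared differences between start and finish ranks.
    d2 = sum((s - f) ** 2 for s, f in zip(start, finish))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))


def kendall_tau(start, finish):
    """Kendall's tau-a: (concordant - discordant) / number of driver pairs."""
    n = len(start)
    score = 0
    for i, j in combinations(range(n), 2):
        product = (start[i] - start[j]) * (finish[i] - finish[j])
        if product > 0:
            score += 1   # pair ordered the same way at start and finish
        elif product < 0:
            score -= 1   # pair swapped order between start and finish
    return score / (n * (n - 1) / 2)


# Hypothetical race with five drivers (illustrative values only).
start = [1, 2, 3, 4, 5]
finish = [2, 1, 3, 5, 4]

rho = spearman_rho(start, finish)
tau = kendall_tau(start, finish)
print(f"Spearman rho = {rho:.2f}, Kendall tau = {tau:.2f}")
```

Values near 1 indicate that drivers tended to finish close to where they started, while values near 0 indicate little association between starting and finishing order.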
