Teaching

  • The 93CARS dataset contains information on 93 new cars for the 1993 model year. Measures given include price, mpg ratings, engine size, body size, and indicators of features. The 26 variables in the dataset offer sufficient variety to illustrate a broad range of statistical techniques typically found in introductory courses.

  • Four datasets (nfl93.dat, nfl94.dat, nfl95.dat, nfl96.dat) contain National Football League game results for recent seasons. In addition to game scores, the data give oddsmakers' pointspreads and over/under values for each game.

  • I trace the development of a new course in modern data analysis involving a wide spectrum of statistical techniques. Because the course is computer-intensive and based entirely on case studies, real-data settings, and student projects, it addresses a series of challenges facing many instructors. In a single semester, students explore data using tools from EDA, multiple regression, analysis of variance, time series analysis, and categorical data analysis. The focus is on understanding and forecasting in a variety of data settings, learning how to summarize relationships and measure how well these relationships fit data, and learning how to make meaningful statistical inferences when the usual assumptions do not hold. The course emphasizes what the statistical process is all about: how to conduct studies, what the results mean, and what can be inferred about the whole from pieces of evidence.

  • This article explains why and how a course in general linear models was restructured. This restructuring resulted from a need to more fully understand traditional teaching evaluations, coupled with a desire to introduce more meaningful data into the course. This led to the incorporation of a longitudinal dataset of teaching evaluations into the lecture material and assignments. The result was a deeper appreciation of how students perceive my teaching, specifically, and a greater understanding of how statistics courses, in general, can be taught more effectively.

  • On her death in 1910, Florence Nightingale left a vast collection of reports, letters, notes and other written material. There are numerous publications that make use of this material, often highlighting Florence's attitude to a particular issue. In this paper we gather a set of quotations and construct a dialogue with Florence Nightingale on the subject of statistics. Our dialogue draws attention to strong points of connection between Florence Nightingale's use of statistics and modern evidence-based approaches to medicine and public health. We offer our dialogue as a memorable way to draw the attention of students to the key role of data-based evidence in medicine and in the conduct of public affairs.

  • Students often come to their first statistics class with the preconception that statistics is confusing and dull. This problem is compounded when even introductory techniques are steeped in jargon. One approach that can overcome some of these problems is to align the statistical techniques under study with elements from students' everyday experiences. The use of simple physical analogies is a powerful way to motivate even complicated statistical ideas. In this article, I describe several analogies, some well known and some new, that I have found useful. The analogies are designed to demystify statistical ideas by placing them in a physical context and by appealing to students' common experiences. As a result, some frequent misconceptions and mistakes about statistical techniques can be addressed.

  • Laboratory experiments using spectrophotometers and pH meters were incorporated into an undergraduate introductory statistics course in order to create an interdisciplinary approach to teaching statistics to non-statistics majors. By conducting laboratory experiments commonly associated with science-based curricula, students were exposed to the relationship between science and statistics through experimental design and data analysis. The laboratory experiments used in the course are related to fields such as chemistry, biology, and environmental sciences and are described in this article.

  • The CIGARETTE dataset contains measurements of weight and tar, nicotine, and carbon monoxide content for 25 brands of domestic cigarettes. The dataset is useful for introducing the ideas of multiple regression and provides examples of an outlier and a pair of collinear variables.

  • The Electric Bill dataset contains monthly household electric billing charges for ten years. In addition, there are values for such potential explanatory variables as temperature, heating and cooling degree days, number in household, and indicator variables for a new electric meter and new heat pumps. The values provide a real dataset to use for applications ranging from simple graphical analysis through a variety of time series and causal forecasting methods. The dataset also is suited to spreadsheet applications for break-even calculations and optimization. With knowledge of the utility's tiered rate function, the bill amount can be converted to an estimate of the number of kilowatt hours used. A series of assignment questions is included and the accompanying Instructor's Manual provides solutions.

  • Statistics is commonly taught as a set of techniques to aid in decision making, by extracting information from data. It is argued here that the underlying purpose, often implicit rather than explicit, of every statistical analysis is to establish one or more probability models that can be used to predict values of one or more variables. Such a model constitutes 'information' only in the sense, and to the extent, that it provides predictions of sufficient quality to be useful for decision making. The quality of the decision making is determined by the quality of the predictions, and hence by that of the models used. Using natural criteria, the 'best predictions' for nominal and numeric variables are, respectively, the mode and mean. For a nominal variable, the quality of a prediction is measured by the probability of error. For a numeric variable, it is specified using a prediction interval. Presenting statistical analysis in this way provides students with a clearer understanding of what a statistical analysis is, and its role in decision making.
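
The prediction view in the last abstract above (mode and probability of error for a nominal variable, mean and prediction interval for a numeric one) can be sketched in a few lines of Python. This is only an illustration, not the article's own procedure: the function names `predict_nominal` and `predict_numeric`, the sample values, and the choice of a normal-approximation interval are all assumptions made here. For small samples a t quantile would be the more usual choice than the normal quantile used below.

```python
from collections import Counter
from math import sqrt
from statistics import NormalDist, mean, stdev


def predict_nominal(values):
    """Predict a nominal variable by its mode.

    Returns the mode and the estimated probability that the
    prediction is wrong (1 minus the mode's relative frequency).
    """
    counts = Counter(values)
    mode, mode_count = counts.most_common(1)[0]
    return mode, 1 - mode_count / len(values)


def predict_numeric(values, level=0.95):
    """Predict a numeric variable by its mean.

    Returns the mean and an approximate prediction interval for one
    new observation. Assumption: the data are roughly normal, so the
    half-width is z * s * sqrt(1 + 1/n) with z a normal quantile.
    """
    n = len(values)
    centre = mean(values)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * stdev(values) * sqrt(1 + 1 / n)
    return centre, (centre - half_width, centre + half_width)


# Made-up values, for illustration only (not taken from any dataset above).
colours = ["red", "blue", "red", "green", "red", "blue"]
heights = [160.0, 165.0, 170.0, 175.0, 180.0]

if __name__ == "__main__":
    print(predict_nominal(colours))
    print(predict_numeric(heights))
```

A narrower interval, or a smaller probability of error, corresponds to a better prediction in the sense the abstract describes.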
