Journal Article

  • This article uses a case study of 2001 town and city data that we analyzed for Boston Magazine to demonstrate the challenges of creating a valid ranking structure. The data consist of three composite indices for 147 individual townships in the Boston metropolitan area, representing measures of public safety, the environment, and health. We report the data and the basic ranking procedure used in the magazine article, and we discuss alternative ranking procedures. In particular, we demonstrate the impact of additional adjustment for the size of population, even when per capita data are used. This case study presents an opportunity for discussion of fundamental data analysis concepts in all levels of statistics courses.

  • Three basic theorems concerning expected values and variances of sums and products of random variables play an important role in mathematical statistics and its applications in education, business, the social sciences, and the natural sciences. A solid understanding of these theorems requires that students be familiar with the proofs of these theorems. But while students who major in mathematics and other technical fields should have no difficulties coping with these proofs, students who major in education, business, and the social sciences often find it difficult to follow these proofs. In many textbooks and courses in statistics which are geared to the latter group, mathematical proofs are sometimes omitted because students find the mathematics too confusing. In this paper, we present a simpler approach to these proofs. This paper will be useful for those who teach students whose level of mathematical maturity does not include a solid grasp of differential calculus.

  • Textbooks and websites today abound with real data. One neglected issue is that statistical investigations often require a good deal of cleaning to ready data for analysis. The purpose of this dataset and exercise is to teach students to use exploratory tools to identify erroneous observations. This article discusses the merits of such an exercise and provides a team project, problem data, cleaned data for instructors, and reflections on past experiences. The main goal is to give instructors a prepared project for their students to perform realistic data preparation and subsequent analysis. The data for this project involve categorical and continuous variables for subjects age 65 and over testing calcium, inorganic phosphorus, and alkaline phosphatase levels in the blood. The project described in this article involves summary analysis, but the cleaned data could also be used for

  • Statistics textbooks for undergraduates have not caught up with the enormous amount of analysis of Internet data that is taking place these days. Case studies that use Web server log data or Internet network traffic data are rare in undergraduate Statistics education. And yet these data provide numerous examples of skewed and bimodal distributions, of distributions with thick tails that do not follow the usual models studied in class, and many other interesting statistical curiosities. This paper summarizes the results of research in two areas of Internet data analysis: users' web browsing behavior and network performance. We present some of the main questions analyzed in the literature, some unsolved problems, and some typical data analysis methods used. We illustrate the questions and the methods with large data sets. The data sets were obtained from the publicly available pool of data and had to be processed and transformed to make them available for classroom exercises. Students in Introductory Statistics classes as well as Probability and Mathematical Statistics courses have responded to the stories behind these data sets and their analysis very well. The message in the stories can be conveyed at a descriptive or a more advanced level.

  • We present and discuss three examples of misapplication of the notion of conditional probability. In each example, we present the problem along with a published and/or well-known solution that is incorrect but seemingly plausible. We then give a careful treatment of the correct solution, in large part to show how careful application of basic probability rules can help students to spot and avoid these mistakes. With each example, we also hope to illustrate the importance of having students draw a tree diagram and/or a sample space for probability problems not involving data (i.e., where a contingency table might not be obviously applicable).

  • In this paper, we consider some combinatorial and statistical aspects of the popular Powerball lottery game. It is not difficult for students in an introductory statistics course to compute the probabilities of winning various prizes, including the jackpot in the Powerball game. Assuming a unique jackpot winner, it is not difficult to find the expected value and the variance of the probability distribution for the dollar prize amount. In certain circumstances, the expected value is positive, which might suggest that it would be desirable to buy Powerball tickets. However, due to the extremely high coefficient of variation in this problem, we use the law of large numbers to show that we would need to buy an untenable number of tickets to be reasonably confident of making a profit. We also consider the impact of sharing the jackpot with other winners.

  • Many textbooks teach a rule of thumb stating that the mean is right of the median under right skew, and left of the median under left skew. This rule fails with surprising frequency. It can fail in multimodal distributions, or in distributions where one tail is long but the other is heavy. Most commonly, though, the rule fails in discrete distributions where the areas to the left and right of the median are not equal. Such distributions not only contradict the textbook relationship between mean, median, and skew, they also contradict the textbook interpretation of the median. We discuss ways to correct ideas about mean, median, and skew, while enhancing the desired intuition.

  • Previous research has linked perfectionism to anxiety in the statistics classroom and academic performance in general. This article investigates the impact of the individual components of perfectionism on academic performance of students in the statistics classroom. The results of this research show a clear positive relationship between a student's personal standards and academic performance, consistent with the literature. Surprisingly, the inherent need of some students for organization and structure was found to be negatively related to academic performance. This finding suggests that the organization of statistics as perceived by some students may not always foster understanding, resulting in student confusion and lack of achievement. This implies that statistics instructors may need to put sufficient emphasis on the underlying composition of statistical ideas and the linking of statistical techniques that are presented in the classroom and in the textbook. The implications of these results are discussed in terms of current trends in the reform of the statistics curriculum and approaches that may improve the clarity of the underlying structure of statistics.

  • A data set contained in the Journal of Statistics Education's data archive provides a way of exploring regression analysis at a variety of teaching levels. An appropriate functional form for the relationship between percentage body fat and the BMI is shown to be the semi-logarithmic, with variation in the BMI accounting for a little over half of the variation in body fat. The fairly modest strength of the relationship implies that confidence intervals for body fat, and tolerance intervals for BMI, can be quite wide, so that strict reliance on the BMI as a measure of body fat, and hence obesity, is unwarranted. Nevertheless, when fitting percentage body fat as a function of the class of "power weight for height indices", i.e., indices of the form weight/height^p, the BMI, with a height exponent of p = 2, is an appropriate choice to make.

  • Many leaders of our profession have called for improvements in the way we educate statisticians. Sound recommendations have been made by many, based on real-world experience in the practice of statistical science. These calls for reform have gone largely unheeded, at least in part because of our current paradigm of statistical education. Statistics is seen, by many, as strictly a graduate discipline, yet constraints on the time to complete a graduate degree make adopting many of the suggested reforms very difficult. It is argued in this paper that a new paradigm of statistical education is needed that provides for strong undergraduate programs in statistics. Such programs would give the profession wider recognition and provide additional entries into the discipline.
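
The abstract above on mean, median, and skew notes that the textbook rule most often fails in discrete distributions where the areas to the left and right of the median are unequal. The following is a minimal sketch of that failure. The three-point distribution is a hypothetical example chosen for illustration and is not taken from the article, and the sign of the third central moment is used here as one common indicator of skew direction.

```python
from fractions import Fraction

# Hypothetical discrete distribution (an assumption, not data from the article):
# value -> probability. Only 5% of the mass lies right of the median,
# while 40% lies left of it.
pmf = {0: Fraction(40, 100), 1: Fraction(55, 100), 5: Fraction(5, 100)}


def mean(pmf):
    """Expected value of a discrete distribution."""
    return sum(x * p for x, p in pmf.items())


def median(pmf):
    """Smallest value whose cumulative probability reaches 1/2."""
    total = Fraction(0)
    for x in sorted(pmf):
        total += pmf[x]
        if total >= Fraction(1, 2):
            return x
    raise ValueError("probabilities must sum to 1")


def third_central_moment(pmf):
    """Positive values indicate right skew, negative values left skew."""
    mu = mean(pmf)
    return sum(p * (x - mu) ** 3 for x, p in pmf.items())


mu = mean(pmf)
med = median(pmf)
m3 = third_central_moment(pmf)
print(f"mean = {float(mu):.2f}, median = {med}, third central moment = {float(m3):.3f}")
```

For this distribution the mean is 0.80 and the median is 1, so the mean lies left of the median even though the third central moment is positive (right skew), which is the opposite of what the rule of thumb predicts.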
