Randomization and the Undergraduate Curriculum

George Cobb, Mount Holyoke College

I’m writing about implications of simulation-based inference (SBI) for the undergraduate statistics major, but also for students who take only one or a few statistics courses, because the implications for those students apply to the undergraduate major as well. I begin with some strengths and omissions of SBI in its current forms.[pullquote]… the SBI course serves as a foundation for more advanced courses. How does it compare with more traditional first courses? Potentially, it offers better preparation for additional courses, but the details will depend on rethinking the intermediate and advanced curriculum. [/pullquote]

A. Strengths of SBI.

Here are six (somewhat arbitrary) strengths: economy, emphasis, and unity; embedding, breadth, and acceleration.

  • Economy of technical prerequisites and co-requisites: SBI substitutes simulation for theory. Probabilities are ratios; normal-based methods are empirical shortcuts. They needn’t be derived and justified; they can be checked by simulation. SBI allows you to introduce significance testing[i] as early as the first day of class.[ii] Introducing formal inference early allows students an entire semester to absorb the logic. Moreover, the economy gives extra class time and cognitive energy for other substantive issues.
  • Emphasis on a single paradigm for inference based on two questions: Given the statistical hypothesis, which values of the data are so extreme as to create enough cognitive dissonance to justify rejecting H0? Inverting the same logic, which parameter values are not plausible?
  • Unity of the underlying logic: Whether the research question is about a single proportion, comparing two proportions, or comparing several means, the logic is always the same: choose a test statistic, simulate the data production process, and use the resulting distribution to interpret the observed value of the test statistic. (A minimal code sketch of this recipe follows the list.)
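To make the three-step recipe concrete, here is a minimal sketch in Python; the data, group sizes, and random seed are invented purely for illustration. It chooses the difference in group means as the test statistic, simulates the random-assignment data-production process by shuffling group labels, and uses the simulated distribution to interpret the observed value.

```python
import numpy as np

rng = np.random.default_rng(seed=1)

# Hypothetical outcomes from a randomized comparison of two groups.
treatment = np.array([23.1, 19.4, 25.0, 22.7, 26.3])
control = np.array([18.2, 21.0, 17.5, 20.1, 19.8])

# Step 1: choose a test statistic -- here, the difference in group means.
observed = treatment.mean() - control.mean()

# Step 2: simulate the data-production process under H0 (no treatment effect)
# by re-randomizing the group labels many times.
pooled = np.concatenate([treatment, control])
n_treat = len(treatment)
sims = []
for _ in range(10_000):
    shuffled = rng.permutation(pooled)
    sims.append(shuffled[:n_treat].mean() - shuffled[n_treat:].mean())
sims = np.array(sims)

# Step 3: use the simulated distribution to interpret the observed value.
p_value = np.mean(sims >= observed)  # one-sided p-value
print(observed, p_value)
```

The same three steps, with a different statistic and a different way of simulating the data production, cover a single proportion, two proportions, several means, and so on.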

These features have two great benefits: they highlight and repeatedly reinforce the main principle that underlies Fisher’s theory of inference, and they free up time to address issues that matter far more than formulas. I suggest three: embedding, breadth, and acceleration.

  • Embedding formal inference within the process of scientific investigation. Schematically, we can think of an investigation starting with a research question, following Popper’s principle of falsifiability through prediction, designed experiment, statistical hypothesis, p-value, conclusion based on strength of evidence, and most importantly, scope/limitation of evidence. As part of the latter, the efficiency of SBI allows for attention to the relation between data production and the scope/limitation of inferences.
  • Breadth: SBI applies to all classical inference. Whatever the test statistic, regardless of whether it has a known distribution, its behavior under H0 can be estimated by simulation. Moreover, SBI applies in situations where there is no alternative hypothesis.[iii] (A sketch with a nonstandard test statistic follows this list.)
  • Acceleration: a faster track to intermediate methods and models. For some students, the economy and breadth of SBI allow the first course to include such topics as multivariate regression, ANOVA, and logistic regression.
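The breadth claim shows up in code as a one-line change to the earlier sketch: swap in a statistic with no tabled null distribution and nothing else about the simulation changes. The example below (again with invented data and an arbitrary choice of statistic, the ratio of sample standard deviations) is only a sketch of that point.

```python
import numpy as np

rng = np.random.default_rng(seed=2)

# Hypothetical data; the statistic below has no standard tabled null distribution.
group_a = np.array([4.2, 5.1, 6.8, 3.9, 5.5, 7.0])
group_b = np.array([5.0, 5.2, 4.9, 5.1, 5.3, 5.0])

def stat(x, y):
    # Ratio of sample standard deviations -- chosen only to show that the test
    # statistic can be anything; no formula for its null distribution is needed.
    return x.std(ddof=1) / y.std(ddof=1)

observed = stat(group_a, group_b)

# Same recipe as before: simulate the null by permuting the pooled values.
pooled = np.concatenate([group_a, group_b])
n_a = len(group_a)
sims = np.array([
    stat(*np.split(rng.permutation(pooled), [n_a]))
    for _ in range(10_000)
])

p_value = np.mean(sims >= observed)
print(observed, p_value)
```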

Despite these strengths, the current randomization-based introductory course is no panacea. In particular it fails to address two important ways to think about data.[pullquote]I feel confident that SBI, suitably modified, will help lead the science of data and its teaching into a more robust and exciting future. Meanwhile, we need a lot of opinionated people opining.[/pullquote]

B. Two omissions: Bayesian thinking and algorithmic thinking.

As I see it, SBI, in its current version, is guilty of these two major omissions. Both correspond to important ways of thinking with data, ways that complement an approach based on hypothesis tests and confidence intervals, whether traditional or SBI. These omitted ways of thinking are partners, not competitors. (Learning from data is an enterprise too deep and subtle to yield to any single abstract framework.)

  • Bayesian thinking. I agree with Dempster[iv] that there are important roles in data analysis both for Fisher’s approach to testing and for Bayesian posterior intervals. More recently, Bayesian analysis of multilevel models with fixed and random terms has become standard in applied statistics.
  • Algorithmic data analysis. Bayesian and “classical methods” (whether SBI or standard) depend on probability models. Their importance continues, but algorithmic methods that require no probability models, e.g. classification and regression trees (see Breiman et al.[v]) are increasingly central to practice.

With this as background I turn to the potential impact of SBI on the undergraduate curriculum generally, and the undergraduate statistics major in particular.

C. Preview: The statistics major is inevitable.

Many fortunate readers may belong to departments that already offer a major in statistics, but surely many other less fortunate readers belong to departments that teach at most a small handful of statistics courses. Although the department from which I retired now offers a major in statistics, for many years I belonged to the less fortunate group, and so, with first-hand sympathy, I consider implications for a four-tiered hierarchy of students:

  1. students who take only one statistics course and major in subjects that do not use statistics;
  2. students who take only one statistics course but major in subjects that do use statistical thinking;
  3. students who take one or more additional statistics courses but stop short of a major; and
  4. students who major in statistics.

Given my intended emphasis on the last, I go through the first three more quickly, but I suggest that the important advantages they offer accrue also to students who major in statistics.

D. Implications for students who do not major in statistics (but also for those who do)

D1. Implications for general undergraduate education. Consider first the history or music major who takes only the introductory course and does not encounter statistical thinking later in the undergraduate curriculum. The SBI course will be their only academic exposure to statistics. What does SBI offer that a more traditional introductory course does not? Here are four:

  • Not discouraging/deterring those with minimal technical skills;
  • An appreciation for the role of inference in the natural and social sciences and the importance of falsifiability;
  • A sense of the relationship between the process of data production and the scope and limitations of inference; and, more broadly,
  • A sense of the tension between Pascal’s two ways of thinking.[vi]

D2. Implications for natural and social science students who take only the introductory course in statistics. Although this course may be their only one with an explicit and exclusive focus on statistics, it nevertheless offers these majors advantages over the traditional course. In addition to the advantages listed above, it provides the background necessary to understand the need for statistics in their fields and to understand the use of statistics in published research without having to know the mathematical basis for the formulas used. Moreover, depending on local conditions, SBI can offer students of the introductory course experience with ANOVA, multiple regression, and logistic regression.

D3. Implications for students who take additional statistics courses. For students at levels (1) and (2), an SBI course is, to borrow a word made famous by George Carlin, “terminal.” For other students, the SBI course serves as a foundation for more advanced courses. How does it compare with more traditional first courses? Potentially, it offers better preparation for additional courses, but the details will depend on rethinking the intermediate and advanced curriculum. See E below.

E. Additional implications for the undergraduate statistics major

Big data! Business analytics! Bioinformatics! Hype aside, if your department doesn’t yet offer a major in statistics, take heart: it’s just a matter of time. When I first accepted a teaching position in 1974, I knew of only three other statistics PhDs in the whole country who taught at liberal arts colleges. Now, Amherst College alone has three designated tenure track positions in statistics and 24 declared statistics majors. Last year UC Berkeley graduated about 150 statistics majors. The coming tsunami will swamp any resistance.

Meanwhile, we can all celebrate by trying to think aggressively about what opportunities SBI in a first course can offer a program for statistics majors. The opportunities are vast, but the precedents are scant to non-existent. Moreover, as we statisticians know, things vary. At one extreme, some undergraduate programs are similar to some applied masters programs. At another extreme, some undergraduate programs are based on just a handful of statistics courses together with courses in mathematics and perhaps computer science or an area of application. I suggest that consensus will require a decade of curricular experimentation and evolution, and that at this early stage in the process, we can all contribute to a vigorous discussion of goals and principles. In that spirit, I offer three thoughts. Two are aimed at addressing the omissions I described in (B) above. The third is more speculative, more open-ended, and most important.

E1. Bayesian inference: SBI prepares the way.

  • SBI emphasizes a fundamental principle of formal inference – inversion. (See A, Emphasis.) The same principle underlies Bayesian logic.[vii]
  • SBI uses simulation to estimate and understand probabilities. The same thinking can be an entrée for Bayes.[viii]

E2. Algorithmic data analysis. SBI introduces the idea of repetition to assess how well a method behaves. So does algorithmic thinking. Assessing a p-value is similar to assessing the error rates of, say, a classification algorithm for detecting spam. Neyman-Pearson logic is based on Type I and Type II errors, which correspond directly to the error rates – false positives and false negatives – for detecting spam.
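As a small illustration of that parallel, the sketch below uses entirely synthetic data and a deliberately crude one-feature spam rule (both my own inventions, not drawn from any reference above): it repeats random training/test splits and tallies false-positive and false-negative rates, the algorithmic counterparts of Type I and Type II error rates.

```python
import numpy as np

rng = np.random.default_rng(seed=3)

# Synthetic "spam" data: one feature per message (say, a count of suspicious
# words); spam messages tend to have higher counts than legitimate ones.
n = 2000
is_spam = rng.random(n) < 0.3
feature = np.where(is_spam, rng.poisson(6, n), rng.poisson(2, n))

false_pos, false_neg = [], []
for _ in range(200):  # repetition: many random training/test splits
    idx = rng.permutation(n)
    train, test = idx[:1000], idx[1000:]

    # A deliberately crude rule: flag a message as spam if its feature value
    # exceeds the midpoint of the two class means in the training data.
    cutoff = (feature[train][is_spam[train]].mean() +
              feature[train][~is_spam[train]].mean()) / 2
    predicted_spam = feature[test] > cutoff

    actual_spam = is_spam[test]
    false_pos.append(np.mean(predicted_spam[~actual_spam]))  # ham flagged as spam
    false_neg.append(np.mean(~predicted_spam[actual_spam]))  # spam that slips through

print(np.mean(false_pos), np.mean(false_neg))
```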

E3. What else? This is the question that is most determinative of our future, most important for us all to think about, most challenging, and so most in need of creative curricular experimentation. Here are three tentative imperatives:

  • Develop eclectic introductory courses that introduce and contrast Fisherian testing in the context of a scientific investigation, Bayesian estimation, and algorithmic data analysis. Build a statistics curriculum on this base.
  • Develop new intermediate applied courses, e.g., applied Bayes and algorithmic methods, based on the introductory course.
  • Develop a new pair of courses to replace the traditional probability and math stat sequence. Put mathematical theory (probability and old-style math stat theory) in a separate course. Put the abstract logic of contemporary stat theory in a course that does not require probability.

In conclusion, I offer a fourth imperative: Regard the first three as an invitation to disagree and to choose your own path. I feel confident that SBI, suitably modified, will help lead the science of data and its teaching into a more robust and exciting future. Meanwhile, we need a lot of opinionated people opining.

References:

[i] As many readers may know, hypothesis testing has been under attack in recent years, especially by social scientists. Some (e.g., David Trafimow & Michael Marks (2015) Editorial, Basic and Applied Social Psychology, 37:1, 1-2, DOI: 10.1080/01973533.2015.1012991) argue that we should avoid all formal inference in favor of descriptive statistics. In my opinion such a ban is tantamount to outlawing all table saws because some careless users have lost fingers. Others (e.g., Ziliak, Stephen T. and Deirdre N. McCloskey (2009). The Cult of Statistical Significance: How the Standard Error Costs Us Jobs, Justice, and Lives. Ann Arbor: University of Michigan Press) argue that Bayes is the answer. Long term, I expect the introductory course to come to offer an eclectic mix of approaches. See Section B.

[ii] See, e.g., Robert L. Wardrop (1995). Statistics: Learning in the Presence of Variation. Dubuque, IA: Wm. C. Brown, and Ann E. Watkins, Richard L. Scheaffer, and George W. Cobb (2011). Statistics: From Data to Decision, 2nd ed. New York: Wiley.

[iii] See, e.g., Nicholas J. Gotelli and Gary R. Graves (1996). Null Models in Ecology. Smithsonian.

[iv] Arthur P. Dempster (1971), “Model Searching and Estimation in the Logic of Inference,” in Foundations of Statistical Inference (eds. V. P. Godambe and D. A. Sprott). Toronto: Holt, Rinehart and Winston.

[v] Leo Breiman, Jerome H. Friedman, Richard A. Olshen, and Charles J. Stone (1984). Classification and Regression Trees. Boca Raton: Chapman and Hall/CRC.

[vi] Briefly, we might characterize the tension as between abstract deductive thinking and interpretation in context. For a translation of what Pascal actually said, with discussion, see Jacques Barzun (1978) The House of Intellect. Praeger.

[vii] Laplace’s data duplication principle. For a discussion of what Laplace actually said, see Anders Hald (1998). A History of Mathematical Statistics: From 1750 to 1930. New York: Wiley, pp. 160 ff.

[viii] One simulation-based approach to Bayesian inference is based on a Monte Carlo method (“Russian Roulette”; see Herman Kahn (1955). “Use of different Monte Carlo sampling techniques,” Santa Monica: The Rand Corporation. Retrieved from www.rand.org/content/dam/rand/pubs/…/P766.pdf). The algorithm I have used:

(1) Generate a parameter value according to the prior. (2) Simulate a data value y_sim. (3) Compare with the observed data value: if y_sim = y_obs, save the parameter value; otherwise kill it. Repeat (1)-(3) thousands of times. The saved values will follow the posterior distribution.
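A minimal coding of that algorithm, for a hypothetical beta-binomial example, appears below; the prior, the sample size, and the observed count are illustrative choices of mine, not part of the description above.

```python
import numpy as np

rng = np.random.default_rng(seed=4)

# Hypothetical setting: y_obs successes in n binomial trials, with a Beta(2, 2)
# prior on the success probability.  These specifics are illustrative choices;
# the algorithm itself is the one described in the note above.
n, y_obs = 20, 7

kept = []
for _ in range(100_000):
    theta = rng.beta(2, 2)          # (1) generate a parameter value from the prior
    y_sim = rng.binomial(n, theta)  # (2) simulate a data value
    if y_sim == y_obs:              # (3) save the parameter value only on an exact match
        kept.append(theta)

kept = np.array(kept)
# The saved values follow the posterior -- here, analytically, Beta(2 + 7, 2 + 13).
print(len(kept), kept.mean())
```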

