# Undergraduate students

• ### Professional Ethics

This is a chapter on ethics excerpted from the data science book “Modern Data Science with R” by Benjamin J. Baumer, Daniel T. Kaplan, and Nicholas J. Horton. The chapter presents several ethical dilemmas, introduces a framework for evaluating ethical issues, and then revisits the dilemmas to resolve them.

• ### Introduction to SQL

This site is an interactive lesson on SQL. It starts with a simple SELECT query: the user must type the correct command to select certain columns from a database. After completing the first lesson, the user may continue to more complicated lessons.
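
To give a sense of the kind of query the first lesson covers, here is a minimal sketch using Python's built-in `sqlite3` module. The `movies` table, its columns, and its rows are hypothetical and invented for illustration; they are not taken from the site.

```python
import sqlite3

# In-memory database holding a small, made-up table (hypothetical data).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE movies (id INTEGER PRIMARY KEY, title TEXT, director TEXT, year INTEGER)"
)
conn.executemany(
    "INSERT INTO movies (title, director, year) VALUES (?, ?, ?)",
    [
        ("Film A", "Director X", 1995),
        ("Film B", "Director Y", 2003),
        ("Film C", "Director X", 2010),
    ],
)

# A simple SELECT query: pick only two of the table's four columns.
rows = conn.execute("SELECT title, year FROM movies ORDER BY year").fetchall()
for title, year in rows:
    print(title, year)

conn.close()
```

Listing column names after `SELECT` limits the result to those columns; `SELECT *` would return all of them.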

• ### Introduction to Survival Analysis

This site describes the mathematics behind survival analysis. It starts with a definition of the survival function, then discusses estimating the survival function with the Kaplan-Meier curve and comparing survival curves. It ends with a discussion of Cox proportional hazards regression.
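
As a rough illustration of the Kaplan-Meier idea, the sketch below computes the product-limit estimate: at each observed event time, the running survival probability is multiplied by (1 - d/n), where d is the number of events at that time and n is the number of subjects still at risk. The function name `kaplan_meier` and the sample data are assumptions made for this example, not material from the site.

```python
from collections import Counter

def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time.

    times:  observed follow-up time for each subject
    events: 1 if the event was observed at that time, 0 if censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e)
    leaving = Counter(times)  # subjects leaving the risk set (event or censored)
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = deaths.get(t, 0)
        if d:
            # Subjects censored at time t still count as at risk at t.
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        at_risk -= leaving[t]
    return curve

# Hypothetical follow-up times; 0 marks a censored observation.
times = [2, 3, 3, 5, 8, 8, 9, 12]
events = [1, 1, 0, 1, 1, 0, 1, 0]
curve = kaplan_meier(times, events)
for t, s in curve:
    print(t, round(s, 3))
```

Censored subjects never trigger a drop in the curve, but they shrink the risk set, which makes later drops larger.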

• ### Clinical Trials Repository

This site is a government-run repository of information on current and completed clinical trials. Users can search for clinical trials by disease type and by whether the trial is currently recruiting, and each result gives a detailed description of the trial. The site can be used in a classroom setting to discuss design issues and ethical issues in clinical trials.

• ### Body Worn Camera Experiment

This website is a summary of a randomized controlled trial of a metropolitan police department's body-worn camera program. It is useful in class for discussing the design of the experiment and how the authors state their results, which are given as confidence intervals for differences.
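
For readers who have not seen a confidence interval for a difference, here is a minimal sketch of a Wald interval for the difference between two proportions. The counts are invented for illustration and the function name `diff_proportion_ci` is hypothetical; this is neither the study's data nor necessarily the method the study used.

```python
from math import sqrt
from statistics import NormalDist

def diff_proportion_ci(x1, n1, x2, n2, level=0.95):
    """Wald confidence interval for p1 - p2 (large-sample approximation)."""
    p1, p2 = x1 / n1, x2 / n2
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - z * se, diff + z * se

# Hypothetical counts: 120 of 1000 in one group versus 130 of 1000 in the other.
low, high = diff_proportion_ci(120, 1000, 130, 1000)
print(round(low, 4), round(high, 4))
```

An interval that contains zero, as this made-up one does, is consistent with no difference between the groups at the chosen confidence level.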

• ### Sample Size Determination In Research

This is a complete lesson module, including example problems with answers to selected problems, designed to enable students to: 1) provide examples demonstrating how the margin of error, effect size, and variability of the outcome affect sample size computations; 2) compute the sample size required to estimate population parameters with precision; 3) interpret statistical power in tests of hypothesis; and 4) compute the sample size required to ensure high power when hypothesis testing.
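
The two kinds of sample size computation named in these objectives can be sketched as follows. This assumes a one-sample setting with a known standard deviation and a two-sided test, and uses the normal approximation; the function names and the numbers are hypothetical and not taken from the module.

```python
from math import ceil
from statistics import NormalDist

def n_for_margin(sigma, margin, level=0.95):
    """Sample size so a confidence interval for a mean has the given margin of error."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return ceil((z * sigma / margin) ** 2)

def n_for_power(sigma, delta, alpha=0.05, power=0.80):
    """Sample size for a two-sided one-sample z test to detect a mean shift of delta."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return ceil(((z_alpha + z_beta) * sigma / delta) ** 2)

# Hypothetical inputs: standard deviation 15, margin of error 3, effect size 5.
print(n_for_margin(sigma=15, margin=3))
print(n_for_power(sigma=15, delta=5))
```

Both formulas show the same pattern: the required sample size grows with the variability of the outcome and shrinks as the margin of error or effect size gets larger.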

• ### Why Do We Need to Compute the Power of a Test?

When performing a hypothesis test about a population mean, one possible reason for failing to reject the null hypothesis is that the sample is too small for the test to have adequate power. Using a small data set, this resource uses Minitab to check the data for normality, to perform a 1-Sample t test, and to compute Power and Sample Size for 1-Sample t.
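
To illustrate how a small sample limits power, the sketch below approximates the power of a two-sided one-sample test using the normal distribution. This is a simplification: a t test's exact power involves the noncentral t distribution, so the values here will differ somewhat from software output for small samples. The function name `approx_power` and the inputs are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def approx_power(sigma, delta, n, alpha=0.05):
    """Approximate power of a two-sided one-sample test (normal approximation)."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    shift = delta * sqrt(n) / sigma  # standardized true difference
    # Probability the test statistic falls in either rejection region.
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)

# Hypothetical inputs: standard deviation 15, true difference 5.
for n in (10, 30, 71):
    print(n, round(approx_power(sigma=15, delta=5, n=n), 3))
```

With these made-up inputs, a sample of 10 gives power well below one half, so a failure to reject the null hypothesis would say little about whether the difference is real.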

• ### Testing Assumptions: Normality & Equal Variances

A PDF document illustrating a test of normality using the Anderson-Darling test in Minitab and a test of equality of variances using an F-test in Excel.

• ### Data Collection: Information is Beautiful

This site has produced data visualizations on many hot-button topics. It provides the raw data used to create its graphs, kept in Google Docs spreadsheets.

• ### Census Bureau Data Visualization Gallery

The Census Bureau has made many visualizations of the data it collects. The gallery is a good collection of maps, treemaps, an age/sex pyramid, and more familiar graphs such as bar graphs.