Data Presentation

  • Residual plots and other diagnostics are important for deciding whether linear regression is appropriate for a set of data. Many students believe that if the correlation coefficient is strong enough, these diagnostic checks are unnecessary. The data set included in this activity was created to lure students into a situation that looks, on the surface, appropriate for linear regression but is instead based (loosely) on a quadratic function. Key words: regression, residuals
  • This group activity illustrates the concepts of size and power of a test through simulation. Students simulate binomial data by repeatedly rolling a ten-sided die, and they use their simulated data to estimate the size of a binomial test. They carry out further simulations to estimate the power of the test. After pooling their data with that of other groups, they construct a power curve. A theoretical power curve is also constructed, and the students discuss why there are differences between the expected and estimated curves. Key words: Power, size, hypothesis testing, binomial distribution

  • A mathematical word processor that includes an easy-to-use computer algebra system (MuPAD). Products include Scientific Workplace, Scientific Word, Scientific Notebook, and MuPAD Pro. Student versions are available.

  • The activity is designed to help students develop a better intuitive understanding of what is meant by variability in statistics. Emphasis is placed on the standard deviation as a measure of variability. As they learn about the standard deviation, many students focus on the variability of bar heights in a histogram when asked to compare the variability of two distributions. For these students, variability refers to the "variation" in bar heights. Other students may focus only on the range of values, or on the number of bars in a histogram, and conclude that two distributions are identical in variability even when this is clearly not the case. This activity can help students discover that the standard deviation is a measure of the density of values about the mean of a distribution, and become more aware of how clusters, gaps, and extreme values affect the standard deviation. Key words: Variability, standard deviation

  • This group activity focuses on conducting an experiment to determine which of two brands of paper towels is more absorbent by measuring the amount of water absorbed. A two-sample t-test can be used to analyze the data, or simple graphics and descriptive statistics can be used as an exploratory analysis. Students are asked to think about design issues and to write a short report stating their results and conclusions, along with an evaluation of the experimental design. Key words: Two-sample t-test

  • The program DistCalc calculates probabilities and critical values for the most important distributions. Its purpose is to illustrate the concept of critical values and to replace printed distribution tables. The Distribution Calculator offers calculations for the normal distribution, the t distribution, the chi-square distribution, and the F distribution.

  • This program visualizes the effect of outliers on regression lines. The user can pick up a point with the mouse and move it across the chart; the regression line is recalculated after each movement, so the effect is visible immediately. The program, Leverage, lets one experiment with the leverage effect: create a random sample of noisy data points scattered about a line, then drag one of the points away from the regression line and watch the line move to fit the changed data set. The program does not run online; the user has to download it.

  • This program was written to explore the relationship between the data points and the error surface of the regression problem. On the one hand, you can learn how to represent a line in two different spaces (the data space {x,y} and the parameter space {k,d}); on the other hand, you see that solving the regression problem is nothing other than finding the minimum of the error surface.

  • The datasets on this page are classified by analysis technique (ANOVA, Linear Regression, Markov Chain Monte Carlo, Nonlinear Regression, and Univariate Summary Statistics) and by level of difficulty (lower, average, higher). They were originally intended to test statistical software.
  • In this handout, students are asked to compare the ages of terminated employees to the ages of retained employees. Students then use the comparison to decide whether the data support a conclusion of age discrimination.
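
The residual-plot entry at the top of this list makes a point that is easy to check numerically: a strong correlation coefficient does not by itself justify a linear fit. The sketch below is only an illustration of that idea. It does not use the activity's data set; the values (y = x² for x = 1, ..., 10) and the helper names `fit_line` and `correlation` are assumptions made up for this example.

```python
from math import sqrt
from statistics import mean


def fit_line(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def correlation(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return sxy / sqrt(sxx * syy)


# Hypothetical data: an exactly quadratic relationship (NOT the activity's data).
xs = list(range(1, 11))
ys = [x * x for x in xs]

slope, intercept = fit_line(xs, ys)
r = correlation(xs, ys)
residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]

print(f"r = {r:.3f}")
for x, e in zip(xs, residuals):
    print(f"x = {x:2d}  residual = {e:+.1f}")
```

For these made-up values the correlation stays above 0.97, yet the residuals are positive at both ends and negative in the middle. That U-shaped pattern is the kind of signal a residual plot reveals and the correlation coefficient hides.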
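
The size-and-power activity listed above can also be mimicked on a computer. The following is a minimal sketch under stated assumptions, not the activity's actual protocol: the number of rolls (`N_ROLLS = 20`), the null success probability (`P_NULL = 0.1`, one face of a ten-sided die), the rejection rule (reject when the count of successes is at least `CRITICAL = 5`), and all function names are hypothetical choices made for illustration.

```python
import random
from math import comb


def binom_tail(n, p, c):
    """Exact P(X >= c) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(c, n + 1))


def simulate_rejection_rate(n, p_true, c, trials, rng):
    """Fraction of simulated samples in which the test rejects H0."""
    rejections = 0
    for _ in range(trials):
        # One sample: n independent trials, each a success with prob. p_true.
        successes = sum(rng.random() < p_true for _ in range(n))
        if successes >= c:
            rejections += 1
    return rejections / trials


# Hypothetical test setup (assumptions, not taken from the activity).
N_ROLLS = 20    # rolls per sample
P_NULL = 0.1    # success probability under H0
CRITICAL = 5    # reject H0 when successes >= CRITICAL
TRIALS = 5000   # simulated samples per estimate

rng = random.Random(0)  # fixed seed so the run is repeatable

# Size: rejection probability when H0 is true.
size_exact = binom_tail(N_ROLLS, P_NULL, CRITICAL)
size_est = simulate_rejection_rate(N_ROLLS, P_NULL, CRITICAL, TRIALS, rng)
print(f"size: estimated {size_est:.3f}, exact {size_exact:.3f}")

# Power curve: rejection probability across a range of true p values.
power_curve = []
for p in (0.1, 0.2, 0.3, 0.4, 0.5):
    est = simulate_rejection_rate(N_ROLLS, p, CRITICAL, TRIALS, rng)
    exact = binom_tail(N_ROLLS, p, CRITICAL)
    power_curve.append((p, est, exact))
    print(f"p = {p:.1f}: estimated power {est:.3f}, exact {exact:.3f}")
```

The estimated and exact columns differ only by simulation noise, which shrinks as `TRIALS` grows. This mirrors the classroom discussion of why the pooled, die-based power curve does not coincide with the theoretical one.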
