Which is more robust against outliers: mean or median? This app demonstrates the (in)stability of these descriptive statistics as the value of an outlier and the number of data points change.
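The comparison the app makes can be sketched in a few lines of Python. This is a minimal illustration and not the app's own code; the helper `with_outlier` and the sample values are assumptions chosen for the example.

```python
from statistics import mean, median

def with_outlier(n, outlier):
    """Hypothetical helper: n - 1 typical values (1 .. n-1) plus one outlier."""
    return list(range(1, n)) + [outlier]

# Vary both the number of data points and the value of the outlier.
for n in (5, 50):
    for outlier in (10, 1000):
        data = with_outlier(n, outlier)
        print(f"n={n:2d} outlier={outlier:4d} "
              f"mean={mean(data):8.2f} median={median(data):6.2f}")
```

In this sketch the median stays put as the outlier grows, while the mean moves with it, and the movement shrinks as the number of data points increases.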
The Caesar Shift is a translation of the alphabet; for example, a five-letter shift would code the letter a as f, b as g, ... z as e. We describe a five-step process for decoding an encrypted message. First, groups of size 4 construct a frequency table of the letters in two lines of a coded message. Second, students construct a bar chart for a reference message of the frequency of letters in the English language. Third, students create a bar chart of the coded message. Fourth, students visually compare the bar chart of the reference message (step 2) to the bar chart of the coded message (step 3). Based on this comparison, students hypothesize a shift. Fifth, students apply the shift to the coded message. After decoding the message, students are asked a series of questions that assess their ability to see patterns. The questions are geared for higher levels of cognitive reasoning. Key words: bar charts, Caesar Shift, encryption, testing hypotheses
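The five steps above can be sketched in Python. This is a simplified illustration, not the activity's materials: instead of comparing two full bar charts by eye, it assumes the most frequent coded letter stands for e, an assumption that can fail on short messages. The function names and the sample message are hypothetical.

```python
from collections import Counter
import string

ALPHABET = string.ascii_lowercase

def shift_text(text, k):
    """Shift each letter k places forward, wrapping z around to a."""
    out = []
    for ch in text.lower():
        if ch in ALPHABET:
            out.append(ALPHABET[(ALPHABET.index(ch) + k) % 26])
        else:
            out.append(ch)  # leave spaces and punctuation unchanged
    return "".join(out)

def letter_counts(text):
    """Frequency table of the letters in a message (steps 1 and 3)."""
    return Counter(ch for ch in text.lower() if ch in ALPHABET)

def guess_shift(coded):
    """Hypothesize a shift by assuming the tallest bar is the letter e (step 4)."""
    top_letter, _ = letter_counts(coded).most_common(1)[0]
    return (ALPHABET.index(top_letter) - ALPHABET.index("e")) % 26

plain = "meet me here ere the geese flee"   # sample message, rich in e
coded = shift_text(plain, 5)                # five-letter shift: a -> f
k = guess_shift(coded)
print(k, shift_text(coded, -k))             # apply the shift in reverse (step 5)
```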
Explore the functionality of your scientific calculator.
This applet is designed to approximate the value of Pi. It does so by firing random data points at a circle inscribed within a square. The probability of a data point landing within the circle is the ratio of the circle's area to the area of the square, which is Pi/4, so four times the observed proportion of hits estimates Pi.
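A minimal Python sketch of the same idea follows. It is not the applet's code; the function name `estimate_pi`, the unit-radius circle, and the fixed seed are assumptions made for the example.

```python
import random

def estimate_pi(n_points, seed=0):
    """Estimate Pi from random points in the square [-1, 1] x [-1, 1]."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    inside = 0
    for _ in range(n_points):
        x = rng.uniform(-1, 1)
        y = rng.uniform(-1, 1)
        if x * x + y * y <= 1:  # point landed inside the inscribed circle
            inside += 1
    # The hit proportion estimates Pi/4, so multiply by 4.
    return 4 * inside / n_points

for n in (100, 10_000, 1_000_000):
    print(n, estimate_pi(n))
```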
An applet explores the following problem: A long day hiking through the Grand Canyon has discombobulated this tourist. Unsure of which way he is randomly stumbling, 1/3 of his steps are towards the edge of the cliff, while 2/3 of his steps are towards safety. From where he stands, one step forward will send him tumbling down. What is the probability that he can escape unharmed?
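The problem can be explored with a short simulation. The sketch below is an illustration under stated assumptions, not the applet's method: the tourist starts one step from the edge, each step is independent, and a walk is counted as an escape once he is 30 steps from the edge (from there, the chance of ever returning is negligible). The names `escapes` and `estimate_escape` are hypothetical. For reference, the standard gambler's-ruin argument gives a probability of (1/3)/(2/3) = 1/2 of eventually falling, so the estimate should land near 1/2.

```python
import random

def escapes(rng, p_toward_edge=1/3, safe_distance=30):
    """Simulate one walk; return True if the tourist gets safely away."""
    position = 1  # steps from the edge; 0 means he has fallen
    while 0 < position < safe_distance:
        if rng.random() < p_toward_edge:
            position -= 1  # step toward the cliff
        else:
            position += 1  # step toward safety
    return position >= safe_distance

def estimate_escape(trials=5000, seed=0):
    """Proportion of simulated walks that end in escape."""
    rng = random.Random(seed)
    return sum(escapes(rng) for _ in range(trials)) / trials

print(estimate_escape())
```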
Students explore the definition and interpretations of the probability of an event by investigating the long run proportion of times a sum of 8 is obtained when two balanced dice are rolled repeatedly. Making use of hand calculations, computer simulations, and descriptive techniques, students encounter the laws of large numbers in a familiar setting. By working through the exercises, students will gain a deeper understanding of the qualitative and quantitative relationships between theoretical probability and long run relative frequency. In particular, students investigate how close the relative frequency of an event is to its probability and conclude, from data, the rate at which the dispersion of the relative frequency diminishes. Key words: probability, law of large numbers, simulation, estimation
Includes project file for Minitab and coding for a dice rolling simulation.
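For readers without Minitab, the dice experiment can be sketched in Python. This is an illustrative stand-in, not the project's own code; the function name `relative_frequency` and the seed are assumptions. The exact probability is obtained by enumerating the 36 equally likely outcomes.

```python
import random
from fractions import Fraction

# Hand calculation: count outcomes of two balanced dice that sum to 8.
favorable = sum(1 for a in range(1, 7) for b in range(1, 7) if a + b == 8)
exact = Fraction(favorable, 36)

def relative_frequency(n_rolls, seed=0):
    """Proportion of n_rolls simulated two-dice rolls that sum to 8."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_rolls):
        if rng.randint(1, 6) + rng.randint(1, 6) == 8:
            hits += 1
    return hits / n_rolls

print("exact:", exact, float(exact))
for n in (10, 100, 1_000, 10_000, 100_000):
    freq = relative_frequency(n)
    print(n, freq, abs(freq - float(exact)))
```

Printing the absolute difference for each sample size gives a rough view of how the gap between relative frequency and probability shrinks as the number of rolls grows.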
Poses the following problem: Suppose there was one of six prizes inside your favorite box of cereal. Perhaps it's a pen, a plastic movie character, or a picture card. How many boxes of cereal would you expect to have to buy, to get all six prizes?
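A minimal Python sketch of this problem follows, assuming each of the six prizes is equally likely and boxes are independent (the problem statement does not say so explicitly). Under that assumption the expected count is 6/6 + 6/5 + ... + 6/1, and the simulation serves as a check. The function names are hypothetical.

```python
import random
from fractions import Fraction

def expected_boxes(n_prizes=6):
    """Exact expectation: sum of n/k for k = 1 .. n."""
    return sum(Fraction(n_prizes, k) for k in range(1, n_prizes + 1))

def boxes_until_complete(rng, n_prizes=6):
    """Buy boxes until every prize has been seen at least once."""
    seen = set()
    boxes = 0
    while len(seen) < n_prizes:
        seen.add(rng.randrange(n_prizes))  # prize found in this box
        boxes += 1
    return boxes

def average_boxes(trials=20000, seed=0):
    """Average number of boxes over many simulated shoppers."""
    rng = random.Random(seed)
    return sum(boxes_until_complete(rng) for _ in range(trials)) / trials

print(float(expected_boxes()), average_boxes())
```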
Gives some background on the Buffon needle problem. Has a link to an applet that allows one to simulate dropping a needle 1, 10, 100, or 1000 times. One also has control over the length of the needle.
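The needle-dropping experiment can be sketched in Python as follows. This is not the applet's code; the function name `crossing_fraction`, the parameter names, and the default line spacing of 1 are assumptions. For a needle no longer than the gap between lines, the known crossing probability is 2 × length / (Pi × gap).

```python
import math
import random

def crossing_fraction(drops, needle_len=1.0, line_gap=1.0, seed=0):
    """Fraction of dropped needles that cross one of the parallel lines."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(drops):
        # Distance from the needle's center to the nearest line.
        center = rng.uniform(0, line_gap / 2)
        # Acute angle between the needle and the lines.
        angle = rng.uniform(0, math.pi / 2)
        if center <= (needle_len / 2) * math.sin(angle):
            hits += 1
    return hits / drops

for drops in (1, 10, 100, 1000):
    print(drops, crossing_fraction(drops))
print("theory:", 2 * 1.0 / (math.pi * 1.0))
```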
The app lets you see the trade-offs among various types of outlier/anomaly detection algorithms. Outliers are marked with a star and cluster centers with an X.
This UC Berkeley Foundations of Data Science course combines three perspectives: inferential thinking, computational thinking, and real-world relevance. Given data arising from some real-world phenomenon, how does one analyze that data so as to understand that phenomenon? This course teaches critical concepts and skills in computer programming and statistical inference, in conjunction with hands-on analysis of real-world datasets, including economic data, document collections, geographical data, and social networks. It delves into social issues surrounding data analysis such as privacy and design.