Chance News 50: Difference between revisions
Revision as of 14:59, 3 July 2009
Quotation
Probability is a mathematical discipline whose aims are akin to those, for example, of geometry or analytical mechanics. In each field we must carefully distinguish three aspects of the theory: (a) the formal logical content, (b) the intuitive background, and (c) the applications. The character, and the charm, of the whole structure cannot be appreciated without considering all three aspects in their proper relation.
An Introduction to Probability Theory and its Applications
This quotation was found at the Probability Web which our readers will enjoy.
Forsooths
http://graphics8.nytimes.com/images/2009/06/27/opinion/27blowlarge.jpg
The accompanying article is here
Swine flu pandemonium III
"Fourth Connecticut Resident With Swine Flu Dies" by Arielle Levin Becker, The Hartford Courant, June 19, 2009
See Chance News 49 [1] for two earlier stories about the second and third cases of swine flu in Connecticut.
A fourth Connecticut death has been "linked" to swine flu.
The person was between 40 and 49 years old and had underlying medical conditions that increased the risk for serious illness from flu, the state Department of Public health said.
To date there have been 767 confirmed cases of swine flu. Of these, 28 patients were hospitalized, and 19 of the hospitalized were from the largest cities. All four deaths occurred in people with other medical problems who were hospitalized at the time of death.
Here is the data to date about Connecticut deaths from swine flu:
1 death – 395 confirmed cases – June 4
2 deaths – 637 confirmed cases – June 11
3 deaths – 693 confirmed cases – June 15
4 deaths – 767 confirmed cases – June 17
Discussion
1. Would you advise Connecticut residents to move out of large cities to avoid swine flu? Could the victims have been identified as being from the largest cities because they died in large city hospitals, as opposed to having resided in large cities?
2. Would you advise Connecticut residents with swine flu to avoid hospitals?
Submitted by Margaret Cibes
Swine flu pandemonium IV
“Fever, Chills…and Losses: More Companies Should Be Preparing for an Influenza Pandemic”
by Amin Mawani, The Wall Street Journal, June 22, 2009
Employers are urged to prepare for a possibly sizeable increase in employee absenteeism from swine flu, the first pandemic declared by the World Health Organization in 41 years. The World Economic Forum predicts a $500 billion economic impact from the pandemic.
The good news is that employee absenteeism—and its financial toll on employers—may be controlled to a large extent with adequate planning and stockpiling of antiviral medication, masks and gowns.
The bad news is that few companies have taken steps to protect themselves. A 2007 survey reported at a Harvard Business School conference on pandemic planning found that while 88% of companies seemed prepared to deal with a power disruption and 70% with a technological failure, only 13% were prepared for the kind of labor-force disruption that would come with a pandemic.
Companies are advised to use cost-benefit analysis to justify preparedness. A company’s benefits associated with “pandemic preparedness” include “the earnings before interest, taxes, depreciation and amortization … that are preserved because employees aren’t absent. To figure this, managers must establish the contribution employees make to profits. Some of these calculations can get complex ….” A company that is prepared may also have a competitive edge due to reliability in bad times, as well as a reduced likelihood of being liable for negligence in governance.
Submitted by Margaret Cibes
Role of luck in golf
"Winning a Major May Just Be a Matter of Luck", by Jason Turbow, The Wall Street Journal, June 19, 2009
Using data from every PGA leaderboard from 1998 to 2001, two business professors, from the University of North Carolina and Dartmouth, have used "cubic spline functions" to try to explain the role of luck in professional golf.
"If being on the leaderboard at the end of a tournament was due entirely to skill, we would see the same names every week," said [the Dartmouth researcher].
Their model aims to predict an individual's score in a tournament based on an estimate of the person's "intrinsic skill level independent of variables like course difficulty and variations over time." A golfer with a higher tournament score than predicted was considered to have had good luck; one with a lower score was considered to have had bad luck.
[I]n all the events the researchers studied, Mr. Woods was the only golfer to win a tournament despite suffering from negative luck.
The brief article includes a table [2] of expected score, actual score, and "luck factor" for players Tiger Woods, Ernie Els, Vijay Singh, Phil Mickelson, Sergio Garcia, and Jim Furyk, in the 2000 U.S. Open.
Submitted by Margaret Cibes
Baseball: More education, more victories?
"Who Has the Brainiest Team in Baseball?", by Jason Turbow, The Wall Street Journal, June 16, 2009
The author studied "30 team media guides" to try to determine whether there is "a correlation between education and victories" in professional baseball. He compared team standings with players' and managers' undergraduate degrees. He found that only about two dozen major league players or managers had undergraduate degrees.
[T]hree "All-Brains" division leaders -- Oakland, Arizona and Washington -- are in last place in real life, while Texas and the Dodgers were last in their divisions in smarts but first in the standings.
Two bloggers [3] wrote:
(1) "[A]re these results really surprising? The best teams are the best teams because they have more good players than the other teams. Good players are likely to have been A) so talented at baseball as to have little incentive to work hard at school and B) so dedicated to the sport that academics would have suffered. If you're a marginal major league talent like Breslow, it makes sense to get a degree with better earnings potential. Not so for the Alex Rodriguezes [sic] and Barry Bondses [sic] of the world."
(2) "How about instead of looking at university experience, check out something that almost every player (from the U.S., at least) would have: SAT scores. Surely there is the occasional ballplayer with a stratospheric score who still opts for baseball over college."
Submitted by Margaret Cibes
Keeping up with the Joneses by lowering utility bills
“Greening With Envy”, by Bonnie Tsui, The Atlantic, August 2009
Robert Cialdini, a social psychologist at Arizona State University, tried four different hotel towel-reuse signs to see how well guests responded:
The first sign had the traditional message, asking guests to “do it for the environment.” The second asked guests to “cooperate with the hotel” and “be our partner in this cause” (12 percent less effective than the first). The third stated that the majority of guests in the hotel reused towels at least once during their stay (18 percent more effective). The last message was even more specific: it said that the majority of guests “in this room” had reused their towels. It produced a 33 percent increase in response behavior over the traditional message.
As the chief scientist for Positive Energy, Cialdini is now applying what he learned to encouraging utility consumers to conserve energy by letting them know how much energy they use relative to their neighbors. Based on his software’s analysis of a neighborhood’s energy usage, a utility company can send monthly bills to consumers with information about how a particular consumer’s usage compared to that of his/her neighbors. For example, a consumer who used “58 percent less electricity” might receive a row of smiley faces, while one who used “39 percent more” might receive no smiley faces, a notice that it cost him/her $741 extra, and tips for improvement.
In Sacramento,
people who received personalized “compared with your neighbors” data on their statements reduced their energy use by more than 2 percent over the course of a year. … [W]ith the pilot sample of 35,000 homes, it’s the equivalent of taking 700 homes off the grid. And the cost to the utility is minor: for every dollar a utility spends on a solar power plant, it produces 3 to 4 kilowatt-hours; for every dollar a utility spends on the energy reports, it saves 10 times that.
Submitted by Margaret Cibes
A Probability puzzle
The Mathematical Sciences Research Institute has a monthly newsletter called The Emissary. This newsletter has, among other things, puzzles. For the Spring 2009 issue, Puzzle 5 was:
Find three random variables X, Y, Z, each uniformly distributed on [0, 1], such that their sum is constant. (Since each random variable has expectation 1/2 the sum must in fact be 3/2.)
To better understand this puzzle, consider the case of two random variables X and Y with X a random choice on 0 to 1 and Y = 1 - X. Then Y is also uniformly distributed on 0 to 1, and the sum of X and Y is the constant 1.
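The two-variable case can be checked numerically. The sketch below is only an illustration: the function names `sample_pair` and `check_two_variable_case` are hypothetical and do not come from the newsletter. It draws X uniformly, sets Y = 1 - X, and reports whether every sum equals 1 (up to floating-point rounding) together with the sample mean of Y, which should be close to 1/2. It does not attempt the three-variable puzzle.

```python
import math
import random


def sample_pair(rng: random.Random) -> tuple[float, float]:
    """Draw X uniformly on [0, 1) and return the pair (X, 1 - X)."""
    x = rng.random()
    return x, 1.0 - x


def check_two_variable_case(n: int = 10_000, seed: int = 0) -> tuple[bool, float]:
    """Return (whether all sums equal 1 up to rounding, sample mean of Y)."""
    rng = random.Random(seed)
    pairs = [sample_pair(rng) for _ in range(n)]
    sums_constant = all(math.isclose(x + y, 1.0) for x, y in pairs)
    mean_y = sum(y for _, y in pairs) / n
    return sums_constant, mean_y


if __name__ == "__main__":
    print(check_two_variable_case())
```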
Comment: This puzzle is due to Thomas Colhurst
Submitted by Laurie Snell
The Ted Talks
At the Ted Talks website we read:
Each year, the world's leading thinkers and doers gather in for an event many describe as the highlight of their year. Attendees have called it "The ultimate brain spa," "Davos for optimists" and "A four-day journey into the future, in the company of those creating it." This event is called TED, and it's truly a conference like no other.
"It was incredible." Malcolm Gladwell
"A mind-opening experience." Amy Tan
"One of the highlights of my entire life." Billy Graham
"I've never experienced anything remotely like it." Jeffrey Katzenberg
"The combined IQ of the attendees is incredible." Bill Gates
Of course we are interested in statistics or probability talks. We found two statistics talks, one by Hans Rosling and another by Peter Donnelly. You can listen to Rosling's talk here.
From the Ted Talks website we read:
Even the most worldly and well-traveled among us will have their perspectives shifted by Hans Rosling. A professor of global health at Sweden’s Karolinska Institute, his current work focuses on dispelling common myths about the so-called developing world, which (he points out) is no longer worlds away from the west. In fact, most of the third world is on the same trajectory toward health and prosperity, and many countries are moving twice as fast as the west did.
What sets Rosling apart isn’t just his apt observations of broad social and economic trends, but the stunning way he presents them. Guaranteed: You’ve never seen data presented like this. In Rosling’s hands, data sings. Trends come to life. And the big picture — usually hazy at best — snaps into sharp focus.
We did indeed find his talk amazing.
You can listen to Peter Donnelly's talk here.
From the Ted Talks website we read:
Oxford mathematician Peter Donnelly reveals the common mistakes humans make in interpreting statistics -- and the devastating impact these errors can have on the outcome of criminal trials.
Peter begins with a couple of jokes which we did not find all that funny:
Statisticians are people who like figures but do not have the personality to become accountants
How do you tell the introverted statistician from the extroverted statistician? The extroverted statistician is the one who looks at the other person's shoes.
Peter's approach is to provide a simple probability problem and then to show that the method used to solve this problem is the same as the method to solve real life problems.
His simple problem (which we shall see is not so simple) can be described as follows. If you toss a coin three times there are eight possible outcomes (patterns): HHH, HTT, HHT, HTH, TTT, TTH, THH, THT. For our game Peter and Paul each choose one of these eight patterns. Let's assume that Peter chooses HTT and Paul chooses HTH. Then we toss a coin a sequence of times and the first player whose pattern occurs wins. Most people would say that the probability that Paul wins is 1/2. For this particular pair that happens to be right, since once HT has appeared the next toss decides the game, but the two patterns do not take equally long to appear on average (8 tosses for HTT, 10 for HTH), and for many other pairs, such as THH against HHT, the chances are far from even. There is a huge literature on finding the probability that Peter wins, and the expected time until a particular pattern appears.
We discussed this problem in Chance News 4.12 and I wrote
My favorite way to solve this pattern problem is due to S. R. Li (1980), A martingale approach to the study of occurrence of sequence patterns in repeated experiments, Annals of Probability, vol. 8, pp. 1171-1176. Here is how Li solves the problem.
Suppose you want to find how long, on average, it takes to get the pattern HHH for the first time. Consider the following casino game. A coin is tossed a sequence of times and before each toss a gambler with $1 enters the game. He bets $1 that heads will come up on the next toss. If he loses, he leaves. If he wins, he bets his $2 that heads will come up on the next toss. If he loses, he leaves. If he wins, he bets his $4 that heads will come up next time. If he loses, he leaves. If he wins the pattern HHH has occurred, the game is over and he has won $8.
If the pattern HHH occurred for the first time on the kth toss, the gambler who entered before the kth toss won $2, the gambler who entered before the (k-1)st toss won $4 and the gambler who entered before the (k-2)nd toss won $8. Everyone else lost their original dollar bet. Thus the casino paid out $2+$4+$8 = $14.
Now each bet of every gambler is a fair bet so the overall game is fair. Thus the expected amount the casino takes in equals the expected amount they pay out. (Here's where you use Martingale Theory). The casino took in $1 from each gambler, so the expected amount the casino took in equals the expected number of tosses required to first get the pattern HHH. Since they paid out $14 this expected time is 14. If we had been trying to get the pattern HTH the next to last gambler would have bet on heads and lost, and so the casino had to pay out only $8 + $2 = $10. Thus the expected time to reach the pattern HTH for the first time is only 10. For THH both of the last two gamblers would bet on tails and lose, so the expected time to reach THH is only 8.
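The casino argument reduces to a short rule: the gambler who bet on the final k tosses is still in the game, holding $2^k, exactly when the last k symbols of the pattern agree with its first k symbols. The following is a minimal sketch of that rule for a fair coin. The function names `expected_wait` and `simulated_wait` are hypothetical, chosen for illustration, and the simulation is only a sanity check, not part of Li's argument.

```python
import random


def expected_wait(pattern: str) -> int:
    """Total casino payout when `pattern` first appears (fair coin).

    The gambler who bet on the final k tosses holds $2**k exactly when
    the last k symbols of the pattern equal its first k symbols.
    """
    n = len(pattern)
    return sum(2 ** k for k in range(1, n + 1) if pattern[-k:] == pattern[:k])


def simulated_wait(pattern: str, trials: int = 20_000, seed: int = 0) -> float:
    """Monte Carlo estimate of the mean number of tosses until `pattern` appears."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        recent = ""  # the most recent len(pattern) tosses
        tosses = 0
        while recent != pattern:
            recent = (recent + rng.choice("HT"))[-len(pattern):]
            tosses += 1
        total += tosses
    return total / trials


if __name__ == "__main__":
    for p in ("HHH", "HTH", "THH"):
        print(p, expected_wait(p), round(simulated_wait(p), 2))
```

For the three patterns discussed above the rule gives 14, 10 and 8, matching the payouts computed by hand.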
You can also find a nice discussion of this coin tossing problem by Martin Gardner, (1974) Mathematical games, Sci. Amer. 10, 120-125. Here you will find an elegant combinatorial solution to this coin tossing problem, due to John Conway. This article is also included in Gardner's book "Time Travel and Other Mathematical Bewilderments" and in some of his other books. You can also see discussions of Penney's problem in Introduction to Probability by Grinstead and Snell on pages 428 and 432.
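Conway's combinatorial solution is not spelled out here, so the following is a sketch of the rule as it is usually stated, for a fair coin and two distinct patterns of the same length. For patterns X and Y, the "leading number" of X over Y adds 2^(k-1) for every k such that the last k symbols of X equal the first k symbols of Y. The odds that B appears before A are then (AA - AB) : (BB - BA). The function names are hypothetical.

```python
from fractions import Fraction


def leading_number(x: str, y: str) -> int:
    """Add 2**(k-1) for each k where the last k symbols of x equal the first k of y."""
    n = min(len(x), len(y))
    return sum(2 ** (k - 1) for k in range(1, n + 1) if x[-k:] == y[:k])


def win_probability(a: str, b: str) -> Fraction:
    """Probability that pattern b appears before pattern a (fair coin, a != b)."""
    odds_b = leading_number(a, a) - leading_number(a, b)
    odds_a = leading_number(b, b) - leading_number(b, a)
    return Fraction(odds_b, odds_a + odds_b)


if __name__ == "__main__":
    print(win_probability("HHT", "THH"))  # THH against HHT
    print(win_probability("HHH", "THH"))  # THH against HHH
    print(win_probability("HTT", "HTH"))  # Paul's HTH against Peter's HTT
```

Under this rule THH beats HHT with probability 3/4 and beats HHH with probability 7/8, while for Peter's HTT against Paul's HTH the odds come out even.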
Donnelly finishes his talk by discussing how this Penney's problem has been used in his field of research, DNA, and the role of DNA in the courts, and he illustrates this with the Sally Clark case that we discussed in Chance News 11.01.
Contributed by Laurie Snell
Fraud in Iranian election?
The devil is in the digits
Washington Post, 20 June 2009
Bernd Beber and Alexandra Scacco
Beber and Scacco are doctoral students in political science at Columbia University. In this article they argue that certain patterns in the reported electoral totals from this month's Iranian presidential elections give strong indications of tampering. Iran's Ministry of the Interior released data for 29 provinces, and the authors examined the reported vote totals for the four main candidates, Ahmadinejad, Mousavi, Karroubi and Mohsen Rezai. Among these 116 numbers, the authors focus on the last two digits, which they assert should be uniformly distributed. However, they report two statistical irregularities. First, regarding the final digits, they write
We find too many 7s and not enough 5s in the last digit. We expect each digit (0, 1, 2, and so on) to appear at the end of 10 percent of the vote counts. But in Iran's provincial results, the digit 7 appears 17 percent of the time, and only 4 percent of the results end in the number 5. Two such departures from the average -- a spike of 17 percent or more in one digit and a drop to 4 percent or less in another -- are extremely unlikely. Fewer than four in a hundred non-fraudulent elections would produce such numbers.
Next, they considered the last two digits together, and asked how many of the pairs contain non-adjacent digits (e.g., 32 has adjacent digits while 35 has non-adjacent digits). They report that only 62% of the pairs had non-adjacent digits, compared with the 70% that would be expected for random digits.
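The 70% benchmark can be reproduced by enumerating all 100 equally likely digit pairs. The sketch below rests on an assumed reading that is not stated in the summary above: a pair counts as "non-adjacent" only if its digits are also distinct, and 0 and 9 are treated as adjacent. The function name and its two flags are hypothetical, included so that other readings can be compared.

```python
from itertools import product


def fraction_non_adjacent(wrap: bool = True, allow_identical: bool = False) -> float:
    """Fraction of the 100 equally likely digit pairs counted as non-adjacent.

    wrap: treat 0 and 9 as adjacent (assumption).
    allow_identical: count pairs such as 33 as non-adjacent (assumption).
    """
    def adjacent(a: int, b: int) -> bool:
        gap = abs(a - b)
        return gap == 1 or (wrap and gap == 9)

    count = 0
    for a, b in product(range(10), repeat=2):
        if adjacent(a, b):
            continue
        if a == b and not allow_identical:
            continue
        count += 1
    return count / 100


if __name__ == "__main__":
    print(fraction_non_adjacent())
    print(fraction_non_adjacent(wrap=False))
    print(fraction_non_adjacent(allow_identical=True))
```

Only the default reading gives 0.70. Dropping the wrap-around gives 0.72, and counting identical digits as non-adjacent gives 0.80.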
Further investigations by Beber and Scacco, this time involving county level data (an average province in Iran contains about 12 counties), are discussed on Andrew Gelman's blog. At the county level, the last digits do not look suspicious, and the county results do add up to the reported province totals. Beber and Scacco speculate that if the province totals were indeed fabricated, as their earlier analysis suggests, then the county totals could have been made to match as follows: the first few digits of each county could be padded to get the total in the right ballpark, and then a fine adjustment of the last digits of just one county would be required to match the province figure. Indeed, they cite discussion of work by Walter Mebane suggesting that the leading digits do look suspicious. Mebane has been regularly updating his analysis online.
A good collection of links related to this discussion from Pollster.com is available here.
Submitted by Bill Peterson, based on posts by Nancy Boynton and others to the Isolated Statisticians mailing list.
What’s good for the goose?
“NYC Issues Geese Evictions”, by Martha T. Moore, USA Today, June 22, 2009
The U.S. Department of Agriculture Wildlife Services has begun removing, for euthanization, about 2,000 geese from areas near New York City’s two airports, at a cost of about $100,000. In addition to the euthanization program, a Port Authority spokesperson stated that it was also “training airport employees to use a shotgun” as a “last resort.”
This program is apparently a response to the January incident in which Canada geese “hit” US Airways Flight 1549, “shutting down the jet's engines and forcing the pilot to ditch in the Hudson River.” Not only were there no fatalities in this incident, but also, according to the NTSB, the geese were not local.
According to FAA data, while the last airline fatality from a bird occurred in Boston in 1960, “the average annual number of large bird strikes has increased 62% since the 1990s.”
Bloggers suggest that, at the very least, the geese could provide food for needy people.[4]
Discussion
1. The last airline fatality from a bird occurred almost 50 years ago. If geese were only a problem with respect to airline safety, might this program be “overkill”? What information and/or data might you need in order to decide whether to support this program?
2. The article tells us that the average annual number of large bird strikes at airplanes has increased by 62% since the 1990s. What additional information and/or data might you need in order to decide whether an increase of 62% is significant, statistically or otherwise?
Submitted by Margaret Cibes
Framing choices
“About Time: Regulation Based On Human Nature”
by Jason Zweig, The Wall Street Journal, June 20, 2009
In this column, Zweig writes about the need to provide consumers with clear, understandable options in choosing among complicated products such as mortgages.
He refers to the 2009 revised edition of “Nudge: Improving Decisions About Health, Wealth, and Happiness”, by Richard Thaler (University of Chicago) and Cass Sunstein (Harvard Law School), who believe that financial institutions should be required to offer at least some “generic” mortgage plans that would make comparison-shopping easier.
The central idea in “Nudge” is what Profs. Thaler and Sunstein call “choice architecture” – the context, format and framing of how decisions are presented to consumers. You will eat more nuts from a big bowl than from a small bowl. You will choose surgery if you are told it offers a 90% chance of survival; you will reject it if you are told there is a 10% chance it will kill you. The same people who would skip investing in a 401(k) if they had to "opt in" to the plan will participate if they have to "opt out" in order to skip it.
Submitted by Margaret Cibes
More on Infuse and Kuklo II
Here is more on the Infuse and Kuklo II story that appeared in Chance News 49.
From Orthopedics This Week [5] we learn that "Medtronic has finally told Senator Charles Grassley how much it paid former Army surgeon Timothy Kuklo, M.D. Over $850,000 between 2001 and 2009." The article goes on to say, "Medtronic continues to dribble out details that raise more questions than answers." The article concludes with "Machiavelli advised his prince to get all the bad news out at once and dribble out the good news. It would be good advice for Medtronic to heed."
Submitted by Paul Alper