Chance News 47

From ChanceWiki

Quotations

It is remarkable that a science which began with the consideration of games of chance should have become the most important object of human knowledge.


Pierre Simon Laplace
Théorie Analytique des Probabilités, 1812

Forsooths

Their visual acuity is only fractionally – not geometrically – better than that of the common primates from which they were engineered.

Dean Koontz, Seize the Night, 1999

When do you draw the line? When do you take action to avoid that logarithmic point where things take off exponentially?

Burkhard Bilger, "Swamp Things,"
The New Yorker, April 20, 2009

Submitted by Margaret Cibes


Today is President Barack Obama's 100th day in office. How would you grade his performance thus far?

http://www.dartmouth.edu/~chance/forwiki/pie.jpg

Of the 33,223 voters in this online, voluntary poll, President Obama received a grade of "F" from 79% of the participants, that is, 26,317 voters. The colors in the pie chart represent a grade of "A+", "A", "A-," etc., until the color grey, which refers to the grade of "F".

Submitted by Paul Alper


Elections

Andrew Gelman has an interesting article regarding the statistics of elections. He starts with

“Presidential elections have been closer in the past few decades than they were for most of American history. Here's a list of all the U.S. presidential elections that were decided by less than 1% of the vote:

1880
1884
1888
1960
1968
2000

Funny, huh? Other close ones were 1844 (decided by 1.5% of the vote), 1876 (3%), 1916 (3%), 1976 (2%), 2004 (2.5%).

Four straight close elections in the 1870s-80s, five close elections since 1960, and almost none at any other time.”

Perhaps more interesting is his take on House and Senate elections:

http://www.dartmouth.edu/~chance/forwiki/elections1.gif

In the companion piece written with Nate Silver, the graphic concerning the House is made more evident here:

http://www.dartmouth.edu/~chance/forwiki/elections2.gif

From the graph, “the rate of close elections in the House has declined steadily over the century. If you count closeness in terms of absolute votes rather than percentages, then close elections become even rarer, due to the increasing population. In the first few decades of the twentieth century, there were typically over thirty House seats each election year that were decided by less than 1000 votes; in recent decades it's only been about five in each election year.”

Another way of putting it: “Consider that, in the past decade, there were 2,175 elections to the United States House of Representatives held on Election Days 2000, 2002, 2004, 2006 and 2008. Among these, there were 41 instances — about 1.9 percent — in which the Democratic and Republican candidates each received 49 percent to 51 percent of the vote (our calculations exclude votes cast for minor parties). In the 1990s, by contrast, there were 65 such close elections. And their number increases the further one goes back in time: 88 examples in the 1950s, 108 in the 1930s, 129 in the 1910s.”

Discussion

1. A contention mentioned in the NYT article for this bifurcation of opinion is: “as the economy has become more virtual, individuals can now choose where to live on an ideological rather than an occupational basis: a liberal computer programmer in Texas can settle in blue Austin, and a conservative one in the ruby-red suburbs of Houston.” Argue for and against this assertion of coupling ideology and occupational mobility.

2. Gelman and Silver end their NYT article with “Elections like those in New York’s 20th district or in Minnesota, as contentious as they are, actually hark back to a less divisive era in American politics.” Explain the seeming paradox of a close race indicating less divisiveness.

3. As of this posting, the Minnesota Senate seat remains unfilled despite a manual recount, a canvassing board review and a so-called election contest; it now awaits the results of the appeal to the Minnesota Supreme Court, and possibly further appeals. See When Does ‘Close’ Become Too-Close-to-Call? for an analysis of error rates and of how likely it is that the real winner is Norm Coleman rather than Al Franken, who currently leads by 312 votes out of about 2.9 million cast. The Minnesota Supreme Court will hear oral arguments on June 1, 2009.

Submitted by Paul Alper

Bayes theorem in the news

In a letter to the editor of the Miller-McCune magazine, headed "For those of You Who Paid Attention in Statistics Class" [1], Professor Howard Wainer, of The Wharton School, writes about the "mathematical reality ... that so long as only a very small minority of people commit crimes and the criminal justice system is fair ... there will always be a very large proportion of innocent people convicted." He provides a hypothetical example, with calculations based on Bayes theorem.

Some interesting challenges to his assumptions, both from the editor and online bloggers, are included [2]. Professor Wainer's response to an editor's comment includes the following medical example.

Each year in the U.S., 186,000 women are diagnosed, correctly, with breast cancer. Mammograms identify breast cancers correctly 85 percent of the time. But 33.5 million women each year have a mammogram and when there is no cancer it only identifies such with 90 percent accuracy. Thus if you have a mammogram and it results in a positive (you have cancer) result, the probability that you have cancer is: 186,000/(186,000+3.35 million) = 4%.

So if you have a mammogram and it says you are cancer free, believe it. If it says you have cancer, don't believe it.

The only way to fix this [is to] reduce the denominator. Women less than 50 (probably less than 60) without family history of cancer should not have mammograms.

Discussion

1. Consider Professor Wainer's calculation of the probability (4%) of having cancer given a positive mammogram result. Do you agree with the numbers used?

2. Consider Professor Wainer's advice to a person who receives a diagnosis of being cancer free. What is the probability of being cancer free if one receives a negative mammogram result? Do you agree with his advice?

3. Comment on Professor Wainer's advice for "women less than 50."
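As an aid to the first two questions, here is a minimal Python sketch of the Bayes calculation. It rests on assumptions the letter does not state outright: that all 186,000 women with cancer are among the 33.5 million screened, that "85 percent" is the sensitivity of the test, and that "90 percent" is its specificity. The function name `screening_outcomes` is made up for this illustration.

```python
def screening_outcomes(n_screened, n_with_disease, sensitivity, specificity):
    """Return (P(disease | positive), P(no disease | negative)) from counts."""
    n_healthy = n_screened - n_with_disease
    true_pos = sensitivity * n_with_disease    # cancers correctly flagged
    false_neg = n_with_disease - true_pos      # cancers missed
    true_neg = specificity * n_healthy         # healthy women correctly cleared
    false_pos = n_healthy - true_neg           # healthy women wrongly flagged
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Figures quoted in the letter; their roles are assumed as described above.
ppv, npv = screening_outcomes(33_500_000, 186_000, 0.85, 0.90)

# The ratio exactly as written in the letter, for comparison.
letter_ratio = 186_000 / (186_000 + 3_350_000)

print(f"P(cancer | positive mammogram)    = {ppv:.1%}")
print(f"P(no cancer | negative mammogram) = {npv:.2%}")
print(f"Ratio as written in the letter    = {letter_ratio:.1%}")
```

Under these assumptions the numerator counts only the cancers the test detects, and the denominator counts only healthy women among the false positives, which is one way to probe whether the numbers in the letter are the right ones to use.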

Submitted by Margaret Cibes

Cell Phones Risky During Pregnancy?

by Daniel J. DeNoon, WebMD [3]

UCLA researcher Jorn Olsen reported to WebMD that:

Kids whose mothers used cell phones while pregnant had a 54% higher risk of behavior problems -- emotional problems, hyperactivity, conduct problems, and peer problems. Kids who used cell phones themselves had an 18% higher risk of behavior problems. And kids with both exposures had an 80% higher risk of behavior problems.

The researchers are said to have been surprised by the strength of the effect, which "did not go away," even after controlling for smoking, social issues, alcohol use, and mothers' psychiatric diagnoses. According to Olsen, "This study just raises suspicion. It does not indicate a strong association, but calls for caution in using cell phones during pregnancy and early childhood."

Olsen says that more research is needed, as does a wireless telephone industry trade group representative, Joseph E. Farren, who tells WebMD that "The overwhelming majority of studies that have been published in scientific journals around the globe show that wireless phones do not pose a health risk."

The study was reported in the July issue of the journal Epidemiology.

Submitted by Margaret Cibes

Financial planning via Monte Carlo simulation

Odds-On Imperfection: Monte Carlo Simulation by Eleanor Laise, The Wall Street Journal, May 2, 2009 [4]

The article discusses how Monte Carlo methods used by many financial planning firms and individuals have consistently underestimated the odds of extreme market events. While use of this mathematical tool "attracts clients and boosts fee income," it appears to have given investors "a false sense of security."

"Here is how a typical Monte Carlo retirement-planning tool might work: The user enters information about his age, earnings, assets, retirement-plan contributions, investment mix and other details. The calculator crunches the numbers on hundreds or thousands of potential market scenarios, guided by assumptions about inflation, volatility and other parameters.

"It then spits out a "success rate," which shows the percentage of market scenarios in which the investor had money remaining at the end of his estimated life span. In many cases, the consequences of failure – say, running out of money at age 80 – aren't laid out.

"Many providers of the tools argue that it is a significant improvement over the traditional retirement-planning approach, which typically involves assuming some set market return, say 8% for U.S. stocks, year after year, an assumption considered unrealistic by academics and financial pros."

Three factors that affect the outcome of a simulation are identified. One is the choice of assumptions for the scenarios. A second is the relatively small number of scenarios ordinarily run on a model (hundreds or thousands, instead of hundreds of thousands).

A third is the shape of the distribution of market returns: Should it be bell-shaped or "fat-tailed"? A Morningstar Inc. analyst has reported that "While a bell-curve model indicates there is almost no chance of a greater than 13% monthly decline in the Standard & Poor's 500-stock index, such declines have happened at least 10 times since 1926." A number of financial analysts are "considering offering clients Monte Carlo scenarios that incorporate fatter-tailed distributions." A graph is provided of the "fat-tailed" distribution of a hypothetical retiree's anticipated market return. [5].
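The three factors can be illustrated with a small simulation. The following is only a sketch under stated assumptions, not the method of any planning firm: every number (starting balance, withdrawal, mean return, volatility) is hypothetical, and a rescaled Student-t distribution with 3 degrees of freedom is used as a stand-in for a "fat-tailed" model. The function names are invented for this example.

```python
import math
import random


def draw_return(rng, mean, sd, fat_tailed, df):
    """Draw one annual return from a normal or a rescaled Student-t model."""
    z = rng.gauss(0.0, 1.0)
    if fat_tailed:
        # Student-t variate: a normal divided by sqrt(chi-square / df).
        chi2 = rng.gammavariate(df / 2.0, 2.0)
        t = z / math.sqrt(chi2 / df)
        # Rescale to unit variance so both models share the same sd (needs df > 2).
        z = t * math.sqrt((df - 2.0) / df)
    # A portfolio cannot lose more than 100% in one year.
    return max(mean + sd * z, -1.0)


def success_rate(n_scenarios, years=30, balance=1_000_000.0, withdrawal=50_000.0,
                 mean=0.06, sd=0.15, fat_tailed=False, df=3.0, seed=0):
    """Fraction of simulated scenarios with money left after `years` years."""
    rng = random.Random(seed)
    successes = 0
    for _ in range(n_scenarios):
        b = balance
        for _ in range(years):
            # Withdraw at the start of the year, then apply that year's return.
            b = (b - withdrawal) * (1.0 + draw_return(rng, mean, sd, fat_tailed, df))
            if b <= 0.0:
                break
        if b > 0.0:
            successes += 1
    return successes / n_scenarios


def standard_error(p, n_scenarios):
    """Approximate Monte Carlo standard error of an estimated success rate."""
    return math.sqrt(p * (1.0 - p) / n_scenarios)


if __name__ == "__main__":
    for n in (500, 50_000):
        p_normal = success_rate(n)
        p_fat = success_rate(n, fat_tailed=True)
        print(n, round(p_normal, 3), round(p_fat, 3),
              round(standard_error(p_normal, n), 4))
```

Changing `mean`, `sd` or `withdrawal` shows how strongly the reported "success rate" depends on the assumptions, and the standard error shrinks only with the square root of the number of scenarios, which is why a few hundred runs give a noticeably noisier answer than hundreds of thousands.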

Submitted by Margaret Cibes

Friendship

Except for Voltaire who famously (albeit, possibly apocryphally) said, “Lord, protect me from my friends; I can take care of my enemies,” few doubt the benefits of having friends.

From Tara Parker-Pope we find some surprising side effects of friendship. She suggests looking at an Australian study which “found that older people with a large circle of friends were 22 percent less likely to die during the study period than those with fewer friends.” Further, “last year, Harvard researchers reported that strong social ties could promote brain health as we age.”

She also refers to a 2006 study of nearly 3000 nurses with breast cancer which “found that women without close friends were four times as likely to die from the disease as women with 10 or more friends. And notably, proximity and the amount of contact with a friend wasn’t associated with survival. Just having friends was protective.” She closes her article with a quote from the director of the center for gerontology at Virginia Tech: “People with stronger friendship networks feel like there is someone they can turn to. Friendship is an undervalued resource. The consistent message of these studies is that friends make your life better.”

Discussion

1. Parker-Pope also mentioned researchers here who “studied 34 students at the University of Virginia, taking them to the base of a steep hill and fitting them with a weighted backpack. They were then asked to estimate the steepness of the hill. Some participants stood next to friends during the exercise, while others were alone. The students who stood with friends gave lower estimates of the steepness of the hill. And the longer the friends had known each other, the less steep the hill appeared.” In fact, three of the 34 were excluded because they were deemed outliers. The participants estimated the slant via three different methods as can be seen in the figure below:

http://www.dartmouth.edu/~chance/forwiki/Fig.3.jpg

The “haptic” measurement “required adjusting a tilt board with a palm rest to be parallel to the hill, importantly, without looking at one’s hand.” As can be seen from the above figure, it appears to be more accurate than either the “verbal” estimate, which is merely a guess, or the “visual” estimate, for which a (presumably crude) disk-like device acted as an aid.

The researchers performed a two-way ANOVA (sex and social support) separately for each of the three measuring methods. They reported the value of each F(1,27) to determine a p-value for each method to see if Friend compared to Alone is statistically significant. So, why the number “27”? From merely looking at the figure, which of the three methods for determining slant would appear to be unrelated to friendship?

2. The above study took place in Virginia. In Plymouth, England the researchers did a similar slant study but this time instead of friendship directly, imagining of support was tested as can be seen from the following figure:

http://www.dartmouth.edu/~chance/forwiki/Fig.4.jpg

This study had 36 participants and similarly to the first study, they did a two-way ANOVA (sex and imagery of support) leading to F(2,30) for each slant measuring technique. So, why the “2” and the “30”? From merely looking at the figure, which of the three methods for determining slant would appear to be unrelated to imagery of support?

3. In either study, “visual” or “verbal” on average markedly overstate the slant of the hill. What does that suggest about people’s ability to judge a task?

4. The researchers admit that for either study, “Participants in this study were not randomly assigned.” Why would this pose a problem?

5. To give Voltaire his due, Parker-Pope points out that “A large 2007 study showed an increase of nearly 60 percent in the risk for obesity among people whose friends gained weight.”

Submitted by Paul Alper

Helping Doctors and Patients Make Sense of Health Statistics

Gerd Gigerenzer, Wolfgang Gaissmaier, Elke Kurz-Milcke, Lisa M. Schwartz, and Steven Woloshin, Psychological Science in the Public Interest, Volume 8, Number 2, November 2007

Milton Eisner, Health Statistician at the National Cancer Institute, wrote: "Please be sure this excellent, provocative article is mentioned in Chance News."

The article begins with a summary that indicates what the authors will show in the rest of the article.

Many doctors, patients, journalists, and politicians
alike do not understand what health statistics mean or draw wrong conclusions without noticing. Collective statistical illiteracy refers to the widespread inability to understand the meaning of numbers. For instance, many citizens are unaware that higher survival rates with cancer screening do not imply longer life, or that the statement that mammography screening reduces the risk of dying from breast cancer by 25% in fact means that 1 less woman out of 1,000 will die of the disease. We provide evidence that statistical illiteracy (a) is common to patients, journalists, and physicians; (b) is created by nontransparent framing of information that is sometimes an unintentional result of lack of understanding but can also be a result of intentional efforts to manipulate or persuade people; and (c) can have serious consequences for health.

The causes of statistical illiteracy should not be attributed to cognitive biases alone, but to the emotional nature of the doctor–patient relationship and conflicts of interest in the healthcare system. The classic doctor–patient relation is based on (the physician’s) paternalism and (the patient’s) trust in authority, which make statistical literacy seem unnecessary; so does the traditional combination of determinism (physicians who seek causes, not chances) and the illusion of certainty (patients who seek certainty when there is none). We show that information pamphlets, Web sites, leaflets distributed to doctors by the pharmaceutical industry, and even medical journals often report evidence in nontransparent forms that suggest big benefits of featured interventions and small harms. Without understanding the numbers involved, the public is susceptible to political and commercial manipulation of their anxieties and hopes, which undermines the goals of informed consent and shared decision making.

What can be done? We discuss the importance of teaching statistical thinking and transparent representations in primary and secondary education as well as in medical school. Yet this requires familiarizing children early on with the concept of probability and teaching statistical literacy as the art of solving real-world problems rather than applying formulas to toy problems about coins and dice. A major precondition for statistical literacy is transparent risk communication. We recommend using frequency statements instead of single-event probabilities, absolute risks instead of relative risks, mortality rates instead of survival rates, and natural frequencies instead of conditional probabilities. Psychological research on transparent visual and numerical forms of risk communication, as well as training of physicians in their use, is called for.

Statistical literacy is a necessary precondition for an educated citizenship in a technological democracy. Understanding risks and asking critical questions can also shape the emotional climate in a society so that hopes and anxieties are no longer as easily manipulated from outside and citizens can develop a better-informed and more relaxed
attitude toward their health.

The authors then explain why they have these concerns, using studies they and others have carried out. Here is an example relevant to Chance News:

Do Journalists Help the Public to Understand Health Statistics?

The press has a powerful influence on public perceptions of
health and health care; much of what people—including many physicians—know and believe about medicine comes from the print and broadcast media. Yet journalism schools tend to teach everything except understanding numbers. Journalists generally receive no training in how to interpret or present medical research (Kees, 2002). A survey of health reporters at daily newspapers in five Midwestern states (70% response rate) found that over 80% had no training in covering health news or interpreting health statistics (Voss, 2002). Not surprisingly, few (15%) found it easy to interpret statistical data, and under a third found it easy to put health news in context. This finding is similar to that of a survey by the Freedom Forum, in which nearly half of the science writers agreed that ‘‘reporters have no idea how to interpret scientific results’’ (Hartz & Chappell, 1997).

The American Association for the Advancement of Science (AAAS) asked more than 1,000 reporters and public information officers what science news stories are most interesting to reporters, their supervisors, or news consumers (AAAS, 2006). The top science topic in the U.S. media is medicine and health, followed by stem cells and cloning, and psychology and neuroscience. In Europe, where national and local newspapers devote many more pages to covering science, topic number one is also medicine and health, followed by environment and climate change. Thus, a minimum statistical literacy in health would do journalists and their readers an excellent service.

Problems with the quality of press coverage, particularly in the reporting of health statistics about medical research, have been documented (Moynihan et al., 2000; Ransohoff & Harris, 1997; Rowe, Frewer, & Sjoberg, 2000; Schwartz, Woloshin, & Welch, 1999a). The most fundamental of these include failing to report any numbers, framing numbers in a nontransparent way to attract readers’ attention, and failing to report important cautions about study limitations.

No Numbers

http://www.dartmouth.edu/~chance/forwiki/table6.jpeg

As shown in Table 6, one disturbing problem with how the media report on new medications is the failure to provide quantitative data on how well the medications work. In the United States, Norway, and Canada, benefits were quantified in only 7%, 21%, and 20% of news stories about newly approved prescription medications, respectively. In place of data, many such news stories present anecdotes, often in the form of patients describing miraculous responses to a new drug. The situation is similar when it comes to the harms of medications: Typically less than half of stories name a specific side effect and even fewer
actually quantify it.

Nontransparent Numbers

Table 6 also demonstrates that when the benefits of a medication are quantified, they are commonly reported using only a relative risk reduction format without providing a base rate. Reporting relative risk reductions without clearly specifying the base rates is bad practice because it leads readers to overestimate the magnitude of the benefit. Consider one medication that lowers risk of disease from 20% to 10% and another that lowers it from 0.0002% to 0.0001%. Both yield a 50% relative risk reduction, yet they differ dramatically in clinical importance.

Sometimes there is another level of confusion: It is not clear whether a ‘‘percent lower’’ expression (e.g., ‘‘Drug X lowers the risk of heart attack by 10%’’) refers to a relative or an absolute risk reduction. To avoid this confusion, some writers express absolute risk reductions as ‘‘percentage points’’ (e.g., ‘‘Drug X reduced the risk of heart attack by 10 percentage points’’). This approach may be too subtle for many readers. The frequency format may make this distinction clearer (e.g., ‘‘For every 100 people who take drug X, 10 fewer will have a heart attack over 10 years’’). But the most important way to clarify risk reductions is to present the fundamental information about the absolute risks in each group (e.g., ‘‘Drug X lowered the risk of heart attack by 10 in 100: from 20 in 100 to 10 in 100 over 10 years’’).

Harms are mentioned in only about one third of reports on newly approved medications, and they are rarely if ever quantified. While benefits are often presented in a nontransparent format, harms are often stated in a way that minimizes their salience. This is most dramatic in direct-to-consumer advertisements, which often display the relative risk reduction from the medication in prominent, large letters (without the base rate), but present harms in long lists in very fine print. TV ads typically give consumers more time to absorb information about benefits (typically qualitative claims about the drug, like ‘‘It worked for me’’) than about side effects, resulting in better recall of purported benefits (Kaphingst, DeJong, Rudd, & Daltroy, 2004; Kaphingst, Rudd, DeJong, & Daltroy, 2005). A second technique is to report benefits in relative risks (big numbers) and harms in absolute risks (small numbers). This asymmetry magnifies benefits and minimizes harm. A simple solution (again) is to present both benefits and harms in the same format, in absolute risks.

No Cautions

All studies have limitations. If the press is to help the public understand the inherent uncertainties in medical research, they should state the major limitations and important caveats. Unfortunately, this happens only rarely. In a content analysis of the high-profile media coverage of research presented at five scientific meetings (Woloshin & Schwartz, 2006b), few stories included cautions about studies with inherent limitations. For example only 10% of stories about uncontrolled studies noted that it was impossible to know if the outcome really related to the exposure.

These problems are a result not only of journalists’ lack of proper training but also of press releases themselves, including those from medical schools. Press releases are the most direct way that medical journals communicate with the media, and ideally they provide journalists with an opportunity to get their facts right. Unfortunately, however, press releases suffer from many of the same problems noted above with media coverage of medical news (Woloshin & Schwartz, 2002). They often fail to quantify the main effect (35% of releases), present relative risks without base rates (45% of those reporting on differences between study groups), and make no note of study limitations (77%). Although medical journals work hard to ensure that articles represent study findings fairly and acknowledge important limitations, their hard work is hence partially undone by the time research findings reach the news media. Better press releases could change this, helping journalists write better stories.

A few newspapers have begun to promote correct and transparent reporting in place of confusion and sensationalism. And there are a number of efforts to teach journalists how to understand what the numbers mean. In Germany, for example, one of us (GG) has trained some 100 German science writers, and in the United States there are MIT’s Medical Evidence Boot Camp and the Medicine in the Media program sponsored by the National Institutes of Health and the Dartmouth Institute for Health Policy and Clinical Practice’s Center for Medicine and the Media (where two of us, LS and SW, teach journalists from around the world).
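The contrast drawn in the "Nontransparent Numbers" passage above, between two medications with the same relative risk reduction, can be checked with a few lines of Python. This is a sketch for illustration only: the function name `risk_summary` is invented here, and the number needed to treat (1 divided by the absolute risk reduction) is a standard companion measure that the excerpt itself does not mention.

```python
def risk_summary(baseline_risk, treated_risk):
    """Summarize a risk reduction in absolute and relative terms."""
    arr = baseline_risk - treated_risk   # absolute risk reduction
    rrr = arr / baseline_risk            # relative risk reduction
    nnt = 1.0 / arr                      # number needed to treat
    return {"arr": arr, "rrr": rrr, "nnt": nnt}


# The two hypothetical medications from the excerpt.
drug_a = risk_summary(0.20, 0.10)              # 20% -> 10%
drug_b = risk_summary(0.000002, 0.000001)      # 0.0002% -> 0.0001%

for name, s in (("A", drug_a), ("B", drug_b)):
    print(f"Drug {name}: relative reduction {s['rrr']:.0%}, "
          f"absolute reduction {s['arr']:.6%}, "
          f"about {s['nnt']:,.0f} treated per case avoided")
```

Both medications show a 50% relative reduction, but roughly 10 people must take the first to avoid one case, against roughly a million for the second.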

Too many data analyses spoil the design

Data, Not Design, Is King in the Age of Google, Miguel Helft, The New York Times, May 9, 2009.

If you are a web designer, it is very easy to get solid data on your design choices.

"The Web, of course, offers designers and innovators an unprecedented and powerful mechanism to test their ideas. They can mock something up, throw it up online, and get immediate feedback from users. Better yet, they can mock up multiple designs and test them quickly. Then, they can repeat the process until they home in on the design that seems to be most popular."

But one designer is rebelling against this data-oriented approach. Douglas Bowman has quit his job at Google because, in his view, the company leans too heavily on data.

Mr. Bowman's main complaint is that in Google's engineering-driven culture, data trumps everything else. When he would come up with a design decision, no matter how minute, he was asked to back it up with data. Before he could decide whether a line on a Web page should be three, four or five pixels wide, for example, he had to put up test versions of all three pages on the Web. Different groups of users would see different versions, and their clicking behavior, or the amount of time they spent on a page, would help pick a winner. 'Data eventually becomes a crutch for every decision, paralyzing the company and preventing it from making any daring design decisions,' Mr. Bowman wrote.

Google, of course, sees things differently.

"'We let the math and the data govern how things look and feel,' Marissa Mayer, the company’s vice president of search products and user experience, said in a recent television interview."

Some design experts agree with Mr. Bowman.

"'Getting virtually real-time feedback from users is incredibly powerful,' said Debra Dunn, an associate professor at the Stanford Institute of Design. 'But the feedback is not very rich in terms of the flavor, the texture and the nuance, which I think is a legitimate gripe among many designers.' Adhering too rigidly to a design philosophy guided by 'Web analytics,' Ms. Dunn said, 'makes it very difficult to take bold leaps.'"

There is also support for Google's perspective.

"It is hard to criticize the results of Google's data-centric approach. The company is hugely successful. If a certain hue of blue causes users to click on ads at even a marginally higher rate, it can translate into millions of dollars flowing to the company’s bottom line."

The article goes on to discuss why the customer is not always right. Customers do not always articulate clearly what they really want.

Everyone agrees, though, that some data collection is important.

"None of this means that input from users is unimportant. Indeed, Ms. Dunn, Mr. Brown and others say designers must find a multitude of ways to understand users’ needs at a deeper level. 'It is more from engaging with users, watching what they do, understanding their pain points, that you get big leaps in design,' Ms. Dunn said."

Questions

1. If the customer is not always right, what is the value of data collection?

2. What would be a good way to decide when to collect data and when to trust the instincts of the web designers?

3. Are there other areas besides web design where too much data collection can be a problem?
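For readers who want to see what such a test looks like in practice, the comparison described in the article, where different groups of users see different versions and their clicks pick a winner, can be sketched as a two-proportion z-test. This is one common way to compare two click rates, not necessarily what Google does, and the counts below are invented purely for illustration.

```python
import math


def two_proportion_z_test(clicks_a, views_a, clicks_b, views_b):
    """Return (z, two-sided p-value) for the click-rate difference B minus A."""
    rate_a = clicks_a / views_a
    rate_b = clicks_b / views_b
    # Pooled click rate under the null hypothesis of no difference.
    pooled = (clicks_a + clicks_b) / (views_a + views_b)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / views_a + 1.0 / views_b))
    z = (rate_b - rate_a) / se
    # Two-sided tail probability of a standard normal, via erfc.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


# Hypothetical counts: version A gets 1,000 clicks and version B 1,100,
# each from 100,000 page views.
z, p = two_proportion_z_test(1000, 100_000, 1100, 100_000)
print(f"z = {z:.2f}, two-sided p = {p:.3f}")
```

With three or more versions, as in the line-width example, running every pairwise comparison at the same threshold raises the chance of declaring a spurious winner, which bears on question 2.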

Submitted by Steve Simon

Sports Spending

NCAA report: College sports spending keeps skyrocketing

by Steve Wieberg and Steve Berkowitz, USA TODAY, April 29, 2009 [6]

According to a 2009 NCAA report, "Major college [athletics] programs increased their operating budgets by nearly 11% annually" over a 3-year span. The authors contrast this figure with a reported 4.9% average annual increase in universities' overall spending.

A 2005 version of the athletics spending study found

For every additional dollar spent, ... programs realized only an additional dollar in athletics revenue. And no correlation was found between increased spending and win-loss records.

The 2009 study estimates

An extra $1 million spent on football increases winning percentage by 1.8 percentage points and the chances of a top 25 finish in the Associated Press media poll by 5 percentage points .... In basketball, the study similarly finds "a significant relationship" between non-salary expenditures and both winning percentage and the probability of reaching the NCAA tournament.

The study finds no connection between coaches' salaries and winning. "The only category of spending that has a statistically significant effect on performance," the authors say, "is 'team expenditures'": recruiting, equipment and other "game-day expenses."

Submitted by Margaret Cibes

The Toughest Place to Win in Sports

by Darren Everson, The Wall Street Journal, March 4, 2009 [7]

The author reports that major men's college basketball teams have a winning percentage of .340 for away games.

Somewhere on Earth there may be a sport in which this figure is lower. But it isn't the NBA, NHL, American or Australian football, English or Argentine soccer, Major League Baseball, Japanese baseball, Dominican winter baseball, or any of two dozen other sports leagues.

According to Syracuse coach Jim Boeheim, "Only the top, top teams in the country can win on the road." The author cites a number of reasons for this, including the following.

One old theory is that this figure is skewed by all the games early in the year where the prominent schools invite weak opponents to their arenas .... But even if one ... counts only intraconference play, the resulting .380 road-winning figure is still below every other major U.S. team sport. In 2007, road teams in the Southeastern Conference lost 75% of their conference games – an only slightly worse record than this year's 68% rate, the second-highest of any Division I conference.

Another myth: that teams perform worse in enemy arenas because the unfamiliar sightlines affect their shooting. If that's true, then it would likely affect the women, too, who play in the same buildings. But the women's Division I road winning percentage this season is .378 – significantly above that of the men.

An "Interactive Graphic" link [8] provides road winning percentages for various sports leagues.

Submitted by Margaret Cibes