Chance News 21

From ChanceWiki

Quotations

I think you're begging the question, said Haydock, and I can see looming ahead one of those terrible exercises in probability where six men have white hats and six men have black hats and you have to work it out by mathematics how likely it is that the hats will get mixed up and in what proportion. If you start thinking about things like that, you would go round the bend. Let me assure you of that!

Agatha Christie
The Mirror Crack'd

From the Probability Web Quotations

Forsooth

The following two Forsooths are from the October RSS NEWS.


Long-term, serious smokers have a 50% chance of dying.


Guardian Weekend

1 April 2006, p32.


The IOC Coordination Commission were told that 80 per cent of the land had already been acquired. London Mayor Ken Livingstone added that he was hoping that, by the time the public enquiry starts at the end of next month, four-fifths of the land would have been acquired.


Radio Oxford news report

20 April 2006

A car talk puzzle revisited

In Chance News 20 we asked for the answer to the following car talk puzzle (Week of 08-23-04)

The bullet holes were all over the place on the R.A.F. planes -- in the wings and the fuselage, and seemingly distributed randomly on the undersides. So, where did the R.A.F. mathematician recommend extra armor, to save future missions?

Car talk gave the following answer:

A nameless mathematician crawled underneath the planes and looked at where the bullet holes were on the underside. They were all over the place as you might expect -- in the wings and the fuselage, and seemingly distributed randomly on the undersides. He studied hundreds of planes, took pictures, drew a number of sketches -- and then he made his recommendation.

His recommendation very simply was to armor plate the unhit areas that the returning planes had in common. When he surveyed the undersides of these planes, he noticed that there were a few spots that all of them had in common that had no bullet holes. And he had to assume that the ones that hadn't returned had bullet holes in those locations. They were in the English Channel someplace.

Reader Mike Cox wrote us that the mathematician was the famous statistician Abraham Wald. Once he pointed this out, we remembered that we had also discussed this in Chance News 7.05. This story is also told in "A selection of selection anomalies", Chance Magazine, Vol. 11, No. 2, Spring 1998, pp. 3-8, by Howard Wainer, Samuel Palmer, and Eric T. Bradlow, and in Howard's most recent book "Graphic Discovery: A Trout in the Milk and Other Visual Adventures". It is also discussed on the Digital Roan blog.

These sources give as references: "A Method of Estimating Plane Vulnerability Based on Damage of Survivors", Abraham Wald, Center for Naval Analyses, 423, July 1980, and "Abraham Wald's Work on Aircraft Survivability", Marc Mangel and Francisco J. Samaniego, Journal of the American Statistical Association, Vol. 79, No. 386, pp. 259-267.

We were surprised to find that neither of these references included the car talk story or the sketches promised in the car talk story. However, as the title suggests, Wald did discuss ways to assess the chance of a plane being shot down by hits at various parts of a plane, so it is plausible that he might have recommended armor plating only the unhit areas that the returning planes had in common. If anyone knows a reference for such a recommendation, we would ask that you send it to us.

Here is a brief sketch of what Wald does in the reference cited. In his first chapter Wald assumes that we have the following data concerning planes participating in combat:

  • The total number N of planes participating in combat.
  • For any integer i (i = 0,1,2,...) the number of planes that received exactly i hits but have not been downed, i.e., have returned from combat.

Then he assumes that the probability that a plane will be shot down does not depend on the number of previous non-destructive hits. Under this assumption Wald shows how to compute the probability that a plane is shot down by the i'th hit.

Later he divides the plane into several areas in order to compare their vulnerability. He considers the following example: of 400 planes on a bombing mission, 359 return. Of these, 240 were not hit, 68 had one hit, 29 had two hits, 12 had three hits, and 10 had four hits. He then shows that the probabilities of being downed by a single hit are given by the following table.

Part            Probability of being downed by a single hit
Entire plane    .15
Engines         .39
Fuselage        .05
Fuel system     .15
Other parts     .02
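The figure for the entire plane can be checked from the counts of returning planes. The sketch below is one way to formalize the constant-probability assumption described above, and it is not taken verbatim from Wald's memorandum. It assumes that every lost plane was hit at least once and that no plane receives more than four hits. If q is the probability of surviving a single hit and a_i is the fraction of all planes that returned with exactly i hits, then the fraction of planes that would have received exactly i hits is a_i / q^i, and these fractions must add up to 1 - a_0. The names `survival_equation` and `solve_q` are illustrative.

```python
# Returning planes by number of hits, from the example in the text (N = 400 sent out).
N = 400
returned = {0: 240, 1: 68, 2: 29, 3: 12, 4: 10}

def survival_equation(q):
    """Left side minus right side of sum_{i>=1} a_i / q**i = 1 - a_0."""
    a = {i: count / N for i, count in returned.items()}
    lhs = sum(a[i] / q**i for i in a if i >= 1)
    return lhs - (1 - a[0])

def solve_q(lo=0.01, hi=1.0, tol=1e-10):
    """Bisection: survival_equation is decreasing in q on (0, 1]."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if survival_equation(mid) > 0:
            lo = mid   # q is too small: the left side is still too large
        else:
            hi = mid
    return (lo + hi) / 2

q = solve_q()
p = 1 - q   # probability that a single hit downs the plane
print(f"q = {q:.3f}, p = {p:.3f}")
```

Under these assumptions the probability of being downed by a single hit comes out close to the .15 reported for the entire plane. The per-area figures in the table require the hits broken down by area, which are not reproduced here.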

Wald writes:

Thus for the observed data of this hypothetical example, the engine area is the most vulnerable in the sense that a hit there is most likely to down the plane. The fuselage has a relatively low vulnerability.

So the planes that were shot down were most likely hit in the engine area, and so it would make sense to reinforce this area, as suggested by the solution to the car talk problem. Wald does say about one of his vulnerability tables:

This can be used as guides for locating protection armor and can be used to make a prediction of the estimated loss of a future mission.

Estimating the diversity of dinosaurs

Proceedings of the National Academy of Sciences
Published online before print September 5, 2006
Steve C. Wang, and Peter Dodson

Fossil hunters told: Dig deeper
Philadelphia Inquirer, September 5, 2006
Tom Avril

Steve Wang is a statistician at Swarthmore College and Peter Dodson is a paleontologist at the University of Pennsylvania. Their study was widely reported in the media. You can find references to the media coverage and comments by Steve here.

In their paper the authors provided the description of their results quoted below. A few definitions might be helpful first: genera (plural of genus): groups of closely related species; nonavian: excluding birds; fossiliferous: containing fossils; rock outcrop: the part of a rock formation that appears above the surface of the surrounding land.

Despite current interest in estimating the diversity of fossil and extant groups, little effort has been devoted to estimating the diversity of dinosaurs. Here we estimate the diversity of nonavian dinosaurs at 1,850 genera, including those that remain to be discovered. With 527 genera currently described, at least 71% of dinosaur genera thus remain unknown. Although known diversity declined in the last stage of the Cretaceous, estimated diversity was steady, suggesting that dinosaurs as a whole were not in decline in the 10 million years before their ultimate extinction. We also show that known diversity is biased by the availability of fossiliferous rock outcrop. Finally, by using a logistic model, we predict that 75% of discoverable genera will be known within 60-100 years and 90% within 100-140 years. Because of nonrandom factors affecting the process of fossil discovery (which preclude the possibility of computing realistic confidence bounds), our estimate of diversity is likely to be a lower bound.

In this problem we have a sample of the dinosaurs that lived on the earth. These dinosaurs are classified into groups called genera, and we can count the number of specimens of each genus in our sample. From this we want to estimate the total number of genera of dinosaurs that have roamed the earth. Many different methods for doing this have been developed, and the authors of this study use one of the newer methods. We have discussed other examples of this problem in previous issues of Chance News, and it might help to describe these briefly.

One of the first statistical studies of species was carried out by R.A. Fisher and illustrated in terms of determining the number of species of Malayan butterflies. His method is described in the paper 'The Relation Between the Number of Species and the Number of Individuals in a Random Sample of an Animal Population', R.A. Fisher, A. Steven Corbet, and C.B. Williams, The Journal of Animal Ecology, Vol. 12, No. 1, pp. 42-58. (Available from JSTOR.)

Corbet provided the following data from his sampling of the Malayan butterflies:

n     observed   expected number
1     118        156.44
2     74         74.52
3     44         47.33
4     24         33.82
5     29         25.77
6     22         20.46
7     20         16.71
8     19         13.93
9     20         11.79
10    15         10.11
11    12         8.76
12    14         7.65
13    6          6.73
14    12         5.95
15    6          5.29
16    9          4.73
17    9          4.24
18    6          3.81
19    10         3.44
20    10         3.11
21    11         2.83
22    5          2.57
23    3          2.34
24    3          2.14

In this table n is the number of times a species occurs in the sample. The second column gives the number of species that occur n times in the sample. So we see that 118 species occurred once in the sample, 74 twice and 44 three times. The third column gives the expected number of species that occur n times using Fisher's model, which we will explain next. Thus the expected numbers for n = 1, 2, 3 are 156.44, 74.52 and 47.33.

Fisher's model assumes that the number of times n that a species occurs in a sample has a Poisson distribution:

<math>e^{-m}\frac{m^n}{n!} </math>

For a given species, m is the expected number of individuals of this species that will occur in a sample. Since m can be expected to vary among the species, Fisher treats it as a random variable. He chooses a distribution for m that leads him to the following estimate of the expected number of species that appear n times in a random sample:

<math>\frac{\alpha}{n} x^n</math>

Here <math>\alpha</math> and x are parameters. If S is the number of species observed and N is the sample size, then <math>\alpha</math> and x can be determined as the values that satisfy the following two equations:

<math> S = -\alpha \log (1-x), \quad N = \alpha x/(1-x)</math>

From our data we find that S = 501 and N = 3306. Using these values we find that x = .95268 and <math>\alpha = 164.21.</math> These do not agree with the values obtained by the authors but we believe them to be correct.
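The calculation can be reproduced from the observed column of the table. The sketch below is a minimal illustration, not Fisher's own procedure: it eliminates <math>\alpha</math> by dividing the two equations, which gives S/N = -(1 - x) log(1 - x) / x, and solves that single equation for x by bisection. The function names `ratio`, `solve_x` and `expected` are illustrative.

```python
import math

# Observed column of the table: observed[n - 1] species occurred exactly n times.
observed = [118, 74, 44, 24, 29, 22, 20, 19, 20, 15, 12, 14,
            6, 12, 6, 9, 9, 6, 10, 10, 11, 5, 3, 3]

S = sum(observed)                                                 # species observed
N = sum(n * count for n, count in enumerate(observed, start=1))   # individuals sampled

def ratio(x):
    """S/N implied by the two equations: -(1 - x) * log(1 - x) / x."""
    return -(1 - x) * math.log(1 - x) / x

def solve_x(target, lo=1e-9, hi=1 - 1e-12, tol=1e-12):
    """Bisection; ratio(x) decreases from 1 towards 0 as x goes from 0 to 1."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if ratio(mid) > target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

x = solve_x(S / N)
alpha = -S / math.log(1 - x)

def expected(n):
    """Expected number of species occurring n times: alpha * x**n / n."""
    return alpha * x**n / n

print(S, N, round(x, 5), round(alpha, 2))
print([round(expected(n), 2) for n in (1, 2, 3)])
```

The totals S and N match those quoted above, and the fitted values agree with the quoted x and <math>\alpha</math>, and with the expected column of the table, up to rounding.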

Fisher was interested in finding a distribution that could approximate the distribution of the number of times a species occurs in a sample, and the distribution that he proposed has been widely used in species studies. Another interesting question is whether the total number of species of Malayan butterflies can be estimated from a sample. This is the kind of question that Wang and Dodson addressed for dinosaur genera in their study. Among the first to tackle this problem were I. J. Good and G. H. Toulmin in their paper "The Number of New Species, and the Increase in Population Coverage, when a Sample is Increased", Biometrika, Vol. 43 (June 1956), pp. 45-63.

To be continued

Probability theory is not all that useful

Don’t box yourself in when making decisions, John Kay, Financial Times, 22 August 2006.

In this article, John Kay, a weekly columnist for the Financial Times, outlines a variation on the Monty Hall problem to highlight that human minds are not well adapted to dealing with issues of probability.

Suppose there are only two boxes and one contains twice as much money as the other. When you choose one, you are shown that it contains £100. Will you stick with your original choice, or switch to the other box?

Kay shows that it is easy to apply this problem to real situations:

Anyone who has changed jobs, bought a house or planned a merger has encountered a version of the two-box game; keep what you know, or go for an uncertain alternative.

In this game, players can lose only £50 but might gain £100 and they have no way of judging whether the £50 loss is more or less likely than the £100 gain.

Decision theory predicts an expected gain of £25 from an equal chance of winning £100 or losing £50. But many people dislike the prospect of losing £50 more than they like the prospect of gaining £100. Reflecting his economic background, Kay goes on to speculate that this irrationality may explain why the equity premium in finance is so high – volatile assets need to show much higher returns to compensate for the pain of frequently seeing small losses.

Kay then outlines what he calls the 'fallacy of large numbers'

If you accepted 100 gambles like this, you are virtually certain to end up with a substantial gain. But, you may say, I am not playing this game 100 times. I am only playing it once and you cannot guarantee a gain in a single trial. That is true, but it illustrates “the fallacy of large numbers”. On the 100th trial, you are in the same position as someone who is offered the chance to do it once. So you should not do it the 100th time. But then you should not do it the 99th time, or the 98th – or the first.
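The arithmetic behind the expected gain and the claim about 100 gambles can be sketched as follows. The calculation takes the equal chance of winning and losing at face value, which is exactly the assumption Kay questions, and it treats the gambles as independent. The name `prob_net_gain` is illustrative.

```python
from math import comb

WIN, LOSS, P_WIN = 100, 50, 0.5   # P_WIN = 0.5 is the assumed "equal chance"

expected_gain = P_WIN * WIN - (1 - P_WIN) * LOSS

def prob_net_gain(n_gambles):
    """Probability that total winnings are positive after n independent gambles."""
    total = 0.0
    for wins in range(n_gambles + 1):
        net = WIN * wins - LOSS * (n_gambles - wins)
        if net > 0:
            total += comb(n_gambles, wins) * P_WIN**wins * (1 - P_WIN)**(n_gambles - wins)
    return total

print(expected_gain)        # 25.0
print(prob_net_gain(1))     # 0.5
print(prob_net_gain(100))   # very close to 1
```

With 100 gambles a net gain requires at least 34 wins, which under these assumptions happens with probability above 0.999. A single gamble still loses half the time.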

He concludes that probability theory works well for a limited class of problems, but the real world is much more open-ended and there is usually fundamental uncertainty about both the nature of the outcomes and the process that gives rise to them.

Questions

  • Kay claims that 'the message of both the original Monty Hall problem and of this one is that, even in very simple cases, it is impossible to be certain that a particular mathematical representation of a real problem is a correct description. ... For people in business who rely on models and for people in financial services who must choose between boxes with uncertain contents every day, that is a disturbing conclusion.' Do you agree with his views? If yes, what might it imply for the teaching of probability?
  • Do you agree with Kay's comment that many people dislike the prospect of losing £50 more than they like the prospect of gaining £100? This touches on the topic of behavioural economics, which has become a popular topic for study in recent years, especially since Daniel Kahneman was awarded a Nobel prize in economics 'for having integrated insights from psychological research into economic science, especially concerning human judgment and decision-making under uncertainty'. Does this comment help to resolve this paradox or is it irrelevant?
  • Repeating the game many times increases the chance of achieving a positive payoff. What hidden assumptions are being made here? (If a few more zeros were added to the payoffs, would your attitude be different?) Is there any merit in his 'fallacy of large numbers'? What is it about the 99th attempt that makes it different from the first attempt?
  • To benefit from the irrationality of the equity premium, Kay suggests that you stop looking at share prices so often, so that, in the long run, you get the benefit of the higher return without the pain of observing volatility. Is there any merit in this suggestion? Does it suggest that rational investors, facing the same problem but with different time horizons, can logically reach very different conclusions?

Further reading

  • The topic of this article (and the next one) is often referred to as the envelope paradox, versions of which have been discussed in previous issues of Chance News (http://www.dartmouth.edu), going back to 1992. For example, Chance News 9.09 says 'A much more mysterious envelope paradox is the following: I put two distinct numbers a and b in an envelope. You pick one of the numbers at random. Show that you can decide if you have the bigger or smaller number with a probability greater than 1/2 of being correct.' The envelope paradoxes are discussed in Chapter 4, examples 4.28 and 4.29, of the on-line probability book Introduction to Probability, Charles M. Grinstead and J. Laurie Snell, American Mathematical Society.
  • There are several webpages devoted to this difficult paradox. For example, Amos Storkey offers a Bayesian solution and logician Raymond Smullyan claims that the paradox can be restated without involving probability in his book Satan, Cantor and Infinity and Other Mind-boggling Puzzles. He does this by proving two contradictory propositions:
    • "Proposition 1. The amount that you will gain, if you do gain, is greater than the amount you will lose, if you do lose.
    • "Proposition 2. The amounts are the same."

Submitted by John Gavin.

Intuition is better than maths

The maths may be simple but intuition is more use, John Kay, Financial Times 29 August 2006.

The two-box problem (see the 'Probability theory is not all that useful' article above) offers you a choice of two boxes, one containing more money than the other. Once you have made a decision, you are shown what is in your preferred box. Do you stick with your original choice, or switch?

Kay claims a knowledge of probabilities seems to be a hindrance rather than a help because, with no other information, it seems no rational decision can be made. In this follow-up article, Kay advocates a strategy that seems better than always switching or always sticking, one that beats random choice even in a situation of almost total ignorance.

Before the game starts, focus on a sum of money. It does not matter what the amount is – say, £100. The 'threshold strategy' is to switch if the box you choose contains less than £100 and to stick if it contains more. The threshold strategy gives you a better-than-even chance of getting the larger sum. It does so for any value of the threshold you choose.

If both boxes have less than £100, or more than £100, then the probability that you get the larger sum from your random choice remains one-half. But if one box has less than £100 and the other has more than £100, adopting the threshold strategy makes sure you get the larger sum. Since there is at least a possibility that the amounts in the boxes lie in this range, the threshold strategy must increase your chance of winning.
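The case analysis above can be checked by enumerating the two equally likely first picks. This is a minimal sketch; the function name `win_probability` and the example amounts are illustrative, and an opened amount exactly equal to the threshold is treated as a reason to stick, a case the article does not specify.

```python
def win_probability(small, large, threshold):
    """Chance of ending with the larger sum when the first box is picked at random.

    Rule: switch if the opened box holds less than the threshold, otherwise stick.
    """
    wins = 0
    for opened, other in ((small, large), (large, small)):
        final = other if opened < threshold else opened
        if final == large:
            wins += 1
    return wins / 2

print(win_probability(20, 40, 100))     # both below the threshold: 0.5
print(win_probability(200, 400, 100))   # both above the threshold: 0.5
print(win_probability(60, 120, 100))    # threshold lies between them: 1.0
```

The overall chance of winning is an average of these cases, so it exceeds one-half whenever there is a positive probability that the threshold falls between the two amounts.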

Kay suggests that there is little benefit to choosing a threshold that is very high or very low, claiming that if you have some idea, however vague, about the range of possible contents, you can tweak the threshold strategy to your advantage. He also suggests choosing a threshold in the range of sums of money that would make a real difference to you. For example, if £20,000 would not transform your life but £50,000 would, then go for £50,000.

While admitting that the logic behind this solution is not straightforward, he claims that, intuitively, it makes sense. In real-life problems, he says, we typically adopt the principle of being realistic while looking for something that will make a difference.

In the version of the two-box problem, in the previous Chance article, where you had the additional information that one box contained twice as much money as the other, there seemed always to be an argument for switching; the potential gain is always twice the seemingly equally likely potential loss. But this conclusion is wrong. Once more, intuition runs ahead of our mathematical understanding, he concludes.
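One common way to see why the always-switch argument fails is to fix the pair of amounts first and only then pick a box at random. The sketch below illustrates that standard observation and is not necessarily the argument Kay has in mind; the function name `expected_payoffs` and the amount used are illustrative.

```python
from fractions import Fraction

def expected_payoffs(small):
    """Expected payoff of always sticking vs always switching for boxes (small, 2 * small)."""
    large = 2 * small
    half = Fraction(1, 2)
    stick = half * small + half * large    # keep whichever box was opened
    switch = half * large + half * small   # always take the other box
    return stick, switch

stick, switch = expected_payoffs(100)
print(stick, switch)   # 150 150
```

For any fixed pair the two strategies have the same expected payoff, so the apparent advantage of switching must come from treating the £50 loss and the £100 gain as equally likely after the opened amount is known.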

Questions

  • Do you agree with the solution outlined in this article?
  • Do you think Kay is right to be so skeptical about probability?

Submitted by John Gavin

You're Only as Old As You Think You Are

Gina Kolata is the New York Times health writer and by all accounts is at least several decades under 70 but that doesn't stop her from writing about geriatrics. Her October 5, 2006 article is entitled "Old but Not Frail: A Matter of Heart and Head." According to her experts, "you're only as old as you think you are. Rigorous studies are now showing that seeing, or hearing, gloomy nostrums about what it is like to be old can make people walk more slowly, hear and remember less well, and even affect their cardiovascular systems. Positive images of aging have the opposite effects. The constant message that old people are expected to be slow and weak and forgetful is not a reason for the full-blown frailty syndrome. But it may help push people along that path."

An epidemiologist at the National Institute on Aging, Eleanor Simonsick, "recruited 3,075 apparently healthy people in their 70's who said they could walk a quarter of a mile with no trouble and climb a flight of stairs. Each was asked to walk up and down a corridor 10 times, for a distance of a quarter mile, maintaining their pace and not stopping to rest." Although the average age of those who could and those who could not were identically 73, "A quarter of them could not do it." By the end of two years, "a third of the group that could walk the quarter mile said they were beginning to have difficulty."

These depressing results for those of us over 70 may be found in the May 3, 2006 issue of The Journal of the American Medical Association; "being unable to walk a quarter mile within five minutes portended troubles. For each minute beyond five, the risk of dying in the next four years increased by a third, the risk of having a heart attack increased by 20 percent, and the risk of having a disability increased by half." Further, "Those who took more than six minutes for the quarter-mile walk had the same risk of dying or having heart attack as those who could not walk the distance at all, and the effect was independent of age."

Despite beginning and ending her article with a focus on how elderly people's physical health depends on what others think of them, Kolata then does an about-face by posing a cause and effect question relating physical exercise: rather than frailty causing the inability to walk, "Could teaching people to walk further and faster prevent their growing so weak they could hardly walk?" A director of the National Institute on Aging claims that such a program to make more people mobile would save billions of dollars.

Discussion

1. Reconcile the following two phrases: (1) You are only as old as you think you are, and (2) Act your age.

2. A professor at the University of Pittsburgh states, " I would say all 100-year-old people are frail. Most 90-year-olds are frail. And some 80-year-olds are frail." Speculate on the solidness of her data.

3. What physical aspects of your life depend on your image as seen by others?

4. In your view, does frailty cause inability to walk, or does inability to walk cause frailty? Give some other examples of cause and effect where it is not clear which is the cause and which is the effect.

Submitted by Paul Alper

Conflicts of Interest

An aspect of statistical literacy that doesn't get enough appreciation is that of conflict of interest; it leads to surprising if not downright unbelievable results. The Washington Post had two articles which illustrate the concept. Shankar Vedantam on October 3, 2006 reported that "Schizophrenia patients do as well, or perhaps even better, on older psychiatric drugs compared with newer and far costlier medications" according to a study involving 227 randomly assigned patients reported in the Archives of General Psychiatry. This study was funded by the British government, as opposed to the many other studies funded by the pharmaceutical industry which, of course, has a vested interest along with an alarmingly aggressive marketing arm.

Peter Jones is the leader of the government study; when searching for the appropriate word to characterize the lack of superiority of more expensive, newer drugs as portrayed in previous studies, he said, "'Duped' is not right...We were beguiled" by the previous studies. Jones further reflected on the importance of trusting data, rather than one's own judgment: "Sometimes the compass tells you go straight in front of you, but you somehow know it is wrong and that north is behind you...I have learned to follow the compass."

Annys Shin on October 5, 2006 in her column, The Checkout, deals with The Salt Conspiracy. "We've all been told to be dubious of studies paid for by Big Tobacco. Lately, we've learned it's hard to find an expert on depression or schizophrenia who isn't getting paid by Big Pharma. And now, CSPI [Center for Science in the Public Interest] is telling us, we have Big Salt." The Journal of the College of Nutrition had a supplement a few months previous claiming that salt is not bad for you, an argument that astonished many. It turns out that the editor of the supplement, Dr. Alexander G. Logan, is, as alleged by CSPI, a "paid consultant to the salt industry" and failed to disclose his industry connections. "The articles were also not peer reviewed." In an epilogue to Shin's article, an irate email from Logan to Shin states "I hold no ownership in any business related to the food or salt industries. I do not accept fees or honoraria for any consultative advice that I provide related to nutrition or salt."

A common conflict of interest exists at Division 1A universities. There the clash is between the academic and the athletic. The result is the so-called "student athlete." Naturally, many of these athletes should never be in college. "After high school, I thought I was done with school," said Gary Russell, a star running back with the University of Minnesota who flunked out of the university. According to the Minneapolis Star Tribune of October 2, 2006, "He earned only 28 credits out of a possible 51 and had a cumulative grade-point average of 1.829." This overstates his GPA because "He did receive three A's in Football coaching, Weight Training and Beginning Tennis." This is an excerpt from a much larger article which indicates that among the Big Ten schools, the University of Minnesota's "Graduation Success Rate" or GSR is last not only in football, but also in "men's basketball, women's basketball, men's cross country/track, men's golf, men's hockey and softball." Despite this dismal result, the head football coach and the men's athletic director each recently received a large raise in pay. The base salary for the latter is now $315 thousand per year and for the former, $1.65 million per year; without resorting to any in-depth research, we can say that most statisticians earn far less. Those numbers do not include the usual perks bestowed on coaches and athletic directors. And, oh yes, the legislature voted this past session to build a new football stadium for its beloved Gophers. Naming rights belong to a local bank which, in return for its $35 million contribution, gets to monopolize ATM devices on campus along with the privilege of accessing the student database.

Discussion

1. The pharmaceutical industry has an obvious axe to grind. Does the British Government? The U.S. Food and Drug Administration?

2. The history of salt is fascinating and only in the modern era is salt inexpensive and prevalent. Mark Kurlansky's Salt: A World History makes for a good read.

3. Dr. Logan vigorously denies any monetary connection with the salt industry. Do you tend to side with him or with CSPI?

4. Are you surprised that Russell received the three As mentioned? What effect on the football team's performance will result from a new stadium?

Submitted by Paul Alper

The odds of dying

What are the odds of dying?, (US) The National Safety Council website.
Ways to Go, National Geographic, Siobhan Roth, August 2006.

Recent statistics from the National Safety Council show the most likely ways to die over the lifetime of a US resident. The website's objective is to answer questions commonly asked by the media, such as "What are the odds of being killed by lightning?" or "What are the chances of dying in a plane crash?" It includes common causes of death like car accidents as well as large-scale catastrophes such as earthquakes.

For example, motorbike riding is more risky than playing with fireworks.

National Geographic magazine's August 2006 issue features this Ways to Go chart based on the National Safety Council's Odds of Dying statistics.

Questions

  • The National Safety Council website presents the data in tabular as well as graphical form. Which form of presentation do you think makes it easier to interpret the data?
  • In the previous edition of Chance News, a similar article, Risk perceptions, ranked different ways of dying in tabular form. Is a ranked table easier to read? Is it always more appropriate for this kind of data?
  • In the graph on the left, the probabilities appear to be proportional to the area of the circles. Is a two-dimensional, area-based scale appropriate for displaying one-dimensional data like probabilities? Why use circles?
  • The National Safety Council website states that these odds are statistical averages over the whole U.S. population and do not necessarily reflect the chances of death for a particular person from a particular external cause. What factors might cause an individual's odds to deviate from the average?
  • The odds are based on data from 2003. Why might the odds change more over time for some categories than for others?

Submitted by John Gavin.

Death and taxes - 2007 edition

Death and taxes - 2007 edition, by Jesse Bachman, thebudgetgraph.com, 2006.
Amazing graph of how US income tax gets spent, BoingBoing website, Sep 18, 2006.

Continuing from the previous article with the theme of visualising data: an on-line, interactive graph neatly visualises how the US federal government spends its income taxes.

Death and Taxes - 2007, by Jesse Bachman.

Clicking on this image will open a larger but static version of the graph. The original, interactive website version allows users to zoom in on different parts and to pan around the graph to see it at full resolution, which isn't possible with the static graphs associated with this article.

The data is broken into its constituent parts using a tree (or spider map) layout, with the area of each node on the tree representing the amount for each expenditure item.

Jesse Bachman, the graph's designer, says

the graph is the star here, it puts things into perspective. Its purpose is to generate questions.

Following comments from the public, the author later supplemented the original Death and Taxes graph with a simpler graph showing mandatory spending as well, which adds some perspective.

Graphic of the total budget emphasising the entitlement programs which are excluded from the 'Death and Taxes' graph, by Jesse Bachman.

Questions

  • Assuming you can work with the on-line version in order to zoom in, does the graph put the data into perspective, especially the complexity and inter-connectedness of the various items? For example, how does the national debt compare to other items? Does the total budget shown in the second graph offer a more informative perspective because it is less granular than the first graph?
  • Is the area of each node proportional to the magnitude of that item? What alternative relationships might be used to better visualise this information?
  • Do the various images within the graph improve its appearance or distract from the underlying data? Can you think of other ways to further improve the graph?
  • How well do you think the graph is researched and justified via supporting materials? What additional material would you like to see?
  • Can you find your favourite government department easily, such as the one responsible for spending on science research? For example, why is the National Science Foundation shown under military or national security related spending?

Further reading

  • You can buy a poster ($24 plus $4.95 shipping to anywhere in the US) or send one to the US congress for ten dollars.

Submitted by John Gavin.

Alzheimer's

The medical moral of the story is simple: stay healthy so you don't need drugs. The statistical moral of the study is two-fold: (1) go to the original source instead of relying on what journalists write and (2) make sure that the right experiment is being done. The New England Journal of Medicine article of October 12, 2006 (Volume 355: 1525-1538 Number 15) entitled "Effectiveness of Atypical Antipsychotic Drugs in Patients with Alzheimer's Disease" received a fair amount of puzzling press review. The AP dispatch said "Symptoms did improve in about 30 percent of patients taking the drugs, as well as in 21 percent of those getting a placebo, partly because symptoms can naturally wax and wane." With 421 patients in the study and ignoring multiple comparisons (three drugs and a placebo), a quick calculation yields a p-value of .032 which is less than the magical .05 so it would seem that statistical significance was achieved. Yet, regarding clinical significance, the AP headline was "Anti-psychotic drugs don't ease dementia." Likewise, the headline for HealthDay's Amanda Gardner says "Antipsychotics No Better Than Placebo For Alzheimer's Patients." The headline for Denise Gellene's article in the Los Angeles Times was "Alzheimer's antipsychotic drugs 'are not the answer for most,' report says."

It turns out that while 421 patients were enrolled at 42 different sites (!), 82% quit their medications due to lack of efficacy, intolerability and other reasons; in short, these powerful drugs have unpleasant side effects such as death. According to an accompanying editorial in the NEJM by Jason Karlawish, this clinical trial is the right experiment, in contrast to previous clinical trials involving drugs for Alzheimer's. Those studies have outcomes which "are measured at multiple time points," accruing "a large volume of data." These approaches increase "the likelihood that any change measured will be statistically, though not necessarily clinically, significant. Clearly, these designs are particularly valuable to companies seeking an FDA label or reprints of reports to distribute to clinicians."

Karlawish writes, "The primary end point in the study by Schneider et al. is an accurate reflection of a clinical event: the decision to change treatment because the patient's condition is worsening or not improving sufficiently."


Discussion

1. The NEJM article had 13 authors, 12 of whom listed "potential conflicts of interest." The only one not listed as having a potential conflict of interest was the person without an M.D. or Ph.D. degree. Speculate what this implies.

2. Define statistical significance. Define clinical significance. Does one imply the other?

3. Despite the negative results of this clinical trial, because some patients benefited, the lead author said, "The message from this study cannot be that the drugs are not useful." Is this a scientific or a religious belief?

4. All of the above refers to "Phase I." "Phase II" was not reported on in this article. In Phase II, those patients still willing to continue were to be randomly assigned "under double-blind conditions to receive one of the antipsychotic drugs to which they were not initially assigned" or to receive a drug not studied in Phase I. Why is this a good idea statistically? Why might this not be such a good idea for a patient?

Submitted by Paul Alper.

The human cost of war

Calculating casualties - The human cost of war in Iraq, The Economist, Oct 12th 2006.

A recent statistical study in the Lancet, a UK medical journal, claims that many more Iraqis have died in recent years than was previously thought: over 650,000 excess deaths. This is a follow-up study to a 2004 Lancet article on the same topic that was discussed in Chance News in 2004.

The Economist article mentions various techniques for counting deaths during a war: hospital death data, mortuary tallies and media reports, each producing different results. The Lancet study claims that the true figure is much higher than estimated by these other techniques. The Economist summarises the Lancet article's technique as one based on random sampling:

selecting small numbers of people at random allows statisticians to say something about the whole population. ... there is no reliable census data in Iraq, truly random selection is impossible because the list from which this would take place is incomplete. So the researchers used a technique that is called clustering.

The Economist article says that the researchers picked 47 neighbourhoods at random, each containing almost 40 households, and then surveyed all the people living in them. They asked people whether anyone had died since January 1st 2002 and what had caused the death. They also asked to see death certificates, and 92% of the reported deaths were certified.

They extrapolated the figures to reflect the national picture, saying Iraq's death rate had more than doubled since the invasion. By comparing the death rate before the fall of Saddam Hussein (5.5 per 1,000 people a year) to the latest figures (13.3 per 1,000 people a year), the authors claim an excess 650,000 deaths as a result of the invasion, which is 2.5 percent of the Iraqi population. The Economist cautions readers that data from individuals within a cluster are highly correlated, so the 650,000 lies in the middle of a wide range: 390,000 to 943,000. The lead author suggested the media should not get too focused on the 655,000 number.
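As a rough check on how the quoted figures fit together, the extrapolation can be sketched as back-of-envelope arithmetic. This is not the Lancet authors' method, which works from the survey data itself. The population is implied by the figures above (650,000 being 2.5 percent), while the length of the post-invasion period (about 40 months, March 2003 to mid 2006) is an assumption not stated in the text.

```python
# Figures quoted in the text above
rate_before = 5.5 / 1000    # deaths per person per year, pre-invasion
rate_after = 13.3 / 1000    # deaths per person per year, post-invasion
quoted_excess = 650_000
quoted_share = 0.025        # 2.5 percent of the population

# Population implied by the two quoted figures (about 26 million)
implied_population = quoted_excess / quoted_share

# Excess death rate attributed to the invasion period
excess_rate = rate_after - rate_before
rate_ratio = rate_after / rate_before   # "more than doubled"

# ASSUMPTION: roughly 40 months between the invasion and the survey.
# This duration is not given in the text above.
years_post_invasion = 40 / 12

rough_excess = excess_rate * implied_population * years_post_invasion
print(f"implied population: {implied_population:,.0f}")
print(f"rate ratio: {rate_ratio:.2f}")
print(f"rough excess deaths: {rough_excess:,.0f}")
```

The rough figure lands in the same range as the quoted 650,000, which suggests the rates, the percentage and the total are at least mutually consistent. It says nothing about whether the sampled rates themselves are accurate.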

US President George Bush quickly dismissed the findings.

I don't consider it a credible report. Neither does Gen. (George) Casey (referring to the top ranking U.S. military official in Iraq) and neither do Iraqi officials.
The methodology is pretty well discredited.

The lead author defended his team's methodology, saying it was the standard used in developing countries to survey for HIV and other major health issues.

Questions

  • The survey teams consisted of two male and two female medical doctors fluent in Arabic and English. Does this add to the credibility of the figures? The results were rejected by the US president and one of his generals. Does that detract from the credibility of the figures?
  • The Lancet's figures could be cross-checked by comparing them to other public surveillance mechanisms. For example, UN reports sum Baghdad morgue and Ministry of Health data. This suggests that fewer than a tenth of the Iraqis who were killed or injured, according to the Lancet paper, were noticed by other sources. Do you think that this degree of undercounting is credible? How might the difference in estimates be explained? Is it plausible that biases in the Lancet survey methodology, in the competing data sources, or in both could account for the magnitude of the differences in the estimates?

Further reading