Wednesday, November 3, 2010

Are the Statistics Deceptive? Reading Things Critically

How much should you be persuaded by the following passage? The adjective "Kafkaesque" exists in more than 250 different languages, suggesting that Kafka's work has had a major impact all over the world.

You should not be very impressed by the above reasoning. The argument deceives us with statistics!
One of the most frequent kinds of evidence that authors present is "statistics." You have probably often heard people use the following phrase to help support their argument: "I have statistics to prove it." We use statistics (often inappropriately) to rate the performance of a new movie, to measure the sales of a new product, to judge the moneymaking capabilities of certain stocks, to determine the likelihood of the next card's being the ace, to measure graduation rates for different colleges, to measure alcohol content in a given beverage, to record frequency of different groups' having sex, and to provide input for many other issues.

Statistics are evidence expressed as numbers. Such evidence can seem quite impressive because numbers make evidence appear to be very scientific and precise, as though it represents "the facts." Statistics, however, can, and often do, lie! They do not necessarily prove what they appear to prove.

As a critical thinker, you should strive to detect erroneous statistical reasoning. In a few short paragraphs, we cannot show you all the different ways that people can "lie with statistics." However, this chapter will provide some general strategies that you can use to detect such deception. In addition, it will alert you to flaws in statistical reasoning by illustrating a number of the most common ways that authors misuse statistical evidence.

Question: Are the statistics deceptive?
Unknowable and Biased Statistics
The first strategy for locating deceptive statistics is to try to find out as much as you can about how the statistics were obtained. Can we know precisely the number of people in the United States who cheat on their taxes, have premarital sex, drink and drive, run red lights, use illegal drugs, hit a car in a parking lot and leave without informing anyone, buy pornographic material, fail to rewind a video before returning it to the video store, or illegally download music? We suspect not. Why? Because there are a variety of obstacles to getting accurate statistics for certain purposes, including unwillingness to provide truthful information, failure to report events, and physical barriers to observing events. Consequently, statistics are often in the form of "educated guesses." Such estimates can be quite useful; they can also be quite deceiving. Always ask, "How did the author arrive at the estimate?"


Confusing Averages
Examine the following statements:

  1. One way to make money fast is to become a professional golfer. The average professional golfer made $874,840.23 in tournament earnings alone in 2004.
  2. There is no reason to worry about the new nuclear power plant's being built in our city; the average amount of harm caused by nuclear accidents is rather low.


Both examples use the word "average." But there are three different ways to determine an average, and in most cases each will give you a different average. What are the three ways? One is to add all the values and divide this total by the number of values used. The result is the mean. A second way is to list all the values from highest to lowest, then find the one in the middle. This middle value is the median. Half of the values will be above the median; half will be below it. A third way is to list all the values and then count each different value or range of values. The value that appears most frequently is called the mode, the third kind of average.
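The three averages described above can be sketched in a few lines of Python. This is only an illustration: the earnings list is invented for the example and is not the real 2004 data, and the variable names are our own.

```python
from statistics import mean, median, mode

# Hypothetical tournament earnings in dollars (invented for illustration;
# these are not the real 2004 figures).
earnings = [50_000, 50_000, 120_000, 300_000, 560_000, 900_000, 10_900_000]

mean_earnings = mean(earnings)      # add all values, divide by how many there are
median_earnings = median(earnings)  # middle value once the list is sorted
mode_earnings = mode(earnings)      # value that appears most often

print(f"mean:   ${mean_earnings:,.0f}")
print(f"median: ${median_earnings:,.0f}")
print(f"mode:   ${mode_earnings:,.0f}")

# Count how many golfers actually earned more than the mean.
above_mean = sum(1 for e in earnings if e > mean_earnings)
print(f"{above_mean} of {len(earnings)} golfers earned more than the mean")
```

With these made-up numbers, a single very large value pulls the mean far above both the median and the mode, so the same list can honestly be described by three very different "averages."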


It makes a big difference whether a writer is talking about the mean, median, or mode. Think about the distribution of winnings in any professional sport. Some individuals win extremely high amounts from tournaments, and these people tend to win many tournaments. Such high winnings will increase the mean dramatically. They will have little effect, however, on either the median or the mode. Thus, if one wishes to make the average winnings seem high in this situation, the mean is probably the best average to present. For example, the highest golf winnings in 2004 were $10,905,166, high enough to skew the mean substantially. The median winnings in 2004 were $566,472, which is much lower than the mean. In fact, only slightly over one-third of the golfers won sums above the mean.

You should now be able to see how important it is to know which average is used when people talk about salaries or income.

Now, let's look carefully at example (2). If the average presented is either the mode or the median, we may be tricked into a false sense of security. For example, it is possible that many "small" accidents happen, causing very little damage. However, we would also want to know about larger accidents. How much damage is caused by these larger accidents, and how frequently do these larger accidents occur? These are all questions we would want to have answered before we feel secure with the new nuclear power plant. If there are a few very large accidents, but most are rather minor, the mode and the median nuclear accident values could be quite low, but the mean would be very high.

When you see "average" values, always ask: "Does it matter whether it is the mean, the median, or the mode?" To answer this question, consider how using the various meanings of average might change the significance of the information.

Not only is it important to determine whether an average is a mean, median, or mode, but it is often also important to determine the gap between the smallest and largest values—the range—and how frequently each of the values occurs—the distribution. For example, assume that you are at a casino and are trying to figure out which slot machine to play. Would you be satisfied with information about the average payout for each machine? We wouldn't.

We would want to know the range of payout, that is, the highest and the lowest cash winnings as well as the frequency of the different levels. The average might seem impressive, but if 15 percent of people end up losing all of their money without winning once, we suspect that you would rather do something else with your money. Also, if the frequency of payoff of high money amounts is very low, you might second-guess your choice of slot machine.
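To see why the range and distribution matter, here is a small Python sketch comparing two slot machines with the same mean payout. The payout lists and the `summarize` helper are hypothetical and exist only for this illustration.

```python
from collections import Counter
from statistics import mean

# Hypothetical payouts in dollars for ten plays on each machine.
machine_a = [0, 0, 0, 0, 0, 0, 0, 0, 0, 100]
machine_b = [8, 9, 10, 10, 10, 10, 10, 11, 11, 11]

def summarize(payouts):
    """Return the mean, the range, and the distribution of a list of payouts."""
    return {
        "mean": mean(payouts),
        "range": (min(payouts), max(payouts)),
        "share_paying_nothing": payouts.count(0) / len(payouts),
        "distribution": Counter(payouts),  # how often each payout occurs
    }

summary_a = summarize(machine_a)
summary_b = summarize(machine_b)

for name, summary in (("A", summary_a), ("B", summary_b)):
    print(name, summary)
```

Both machines report the same average, yet on machine A most plays pay nothing at all. The average alone hides that difference; the range and the frequency counts reveal it.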



Let's consider another example in which knowing the range and distribution would be important.

Engaging in premarital sex is not as dangerous as many people would have you believe. Nationwide, fewer than eight percent of people who have premarital sex end up with a sexually transmitted disease.

First, we suspect that this statistic represents the mean. While the mean number of people contracting STDs through premarital sex may be quite low, there are probably areas across the country where this number is much higher or much lower than the mean. In a certain area the risk might not be high, but it is likely to be much higher in other areas of the country. It would probably be important to know the range and frequency of different STDs that people having premarital sex contract. If, of those eight percent, 25 percent contract HIV, you might not be so quick to believe the above argument's claim of the safety of premarital sex.

Thus, when an average is presented, ask yourself: "Would it be important for me to know the range and distribution of values?"

Concluding One Thing, Proving Another

Communicators often deceive us when they use statistics that prove one thing but then claim to have proved something quite different. The statistics don't prove what they seem to! We suggest two strategies for locating such deception.
One strategy is to blind yourself to the communicator's statistics and ask yourself, "What statistical evidence would be helpful in proving her conclusion?"

Then, compare the needed statistics to the statistics given. If the two do not match, you may have located a statistical deception. The following example provides you with an opportunity to apply that strategy.

A new weight-loss drug, Fatsaway, is effective in helping obese people lose weight. In a clinical trial, only 6 out of 100 people on Fatsaway reported any side effects with taking the drug. The company manufacturing the drug argues, "With 94 percent of people having positive results with Fatsaway, it is safe to say our pill is one of the most effective weight-loss pills in the market."

How should the company manufacturing the drug have proven its conclusion that Fatsaway is 94 percent effective as a weight-loss pill? Shouldn't it have performed a study of how many people lost weight with the pill, and how much weight these people lost? Instead, the company reported statistics regarding the frequency of side effects and assumed that if the pill did not produce side effects, then the pill was effective in helping people lose weight. The company proves one thing (a relatively small number of people report side effects with Fatsaway) and concludes another (Fatsaway is effective at helping people lose weight). An important lesson to learn from this example is to pay close attention to both the wording of the statistics and the wording of the conclusion to see whether they are referring to the same thing. When they are not, the author or speaker may be lying with statistics.

It is frequently difficult to know just what statistical evidence should be provided to support a conclusion. Thus, another strategy is to examine the author's statistics very closely while blinding yourself to the conclusion; then ask yourself, "What is the appropriate conclusion to be drawn from those statistics?"
Then, compare your conclusion with the author's. Try that strategy with the following example.

Almost half of all Americans cheat on their significant others. A researcher recently interviewed people at a shopping mall. Of the 75 people responding to the survey, 36 admitted to having cheated on someone they were "seeing."

Did you come up with the following conclusion? Almost half of the people surveyed in one given location admit to having cheated, at least once, on someone whom they were dating or with whom they were otherwise involved. Do you see the difference between what the statistics proved and what the author concluded? If so, you have discovered how this author has lied with statistics.

Now, practice on the following.

A recent survey asked college students, "Have you ever had a night of binge drinking during the school year?" The researcher reported that 83 percent of college students answered "yes" and concluded, "The results demonstrate that universities are overly stressing their students, causing the students to engage in dangerous drinking habits to escape the pressures of college classes."

Do you see how the writer has concluded one thing while proving another? Do you think the results might have been different if the researcher had asked, "Do you drink to escape the stress from your college classes?"

Deceiving by Omitting Information

Statistics often deceive us because they are incomplete. Thus, a further helpful strategy for locating flaws in statistical reasoning is to ask, "What further information do you need before you can judge the impact of the statistics?"

Let's look at two examples to illustrate the usefulness of this question.


  1. Large businesses are destroying the small-town feel of our "downtown" area. In the last year alone, the number of large businesses in the city increased by 75 percent.
  2. Despite common fears, skydiving is much safer than other activities, such as driving a car. In one particular month, in Los Angeles, 176 people died in car accidents while 3 died in skydiving accidents.


In the first example, 75 percent seems quite impressive. But something is missing: the absolute numbers on which this percentage is based. Wouldn't we be less alarmed if we knew that this increase was from four businesses to seven, rather than from 12 to 21? In our second example, we have the numbers, but we don't have the percentages. Wouldn't we need to know what these numbers mean in terms of percentages of people involved in both activities? After all, there are fewer total skydivers than there are people traveling in cars.
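The arithmetic behind both examples can be checked with a short Python sketch. The business counts and death counts come from the passage above; the participant counts for driving and skydiving are invented assumptions, since the passage does not supply them, and the function names are our own.

```python
def percent_increase(before, after):
    """Percentage change from an earlier count to a later count."""
    return (after - before) / before * 100

def deaths_per_100k(deaths, participants):
    """Deaths per 100,000 people taking part in an activity."""
    return deaths * 100_000 / participants

# Example 1: the same 75 percent hides very different absolute changes.
small_change = percent_increase(4, 7)    # three additional businesses
large_change = percent_increase(12, 21)  # nine additional businesses
print(small_change, large_change)

# Example 2: the participant counts below are ASSUMED for illustration only.
assumed_drivers = 4_000_000
assumed_skydivers = 5_000
car_rate = deaths_per_100k(176, assumed_drivers)
skydiving_rate = deaths_per_100k(3, assumed_skydivers)
print(car_rate, skydiving_rate)
```

Under these assumed participant counts, skydiving has far fewer deaths in absolute terms but a much higher death rate. Different assumptions would give different rates, which is exactly why the missing information matters.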

When you encounter impressive-sounding numbers or percentages, be wary. You may need to get other information to decide just how impressive the numbers are. When only absolute numbers are presented, ask whether percentages might help you make a better judgment; when only percentages are presented, ask whether absolute numbers would enrich their meaning.

Another important kind of potential missing information is relevant comparisons. It is often useful to ask the question, "As compared to . . . ?"

Each of the following statements illustrates statistics that can benefit from asking for comparisons:


  • Medusa hair spray, now 50 percent better.
  • SUVs are dangerous and should not be allowed on the road. In 2004, SUVs were responsible for 4,666 deaths. Certainly something needs to be done.
  • Movie budgets are outrageous nowadays. Just look at Star Wars: Revenge of the Sith; the budget for that movie alone was $115,000,000!

With reference to the first statement, don't you need to ask, "50 percent better than what?" Other ineffective hair sprays? Previous Medusa brand hair spray? As for the second statement, wouldn't you want to know how many of those deaths would have been prevented if an SUV had not been involved, how many other motor vehicle fatalities not involving an SUV there were, the number of SUVs on the road compared to how many deaths they were involved in, and how many miles SUVs travel compared to how many deaths occur in SUVs? With reference to the third statement, how does the budget of one particular movie relate to the budgets of other movies, and is this one case highly unusual, or is it typical of the movie industry?

When you encounter statistics, be sure to ask, "What relevant information is missing?"

Risk Statistics and Omitted Information

"Daily use of Nepenthe brand aspirin will lower the chance of a second heart attack by 55 percent."
"Routine physicals have been linked to finding early cures and lowering people's likelihood of early death by 13 percent."
A common use of statistics in arguments—especially arguments about health risks—is the reporting of risk reduction as a result of some intervention. Such reports can be deceptive. The same amount of risk reduction can be reported in relative or absolute terms, and these differences can greatly affect our perceptions of the actual amount of risk reduction.

Imagine a 65-year-old woman who just had a stroke and is discussing treatment options with her doctor. The doctor quotes statistics about three treatment options:

  • Treatment X will reduce the likelihood of a future stroke by 33 percent,
  • Treatment Y will reduce the risk by three percent, and
  • With treatment Z, 94 percent of women are free of a second stroke for 10 years, compared to 91 percent of those who go untreated.

Which treatment should she choose? Our guess is that she will choose the first. But all of these options refer to the same size of treatment effect. They just express the risk in different ways. The first (the 33 percent) is the "relative risk reduction." If a treatment reduces the risk of a second stroke from 9 in 100 to 6 in 100, the risk is reduced by one-third, or 33 percent. But the absolute change, from 9 percent to 6 percent, is a reduction of only three percentage points, and the improvement in good outcomes from 91 percent to 94 percent is also only three percentage points.
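The relationship among the three ways of stating the treatment effect can be worked through in Python. This is a minimal sketch using the 9-in-100 and 6-in-100 figures from the paragraph above; the `risk_reduction` function is a hypothetical helper written for this example.

```python
from fractions import Fraction

def risk_reduction(untreated_risk, treated_risk):
    """Return (absolute, relative) risk reduction for two risk levels."""
    absolute = untreated_risk - treated_risk  # difference between the two risks
    relative = absolute / untreated_risk      # that difference as a share of the original risk
    return absolute, relative

untreated = Fraction(9, 100)  # 9 in 100 have a second event without treatment
treated = Fraction(6, 100)    # 6 in 100 have a second event with treatment

absolute, relative = risk_reduction(untreated, treated)

print(f"relative risk reduction: {float(relative):.0%}")
print(f"absolute risk reduction: {float(absolute) * 100:.0f} percentage points")
print(f"free of a second event: {float(1 - untreated):.0%} untreated, "
      f"{float(1 - treated):.0%} treated")
```

All three printed lines describe one and the same effect, which is why the choice of wording, rather than the treatment itself, can make one option sound far more attractive than another.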

The point is that expressing risk reductions in relative rather than absolute terms can make treatment effects seem larger than they really are, and individuals are more likely to embrace a treatment when benefits are expressed in relative rather than absolute terms. As you might expect, drug companies usually use relative risk in their ads, and media reports also tend to focus on relative risk.

Relative risk reduction statistics can be deceiving. When you encounter arguments using such statistics, always try to determine how the results might be different and less impressive if expressed in absolute terms.

Summary

We have highlighted a number of ways by which you can catch people "lying" with statistics. We hope that you can now see the problem with the statistic about the widespread use of the term "Kafkaesque." Hints: Where did that impressive figure of more than 250 languages come from? Have Kafka's works been translated into more than 250 languages?



Critical Question: Are the statistics deceptive?
For each of the practice passages, identify inadequacies in the evidence.

Passage 1
Campaigns for national office are getting out of hand. Money is playing a central role in more and more elections. The average winner in a Senate race now spends over $8 million in his or her campaign, while typical presidential candidates spend more than $300 million. It is time for some serious changes, because we cannot simply allow politicians to buy their seats through large expenditures on advertisements.

Passage 2
The home is becoming a more dangerous place to spend time. The number of home-related injuries is on the rise. In 2000, approximately 2,300 children aged 14 and under died from accidents in the home. Also, 4.7 million people are bitten by dogs each year. To make matters worse, even television, a relatively safe household appliance, is becoming dangerous. In fact, 42,000 people are injured by televisions and television stands each year. With so many accidents in the home, perhaps people need to start spending more time outdoors.

Passage 3
Looking fashionable has never been easier! Every year the number of fashion designers increases by 8 percent, making a wider selection of fashionable objects available. Also, because the price of fashionable merchandise reflects the status the clothing brings with it, it is very easy to pick out the most fashionable articles of clothing. Furthermore, the leading fashion magazines have improved in quality by 46 percent. Certainly anyone can look fashionable if he or she wishes to do so.


Sample Responses

Passage 1

CONCLUSION: A change in campaigning for national office is necessary.
REASON: Politicians are spending too much on campaigns. The average Senator spent more than $8 million on his or her campaign. Presidential candidates spend more than $300 million on their campaigns.

Are campaigns costing too much money? The words "average" and "typical" should alert us to a potential deception. We need to know the kind of average used for these statistics. Was it the mean, the median, or the mode? For example, using the mean in the Senate race data could potentially lead to a figure that is skewed by a few particularly close Senate races in which candidates spent large sums of money. However, because many Senators are basically guaranteed re-election, their races probably involve less spending. We know that only a few Senate races are usually close. Therefore, most candidates probably do not spend as much as was reported, if the mean was used to present the average. In other words, the median or the mode would probably show a lower value.

There are also important missing comparison figures. How does campaign spending compare to similar spending in the past? What about for other offices? It is possible that campaign spending has actually gone down in recent years.


Passage 2

CONCLUSION: It is becoming increasingly dangerous to spend time in one's home.
REASONS: 

  1. Household-related injuries are on the rise.
  2. In one year, 2,300 children died in household accidents.
  3. 4.7 million people are bitten by dogs every year.
  4. 42,000 people are injured by televisions each year.
To evaluate the argument, we need to first determine what the most appropriate evidence is to answer the question, "Are households more unsafe than they used to be?" In our opinion, the best statistic to use to answer the preceding question is a comparison of the rate of serious household accidents per year now and the same statistic for past years. Also relevant is the number of injuries per hour spent in the house versus the same statistic for past years. It is possible that more household injuries occur because people are spending more time in their houses than they used to spend. If they are inside the house more, it is only logical that the number of injuries occurring in the house would also rise.

The evidence presented in the argument is questionable for a number of reasons. First, no number is given at all regarding the number of household injuries. We know the author says they are on the rise, but no evidence is provided demonstrating a rise. Second, no details are given regarding the deaths of children in household accidents. How does this statistic compare to children's deaths in the home in the past? What types of accidents are causing these children's deaths? Third, the number of dog bites is deceptive. We do not know whether these dog bites occur in the home. More importantly, the number of dog bites does not seem to move us toward the conclusion that being at home is unsafe. Fourth, the statistic regarding televisions is questionable. Where does the author get the impressive-sounding statistic? Also, how serious are most of these injuries?

CRITICAL QUESTION SUMMARY:
WHY THIS QUESTION IS IMPORTANT

Are the Statistics Deceptive?

Authors often provide statistics to support their reasoning. The statistics appear to be hard evidence. However, there are many ways that statistics can be misused. Because problematic statistics are used frequently, it is important to identify any problems with the statistics so that you can more carefully determine whether you will accept or reject the author's conclusion.
