Don’t be duped by tricky statistics
We live in the information age; the Internet overflows with a limitless stream of data and statistics. Every day, newspapers and online media outlets turn out news stories based on statistics produced by various institutions. But it’s hard to know whether the data is true or false.
They say: “Numbers don’t lie.” But that doesn’t necessarily mean that all statistics show people the whole truth. In fact, they often mislead or deceive those who trust numbers by offering half-truths.
The tricks range from bad sampling to claiming a link between seemingly unrelated events without proving causation. And those numbers, which support only certain views, pervade society as the media fail to tell people the hidden purposes and meanings behind them.
The secret language of statistics is often employed to sensationalize, inflate, confuse and oversimplify. However, people are unaccustomed to interpreting the numbers that media and governments choose to present.
In modern society, almost all policies are based on statistics. However, statistics and data, produced and processed by state-run and private institutes, can be misused or translated into misguided policies.
Tricks of statistics
Missing causation, where a link between two things is reported without showing that one actually causes the other, is one of the most common errors in statistics. In an experiment, the way to establish causation is to manipulate one variable and measure its effect on another while controlling for everything else possible.
“The classic example was in the late 60s/early 70s and has to do with the effects of television violence, because there is no way of conducting an experiment to determine if watching TV violence makes people aggressive,” said Lara Zwarun, communication professor at the University of Missouri-St. Louis. “You can’t control other factors such as how they were raised, the stresses of their day to day lives, their religion. So it’s hard to isolate what might be caused by TV, and therefore most results are correlational at best.”
The statistic suggesting that ice cream causes murder is another example. According to the study, ice cream sales did spike when murders did, making the two correlated, but both were high because of a third causal variable, summer heat, Zwarun noted.
“Most recently, I saw an ad for the National Association of Realtors. They say in it that kids of people who own their homes have higher grades and self-esteem,” Zwarun said. “While that may be accurate in terms of correlation, I am sure it would be because people who are capable of buying a house are more stable, probably older, have better jobs, and have other characteristics that might lead them to have smarter or more capable kids.”
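The ice cream example can be reproduced with a small simulation. The sketch below is only an illustration: every number in it is invented, and the helper names (pearson, residuals) are hypothetical. Temperature drives both ice cream sales and murders, while neither affects the other. The raw correlation between the two comes out strong, but it largely disappears once the linear effect of temperature is removed from both.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def residuals(ys, xs):
    """Subtract from ys the part explained by a least-squares line on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    return [y - (mean_y + slope * (x - mean_x)) for x, y in zip(xs, ys)]


rng = random.Random(0)  # fixed seed so the run is repeatable

# Invented data: temperature is the only cause of both series.
temperature = [rng.uniform(0, 35) for _ in range(2000)]
ice_cream_sales = [10 * t + rng.gauss(0, 50) for t in temperature]
murders = [0.2 * t + rng.gauss(0, 1.5) for t in temperature]

raw_r = pearson(ice_cream_sales, murders)
controlled_r = pearson(
    residuals(ice_cream_sales, temperature),
    residuals(murders, temperature),
)

print(f"raw correlation:                  {raw_r:.2f}")
print(f"after controlling for temperature: {controlled_r:.2f}")
```

With real data the third variable is rarely known in advance, which is why a correlation alone cannot settle the question of cause.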
Bad sampling, a problem in the process of collecting data, is another common issue, especially in the market research industry.
A lot of market research is done online these days, and in many countries, you are not going to get a cross-section of the population if you can only reach people with Internet access, Zwarun said.
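A toy simulation shows how an online-only sample can drift from the population. Everything in this sketch is an assumption made for illustration: the age range, the share of people online in each age group, and the share answering “yes” are all invented, not taken from any real survey.

```python
import random

rng = random.Random(1)  # fixed seed so the run is repeatable

# Invented population: in this toy setup, people aged 50 and over are
# less likely to be online and more likely to answer "yes".
population = []
for _ in range(100_000):
    age = rng.randint(18, 89)
    online = rng.random() < (0.95 if age < 50 else 0.40)
    says_yes = rng.random() < (0.30 if age < 50 else 0.70)
    population.append((online, says_yes))


def share_yes(people):
    """Fraction of (online, says_yes) records that answered yes."""
    return sum(1 for _, yes in people if yes) / len(people)


true_share = share_yes(population)

# A random sample of everyone versus a sample drawn only from people online.
random_sample = rng.sample(population, 2000)
online_only = [person for person in population if person[0]]
online_sample = rng.sample(online_only, 2000)

print(f"whole population:   {true_share:.1%}")
print(f"random sample:      {share_yes(random_sample):.1%}")
print(f"online-only sample: {share_yes(online_sample):.1%}")
```

In this setup the random sample lands close to the population figure, while the online-only sample understates the “yes” share because the group most likely to say yes is the one least likely to be reached.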
Wording is critical
The wording of questions and a limited set of answer choices can steer respondents toward particular answers.
Jeong Nam-gu, currently a Tokyo correspondent for the local daily Hankyoreh, cites some examples in his 2008 book about the fallacies of statistics.
Suppose the question in a survey is: “Should South Korea provide greater amounts of food aid to the North?” If it were preceded by a sentence such as “Large numbers of North Koreans are still starving to death,” it would tend to elicit an answer in support of increased food aid.
However, if the preceding sentence were “Most of the food aid delivered to the North goes to the military,” respondents would be more likely to answer against providing more aid.
Wording can be especially crucial in opinion polls on important policies, because these polls can influence how the policies are formed.
But survey results are often reduced to the percentage of people who agree and the percentage who disagree. The cited results do not specify exactly how the questions were phrased, or what percentage of the people contacted actually took part in the survey.
Statistics literacy
Answers can also be manipulated by offering only a limited set of answer choices that does not cover the full diversity of views.
In early 2008, a local news station surveyed citizens on their opinion of the pan-Korea waterway project proposed by incumbent President Lee Myung-bak, which had been one of his key election pledges.
Although the question itself neutrally asked for citizens’ opinion of the waterway project, the answer choices provided reflected a biased view.
The three answer choices were: “A general consensus among citizens needs to be reached prior to initiating the project;” “The project should be initiated as soon as the necessary preparations are made;” and “I am not sure.”
In the survey, 81 percent selected the first answer choice, 15 percent the second and 4 percent the third. Those supporting the project could select the first or the second answer, but interestingly, those against the plan were unable to select an answer that reflected their view because there was none. They either had to select the answer about reaching a consensus, or choose not to answer at all.
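A few lines of arithmetic show how little the published shares reveal. The sketch below uses the three percentages reported above; the dictionary keys are hypothetical labels, and the bounds rest on the assumption that “I am not sure” respondents count as neither supporters nor opponents.

```python
# Published shares from the survey, in percent. Keys are illustrative labels.
results = {"consensus_first": 81, "start_when_ready": 15, "not_sure": 4}

total = sum(results.values())

# Only "start_when_ready" clearly signals support. "consensus_first" could
# have been chosen by supporters and opponents alike, so it is ambiguous.
support_min = results["start_when_ready"]
support_max = results["start_when_ready"] + results["consensus_first"]

# Opponents had no option of their own, so among those who answered they
# can only be hidden inside "consensus_first".
opposition_min = 0
opposition_max = results["consensus_first"]

print(f"shares add up to {total} percent")
print(f"support:    between {support_min} and {support_max} percent")
print(f"opposition: between {opposition_min} and {opposition_max} percent")
```

Under that assumption, support could be anywhere from 15 to 96 percent and opposition anywhere from 0 to 81 percent, so the survey cannot say how many people were against the project.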
Biased analysis of survey results by the institutions that conduct them adds to the problem.
The results of a study comparing the academic achievements of elementary and middle school students in 42 countries were made public by a government agency in December last year.
There was little to dispute about the high achievement of Korean students, who ranked in the top three countries in math and science at both elementary and middle school levels.
But the agency selectively released survey results that supported the idea that competent and dedicated teachers are behind the academic excellence of Korean students. Meanwhile, some items on the survey showed that Korean teachers were not necessarily putting more time and effort into researching teaching methods.
Kim Jin-ho, a professor at Korea National Defense University, said news readers need to learn the basic language of statistics because otherwise they will be at greater risk of being deceived by concocted numbers or distorted data.
“The public is often fed with statistics cooked to the taste of institutions. Many statistics are meaningless or biased. And statistics usually demand that the public live an average life,” Kim said. “It’s wrong that they produce false statistics. What’s important is not mere numbers. We need to have clear standards of our own in order not to be influenced by false numbers and statistics.”
<The Korea Times/Bahk Eun-ji, Jung Min-ho, Kim Bo-eun>