
“Statistically significant” (probably) doesn’t mean what you think it means

January 18, 2016

Posted by Ezra Resnick in Science.

Statistical significance is a very important concept to understand when reading about scientific studies — and it’s also very likely to be defined incorrectly in news reports about scientific studies. For example, here is a Cancer Research UK “science blog” writing about a trial that investigated whether taking aspirin can lower the risk of cancer:

The researchers found that [those participants who were overweight and obese] had an even greater increase in bowel cancer risk, compared to those in the trial who were a healthy weight.

But among different sub-groups of people, some of these results weren’t ‘statistically significant’ (meaning there’s uncertainty over how valid they are)…

And here’s The News & Observer writing about the failed test of a new antiviral drug:

Shares of Chimerix dropped 81 percent and hit an all-time low Monday after the Durham drug developer announced that patients who took the company’s antiviral drug in a clinical trial died at higher rates than those who took a placebo…

Chimerix said the increased mortality rate among patients who took brincidofovir was not statistically significant, meaning that the deaths were not caused by the medication.

And here’s The Philadelphia Inquirer writing about the effect of delays in breast cancer treatment:

The risk of death increased by 9 percent or 10 percent … for patients with stage I and stage II breast cancers for each added 30-day interval, the Fox Chase researchers found.  In practice, the increased risk is small, because the chance of death at stage I or II is relatively low. But the difference is statistically significant, meaning it did not occur by chance, researchers found.

All those definitions of statistical significance (or insignificance) are significantly wrong. A result is considered statistically significant if, assuming there is no real effect (the “null hypothesis”), the probability of seeing a result at least as extreme as the one observed is below some predetermined level, usually 5%. That probability is called the p-value. (Wikipedia has a more formal definition.) Note what this does not say: the p-value is not the probability that the result was “due to chance”, and it is not the probability that the hypothesis is wrong. So even a statistically significant result can be a fluke: when there is no real effect at all, about 1 study in 20 will still cross the 5% threshold. (Conversely, a result deemed “insignificant” does not show that there is no effect, only that the data were not strong enough to rule out chance. Those patients might have been killed by that antiviral drug after all.)
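To make the 5% threshold concrete, here is a minimal sketch in Python. It uses a made-up coin-flipping experiment rather than data from any of the studies quoted above, and the function name `two_sided_p_value` is my own. The number compared against the threshold is a p-value: the probability, assuming the coin is fair, of getting a result at least as lopsided as the one observed.

```python
from math import comb


def two_sided_p_value(heads, flips):
    """Exact two-sided p-value under the null hypothesis of a fair coin.

    Adds up the probability of every outcome at least as far from
    flips / 2 as the observed count, assuming P(heads) = 0.5.
    """
    observed_gap = abs(heads - flips / 2)
    extreme_outcomes = sum(
        comb(flips, k)
        for k in range(flips + 1)
        if abs(k - flips / 2) >= observed_gap
    )
    return extreme_outcomes / 2 ** flips


ALPHA = 0.05  # the conventional 5% threshold

# Hypothetical results: 60 or 61 heads out of 100 flips.
for heads in (60, 61):
    p = two_sided_p_value(heads, 100)
    verdict = "significant" if p < ALPHA else "not significant"
    print(f"{heads}/100 heads: p = {p:.4f} ({verdict})")
```

With 60 heads the p-value is about 0.057 (not significant); with 61 heads it is about 0.035 (significant). A single extra head flips the verdict, and in neither case does the number tell us the probability that the coin is actually fair.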

This means we should not put too much confidence in any single scientific result. In fact, out of the thousands of hypotheses published each year supported by “significant” results, we should expect that some will turn out to be wrong. That’s why it’s so important to replicate scientific experiments multiple times, and to perform meta-analyses combining the results of many individual studies. Science is hard, and reporters who mislead their readers in the name of simplicity (or a catchy headline) can cause real damage.
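To see why some “significant” findings are expected to be wrong, here is a rough simulation sketch. Everything in it is an assumption chosen for illustration: the studies are imaginary, both groups are drawn from the same normal distribution with a known standard deviation of 1 (so the treatment truly does nothing), and the function names are my own.

```python
import math
import random

ALPHA = 0.05  # the conventional 5% threshold


def two_sided_p_from_z(z):
    """Two-sided p-value for a standard-normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2))


def null_study_p_value(rng, n_per_group=30):
    """Simulate one study in which the treatment has no effect.

    Both groups come from the same distribution (mean 0, sd 1), so any
    difference between their averages is pure chance.
    """
    treated = [rng.gauss(0, 1) for _ in range(n_per_group)]
    control = [rng.gauss(0, 1) for _ in range(n_per_group)]
    diff = sum(treated) / n_per_group - sum(control) / n_per_group
    z = diff / math.sqrt(2 / n_per_group)  # uses the known sd of 1
    return two_sided_p_from_z(z)


def false_positive_rate(n_studies, seed=0):
    """Fraction of no-effect studies that still come out 'significant'."""
    rng = random.Random(seed)
    hits = sum(null_study_p_value(rng) < ALPHA for _ in range(n_studies))
    return hits / n_studies


def chance_of_at_least_one_hit(n_tests, alpha=ALPHA):
    """Probability that at least one of n independent null tests passes."""
    return 1 - (1 - alpha) ** n_tests


print(f"False positive rate: {false_positive_rate(4000):.3f}")
print(f"At least one hit in 20 tests: {chance_of_at_least_one_hit(20):.2f}")
```

The simulated false positive rate should land near 5%, and the last line works out to roughly 64%: test 20 unrelated hypotheses where nothing is going on, and more often than not at least one will look “significant”.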

Keep that in mind next time you read that green jelly beans cause acne.


