Over the past decade, scientists have grown increasingly ashamed of the failings of their own profession: due to a lack of self-policing and quality control, a large proportion of studies have not been replicable, scientific frauds have flourished for years without being caught, and the pressure to publish novel findings—instead of simply good science—has become the commanding mantra. In what might be one of the worst such failings, a new study suggests that even systematic reviews and meta-analyses—typically considered the highest form of scientific evidence—are now in doubt.
The study comes from a single author: John Ioannidis, a highly respected researcher at Stanford University, who has built his reputation showing other scientists what they get wrong. In his latest work, Ioannidis contends that “the large majority of produced systematic reviews and meta-analyses are unnecessary, misleading, or conflicted.”

Systematic reviews and meta-analyses are statistically rigorous studies that synthesize the scientific literature on a given topic. If you aren’t a scientist or a policymaker, you may have never heard of them. But you have almost certainly been affected by them.
If you’ve ever taken a medicine for any ailment, you’ve likely been given the prescription based on systematic reviews of evidence for that medication. If you’ve ever been advised to use a standing desk to improve your health, it’s because experts used meta-analyses of past studies to make that recommendation. And government policies increasingly rely on conclusions stemming from evidence found in such reviews. “We put a lot of weight and trust on them to understand what we know and how to make decisions,” Ioannidis says.
But, he says, in recent years, systematic reviews and meta-analyses have often been wrong.
The race to publish
Ioannidis shows that there has been a startling rise in the number of such reviews published each year.
He notes that one reason for this trend may be the pent-up demand to understand the heaps of evidence accumulated in different scientific disciplines. Systematic reviews and meta-analyses weren’t much in use before the 1980s. Since then, as governments around the world have invested more in basic research, the number of scientists and the papers they publish has grown, and the volume of evidence from individual studies has grown with it.
Systematic reviews and meta-analyses go a long way in helping us make sense of such evidence. They do not simply summarize what previous studies have shown. Instead, they scrutinize the evidence in each of the previous studies and draw a meaningful, statistically valid conclusion on which way the data points. After their utility was shown in the 1980s, especially in the medical sciences, more and more researchers started publishing these sorts of reviews looking at niches within their own fields.

However, Ioannidis argues that a lot of the studies published over the past few years are redundant. Many of these problematic papers, he says, are coming from China.
In the last decade, scientists from China have gone from publishing barely any meta-analyses to outstripping the US in their production. Though China’s investment in science has increased in that time, it alone does not explain the explosion in meta-analyses.
“If you look at the papers at face value, they seem to be very well done,” Ioannidis says. “But there is one critical flaw.”
A lot of the Chinese meta-analyses are in the field of genomics, focused on candidate-gene studies. These studies often rely on genomic datasets collected as part of large health studies involving tens of thousands of patients. Among such data, if enough people with a certain gene are found to suffer from a certain disease, the gene is linked to that disease and scientists can then publish a meta-analysis about the correlation.

The problem is that such studies have been shown to be useless. “About 10 years ago, we saw that doing these studies addressing one or a few genes at a time was leading us nowhere,” Ioannidis says. “We realized that we had to look at the entire genome and combine data with careful planning from multiple teams and with far more stringent statistical rules.”
The vast majority of diseases are the result of the interaction between many genes and many environmental factors. In other words, cherry-picking information about one or a handful of genes has no practical use.
Chinese scientists know this as well as researchers anywhere, but they continue to publish these useless genomic meta-analyses because of a skewed incentive structure. These scientists are often evaluated on the basis of how many studies they’ve published rather than the quality of those studies. And candidate-gene meta-analyses are easy to do. “All you need is tables of people with and without the gene and whether they do and do not get disease,” Ioannidis says.
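To see just how little work such an analysis involves, here is a minimal sketch in Python (the counts are invented; a real candidate-gene meta-analysis would pool odds ratios like this one across several published tables, weighted by study size, but the core computation is no more involved than this):

    # Invented counts for one hypothetical candidate-gene study:
    # people who carry the gene variant or not, and who develop the disease or not.
    with_gene_diseased, with_gene_healthy = 120, 880
    without_gene_diseased, without_gene_healthy = 90, 910

    # Odds ratio: how strongly the disease is associated with the variant.
    odds_ratio = (with_gene_diseased / with_gene_healthy) / \
                 (without_gene_diseased / without_gene_healthy)

    print(f"odds ratio = {odds_ratio:.2f}")  # about 1.38 for these made-up numbers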
The science-industrial complex
If the problem were restricted to China and the field of genomics, it might not be that bad, since most genomics studies can’t harm or improve people’s health. But other seriously problematic systematic reviews and meta-analyses have affected the course of public medicine. Consider the case of antidepressants.
Between 2007 and 2014, 185 systematic reviews were published on the use of antidepressants. About 30% (58) made some negative statement about antidepressants.

The trouble is that, among those 58, only one had an author who was an employee of a pharmaceutical company at the time—despite the fact that 54 of the 185 total reviews (about 30%) had at least one industry author. In other words, when an industry author contributed to a systematic review, the review was 22 times less likely to make a negative statement.
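As a rough check on where a ratio of that size comes from (this is back-of-the-envelope arithmetic on the counts quoted above; the underlying study may have used a more refined statistical comparison), contrast the rate of negative statements in reviews with and without industry authors:

    # Counts quoted in the text; the arithmetic below is purely illustrative.
    total_reviews = 185
    negative_reviews = 58        # reviews making some negative statement
    industry_reviews = 54        # reviews with at least one industry author
    negative_with_industry = 1   # negative reviews that had an industry author

    # Rate of negative statements among industry-authored reviews
    industry_rate = negative_with_industry / industry_reviews  # roughly 0.02

    # Rate of negative statements among the remaining, independent reviews
    independent_rate = (negative_reviews - negative_with_industry) / (total_reviews - industry_reviews)  # roughly 0.44

    print(independent_rate / industry_rate)  # roughly 23

This crude ratio of rates lands near 23, in the same ballpark as the 22-fold figure; the published number presumably reflects whatever precise comparison the study’s authors chose.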
“We have a massive factory of industry-supported reviews that paint a picture of antidepressants being wonderful and easy-to-take,” Ioannidis says. “These systematic reviews have become a marketing tool.” And companies have emerged that take advantage of this tool. Over the past decade, Mapi Group, Abacus International, Evidera, Precision for Value, and others have begun offering their services to the pharmaceutical industry, running meta-analyses for a fee.
In a study that is still under review, Ioannidis found that the vast majority of meta-analyses done by these contractors are never published. “If the paying customer doesn’t want to see the results because they are negative, the contractor doesn’t publish them,” Ioannidis says. This produces a skewed picture of the evidence, which is exactly what systematic reviews and meta-analyses are supposed to guard us against.
“There’s nothing wrong with such services helping whoever is ready to pay to understand what the academic literature says about a subject,” says Malcolm Macleod, professor of neurology and translational neuroscience at Edinburgh University, where he also focuses on the development and application of systematic reviews and meta-analyses. The concern, he says, is that it’s likely there are cases where “the question is asked in such a way that the answer they find is in the interest of whoever is sponsoring the analysis.”
Consider this example: A pharmaceutical company conducts a meta-analysis of a cancer drug and finds the overall outcome is slightly negative. However, if it eliminates data collected on Tuesdays, which by statistical randomness happened to contain slightly more negative data points, the overall outcome of the meta-analysis shifts to positive. The company can then pay an external firm to conduct a meta-analysis that excludes the data collected on Tuesdays. When published, the meta-analysis would, as the company wanted, show a slightly positive outcome for the cancer drug.
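To make the mechanism concrete, here is a deliberately toy sketch in Python. The trial names and effect sizes are invented, and a real meta-analysis would weight each study by its size and precision rather than take a plain average, but the arithmetic of the trick is the same: drop the unfavorable slices and a slightly negative pooled estimate turns slightly positive.

    # Hypothetical effect estimates from individual trials (negative = drug looks worse).
    # All numbers are invented purely to illustrate the cherry-picking described above.
    study_effects = {
        "trial_A": 0.10,
        "trial_B": -0.05,
        "trial_C": 0.08,
        "trial_D": -0.30,  # the unfavorable "Tuesday" slice
        "trial_E": 0.02,
        "trial_F": -0.12,  # another unfavorable slice
    }

    def pooled(effects):
        # Plain average standing in for a pooled meta-analytic estimate.
        return sum(effects) / len(effects)

    print(pooled(study_effects.values()))  # -0.045: slightly negative with all the data

    kept = [v for name, v in study_effects.items() if name not in ("trial_D", "trial_F")]
    print(pooled(kept))                    # +0.0375: flips to slightly positive

Which data get excluded, and on what pretext, is exactly the kind of framing choice Macleod warns can be built into the question a sponsor pays to have asked.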
Overall, Ioannidis found that only a tiny sliver of systematic reviews and meta-analyses—about 3%—are both correct and useful.
“What Ioannidis has done is provide empirical evidence to support what has been a growing concern among those of us working in the field,” Macleod says. “To my knowledge, he is the first to lay bare the phenomenal increase in the proportion of bad reviews being done.”
What can we trust?
“We should be worried, but I wouldn’t just stop at the level of worry,” Ioannidis says. “There are things we can do to improve the situation.”
The vast majority of systematic reviews and meta-analyses are “retrospective,” which means researchers analyze data collected in the past and try to make sense of it. Retrospective reviews have serious limitations. For example, the past studies could be of varying quality and, even though they may ask the same question, they most likely will follow different protocols to collect their data. In such cases, the data sets from the many studies needed for the review would need to be rejiggered to make them comparable—but that can lead to less-than-perfect results. Or the authors of original studies may no longer be easily reachable—so the meta-researchers can’t track down details about the study that weren’t published but are crucial for the meta-analysis.
An initiative called Prospero was started to address these limitations. Prospero is a website where researchers can pre-register a review they are planning. “You make a research plan ahead of time,” Ioannidis says. “You think about how the studies are conducted and data collected. Then you start collecting the data, analyze it, and publish the systematic review.”
These types of reviews are known as “prospective,” and they overcome many of the limitations of retrospective reviews. The prospective approach ensures reviewers have compatible data sets and access to the authors of the studies under review for any further questions. Currently, pre-registration on Prospero is voluntary, but if the top journals required it, the number of prospective reviews that get published could rise dramatically.
The other way to fix the problem is to fix the input. A systematic review is only as good as the studies fed into the analysis. If the studies have an overwhelming bias because most are, say, industry-sponsored or because negative studies have remained unpublished, then the systematic review will likely give an incomplete picture of the state of the science for the subject under review.
To fix the input problem, scientists need to become more transparent about their work. The AllTrials initiative is trying to achieve that. It has teamed up with the world’s leading health organizations and even some pharmaceutical companies to get a commitment from everyone conducting clinical trials to publish their results, regardless of the outcome. After years of campaigning, last week the United Nations also joined the cause, asking governments to ensure that all clinical trials are published.
Another way to fix the input problem may be to turn systematic reviews and meta-analyses into living documents, like Wikipedia pages. Each such page, Ioannidis suggests, would be managed by a group of independent researchers interested in the subject area. This way, instead of different researchers publishing new reviews every few years, a consistent group would use standard methodology to update the living document on an ongoing basis.
The more science we produce, the greater our need for high-quality systematic reviews and meta-analyses. So, though the flaws Ioannidis has highlighted may now put the highest form of evidence in doubt, they also give scientists a chance to pull up their socks and fix what is needed to keep science moving forward.