A Flawed Process Is At The Heart of Science and Journal Publications
Peer review is at the heart of the processes of not just medical journals but of all of science. It is the method by which grants are allocated, papers published, academics promoted, and Nobel prizes won. It has allowed government agencies to approve untold numbers of drugs and vaccines, or rubber stamp thousands of chemicals as safe. It has until recently been unstudied. And its defects are easier to identify than its attributes. Yet it shows no sign of going away.
When something is peer reviewed it is in some sense blessed. Even journalists recognize this. When the BMJ published a highly controversial paper that argued that a new ‘disease’, female sexual dysfunction, was in some ways being created by pharmaceutical companies, a friend who is a journalist was very excited–not least because reporting it gave him a chance to get sex onto the front page of a highly respectable but somewhat priggish newspaper (the Financial Times). ‘But,’ the news editor wanted to know, ‘was this paper peer reviewed?’. The implication was that if it had been it was good enough for the front page and if it had not been it was not.
WHAT IS PEER REVIEW?
Peer review is impossible to define in operational terms (an operational definition is one whereby if 50 of us looked at the same process we could all agree most of the time whether or not it was peer review). Peer review is thus like poetry, love, or justice. But it is something to do with a grant application or a paper being scrutinized by a third party–who is neither the author nor the person making a judgement on whether a grant should be given or a paper published. But who is a peer? Somebody doing exactly the same kind of research (in which case he or she is probably a direct competitor)? Somebody in the same discipline? Somebody who is an expert on methodology? And what is review? Somebody saying ‘The paper looks all right to me’, which is sadly what peer review sometimes seems to be. Or somebody poring over the paper, asking for raw data, repeating analyses, checking all the references, and making detailed suggestions for improvement? Such a review is vanishingly rare.
What is clear is that the forms of peer review are protean. Probably the systems of every journal and every grant giving body are different in at least some detail; and some systems are very different. There may even be some journals using the following classic system. The editor looks at the title of the paper and sends it to two friends whom the editor thinks know something about the subject. If both advise publication the editor sends it to the printers. If both advise against publication the editor rejects the paper. If the reviewers disagree the editor sends it to a third reviewer and does whatever he or she advises. This pastiche–which is not far from systems I have seen used–is little better than tossing a coin, because the level of agreement between reviewers on whether a paper should be published is little better than you’d expect by chance.
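The ‘classic system’ just described is simple enough to write down as a decision rule. The sketch below is only an illustration: the function names are hypothetical, and the assumption that each reviewer independently recommends publication with the same probability p is mine, not something measured. Under that assumption, reviewers who say ‘publish’ half the time at random produce a system that publishes exactly half the time, which is the coin toss.

```python
from fractions import Fraction


def classic_decision(first: bool, second: bool, third: bool) -> bool:
    """Return True (publish) under the 'classic system'.

    first and second are the two initial reviewers' recommendations.
    third is the tie-breaking reviewer, consulted only if they disagree.
    """
    if first == second:
        return first   # both advise the same thing, so the editor follows them
    return third       # disagreement: the editor does whatever the third advises


def publish_probability(p: Fraction) -> Fraction:
    """Exact chance of publication if every reviewer independently says
    'publish' with probability p (an illustrative assumption)."""
    both_publish = p * p
    reviewers_split = 2 * p * (1 - p)
    # After a split, the outcome is decided by the third reviewer alone.
    return both_publish + reviewers_split * p


if __name__ == "__main__":
    print(publish_probability(Fraction(1, 2)))
```

With p set to one half the function returns one half: adding a second and a third reviewer changes nothing if their verdicts carry no shared signal.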
That is why Robbie Fox, the great 20th century editor of the Lancet, who was no admirer of peer review, wondered whether anybody would notice if he were to swap the piles marked ‘publish’ and ‘reject’. He also joked that the Lancet had a system of throwing a pile of papers down the stairs and publishing those that reached the bottom. When I was editor of the BMJ I was challenged by two of the cleverest researchers in Britain to publish an issue of the journal composed only of papers that had failed peer review and see if anybody noticed. I wrote back ‘How do you know I haven’t already done it?’
DOES PEER REVIEW “WORK” AND WHAT IS IT FOR?
But does peer review ‘work’ at all? A systematic review of all the available evidence on peer review concluded that ‘the practice of peer review is based on faith in its effects, rather than on facts’. But the answer to the question on whether peer review works depends on the question ‘What is peer review for?’.
One answer is that it is a method to select the best grant applications for funding and the best papers to publish in a journal. It is hard to test this aim because there is no agreed definition of what constitutes a good paper or a good research proposal. Plus what is peer review to be tested against? Chance? Or a much simpler process? Stephen Lock when editor of the BMJ conducted a study in which he alone decided which of a consecutive series of papers submitted to the journal he would publish. He then let the papers go through the usual process. There was little difference between the papers he chose and those selected after the full process of peer review. This small study suggests that perhaps you do not need an elaborate process. Maybe a lone editor, thoroughly familiar with what the journal wants and knowledgeable about research methods, would be enough. But it would be a bold journal that stepped aside from the sacred path of peer review.
Another answer to the question of what is peer review for is that it is to improve the quality of papers published or research proposals that are funded. The systematic review found little evidence to support this, but again such studies are hampered by the lack of an agreed definition of a good study or a good research proposal.
Peer review might also be useful for detecting errors or fraud. At the BMJ we did several studies where we inserted major errors into papers that we then sent to many reviewers. Nobody ever spotted all of the errors. Some reviewers did not spot any, and most reviewers spotted only about a quarter. Peer review sometimes picks up fraud by chance, but generally it is not a reliable method for detecting fraud because it works on trust. A major question, which I will return to, is whether peer review and journals should cease to work on trust.
THE DEFECTS OF PEER REVIEW
So we have little evidence on the effectiveness of peer review, but we have considerable evidence on its defects. In addition to being poor at detecting gross defects and almost useless for detecting fraud it is slow, expensive, profligate of academic time, highly subjective, something of a lottery, prone to bias, and easily abused.
Slow and expensive
Many journals, even in the age of the internet, take more than a year to review and publish a paper. It is hard to get good data on the cost of peer review, particularly because reviewers are often not paid (the same, come to that, is true of many editors). Yet there is a substantial ‘opportunity cost’, as economists call it, in that the time spent reviewing could be spent doing something more productive–like original research. I estimate that the average cost of peer review per paper for the BMJ (remembering that the journal rejected 60% without external review) was of the order of £100, whereas the cost of a paper that made it right through the system was closer to £1000.
The cost of peer review has become important because of the open access movement, which hopes to make research freely available to everybody. With the current publishing model peer review is usually ‘free’ to authors, and publishers make their money by charging institutions to access the material. One open access model is that authors will pay for peer review and the cost of posting their article on a website. So those offering or proposing this system have had to come up with a figure–which is currently between $500 and $2500 per article. Those promoting the open access system calculate that at the moment the academic community pays about $5000 for access to a peer reviewed paper. (The $5000 is obviously paying for much more than peer review: it includes other editorial costs, distribution costs–expensive with paper–and a big chunk of profit for the publisher.) So there may be substantial financial gains to be had by academics if the model for publishing science changes.
There is an obvious irony in people charging for a process that is not proved to be effective, but that is how much the scientific community values its faith in peer review.
People have a great many fantasies about peer review, and one of the most powerful is that it is a highly objective, reliable, and consistent process. I regularly received letters from authors who were upset that the BMJ rejected their paper and then published what they thought to be a much inferior paper on the same subject. Always they saw something underhand. They found it hard to accept that peer review is a subjective and, therefore, inconsistent process. But it is probably unreasonable to expect it to be objective and consistent. If I ask people to rank painters like Titian, Tintoretto, Bellini, Carpaccio, and Veronese, I would never expect them to come up with the same order. A scientific study submitted to a medical journal may not be as complex a work as a Tintoretto altarpiece, but it is complex. Inevitably people will take different views on its strengths, weaknesses, and importance.
So, the evidence is that if reviewers are asked to give an opinion on whether or not a paper should be published they agree only slightly more than they would be expected to agree by chance. (I am conscious that this evidence conflicts with the study of Stephen Lock showing that he alone and the whole BMJ peer review process tended to reach the same decision on which papers should be published. The explanation may be that, because he was the editor who had designed the BMJ process and appointed the editors and reviewers, they were fashioned in his image and made similar decisions.)
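‘Agreement beyond chance’ can be made concrete with a statistic such as Cohen's kappa, which compares the agreement actually observed between two reviewers with the agreement their individual publish and reject rates would produce by chance alone. Kappa is one common choice, not necessarily the measure used in the studies referred to here, and the counts below are invented purely for illustration.

```python
def cohen_kappa(both_publish: int, both_reject: int,
                only_a_publish: int, only_b_publish: int) -> float:
    """Cohen's kappa for two reviewers making publish/reject calls.

    Returns 0 when agreement is no better than chance and 1 when the
    reviewers agree on every paper.
    """
    total = both_publish + both_reject + only_a_publish + only_b_publish
    if total == 0:
        raise ValueError("no papers were reviewed")

    observed = (both_publish + both_reject) / total

    # Each reviewer's overall rate of recommending publication.
    a_rate = (both_publish + only_a_publish) / total
    b_rate = (both_publish + only_b_publish) / total

    # Agreement expected if the two reviewers decided independently.
    expected = a_rate * b_rate + (1 - a_rate) * (1 - b_rate)
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is total")

    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    # Hypothetical counts for 100 papers: the reviewers agree on 60 of them.
    kappa = cohen_kappa(both_publish=30, both_reject=30,
                        only_a_publish=20, only_b_publish=20)
    print(round(kappa, 2))
```

In this made-up example the reviewers agree on 60% of papers, which sounds respectable, yet chance alone would give 50%, so kappa comes out at about 0.2: only slightly better than chance.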
Sometimes the inconsistency can be laughable. Here is an example of two reviewers commenting on the same paper.
Reviewer A: ‘I found this paper an extremely muddled paper with a large number of deficits’
Reviewer B: ‘It is written in a clear style and would be understood by any reader’.
This–perhaps inevitable–inconsistency can make peer review something of a lottery. You submit a study to a journal. It enters a system that is effectively a black box, and then a more or less sensible answer comes out at the other end. The black box is like the roulette wheel, and the prizes and the losses can be big. For an academic, publication in a major journal like Nature or Cell is to win the jackpot.
The Problem With Placebo
A placebo, by definition, is supposed to be an inert, innocuous substance that has no effect on your body. Placebos are therefore used as a control against which to measure the effects of modern-day medical treatments.
In fact, to win FDA approval a new drug must beat a placebo in at least two trials.
But there are some serious issues with the “double-blind, placebo-controlled, randomized clinical trial,” which is currently used as today’s gold standard of research.
First, simply taking a pill or receiving treatment (even if it’s a “fake”) has been known to prompt healing changes, and I’ll delve into this shortly. Second, as the Annals of Internal Medicine report showed, placebos are not always the inert substances they’re supposed to be.
What’s Really in Placebos?
The truth is, usually no one outside of the study’s researchers knows for sure.
In a study of 176 trials published in reputable medical journals, only 8 percent of those using pills for placebos disclosed the ingredients. Studies using placebo injections and other forms fared slightly better, with over 26 percent disclosing what the placebo was made of, but most still kept their placebos a secret.
This is a major omission in these studies, as ingredients in placebos can and do skew study results.
For instance, you may have seen headlines claiming a study found that consuming more omega-3 fats doesn’t help heart patients, even though it’s widely known that animal-based omega-3 fats (fish oil and krill oil) have amazing benefits for heart health.
In this case the study got incredibly flawed and misleading results because the researchers fed their volunteers margarine — either plain or enriched with plant- or animal-based omega-3 fats. Margarine is made by hydrogenation, and it is notorious for containing loads of heart-damaging trans fats.
There are newer trans-fat-free margarines available, and the study did not specify whether they were used or not, but they would still contain trace amounts of trans fats as well as rancid vegetable oils, which are pro-inflammatory and therefore harmful to your heart.
These oxidized fats actually raise your risk of heart disease and blood clots, so in no way should heart-damaging margarine have been used as a placebo in a study looking for effects on your heart.
Despite its flawed design, this study was published in the New England Journal of Medicine, one of the most prestigious and well respected medical journals out there.
The Los Angeles Times pointed out one example of a study with cancer patients that used a lactose placebo. Lactose intolerance is common among cancer patients, so the lactose placebo may have caused a lot of stomach problems, making the drug being tested look better for not causing those same problems.
So anytime a study uses a placebo with ingredients that are not disclosed — which happens in the vast majority of studies — the results are highly suspect.
The Placebo Effect: Is it Real?
Depending on whether a placebo is truly inert or not, it can produce very real biological changes in your body. This is especially true when the placebo contains potentially harmful ingredients, like the margarine example noted above. But assuming a placebo is inert, can it still impact your ability to heal?
Many illnesses, from Parkinson’s disease to irritable bowel syndrome, have been shown to improve after placebo pills and treatments. The jury is still out on whether the practice of taking a sugar pill or simply going through the ritual of treatment is what’s causing the beneficial responses, but either way studies show that if you think you’re receiving a treatment, and you expect that treatment to work, it often does.
“In recent decades reports have confirmed the efficacy of various sham treatments in nearly all areas of medicine. Placebos have helped alleviate pain, depression, anxiety, Parkinson’s disease, inflammatory disorders and even cancer.
Placebo effects can arise not only from a conscious belief in a drug but also from subconscious associations between recovery and the experience of being treated–from the pinch of a shot to a doctor’s white coat. Such subliminal conditioning can control bodily processes of which we are unaware, such as immune responses and the release of hormones.”
Placebo-Controlled Studies May be Next to Worthless…
It was over 50 years ago, in 1955, that anesthetist Henry Beecher’s paper “The Powerful Placebo” was published in The Journal of the American Medical Association. This was the first paper to raise the very real fact that simply taking a pill or receiving treatment (even if it was “fake”) could prompt healing changes. It was after this paper was published that the Food, Drug and Cosmetic Act was amended to require drug trials to use placebo control groups, and the “double-blind, placebo-controlled, randomized clinical trial” that is still used today was a result of Henry Beecher’s work.
What you need to be aware of, though, is that many placebos being used in medical studies may not be of the inert or beneficial variety. It’s very possible for a placebo to contain suspect ingredients that cause their own set of health issues — issues that make the drug or other treatment being tested look better or safer than it actually is.
So unless a study discloses its placebo ingredients, and those ingredients are truly inert, a placebo-controlled study really can’t be trusted.
So when your conventional physician recommends the latest blockbuster drug or medical treatment, be sure you keep this in mind. Even if it’s been tested rigorously in a “double-blind, placebo-controlled, randomized clinical trial,” the results may not be what they seem.
The evidence on whether there is bias in peer review against certain sorts of authors is conflicting, but there is strong evidence of bias against women in the process of awarding grants. The most famous piece of evidence on bias against authors comes from a study by DP Peters and SJ Ceci. They took 12 studies that came from prestigious institutions that had already been published in psychology journals. They retyped the papers, made minor changes to the titles, abstracts, and introductions but changed the authors’ names and institutions. They invented institutions with names like the Tri-Valley Center for Human Potential. The papers were then resubmitted to the journals that had first published them. In only three cases did the journals realize that they had already published the paper, and eight of the remaining nine were rejected–not because of lack of originality but because of poor quality. Peters and Ceci concluded that this was evidence of bias against authors from less prestigious institutions.
This is known as the Matthew effect: ‘To those who have, shall be given; to those who have not shall be taken away even the little that they have’. I remember feeling the effect strongly when as a young editor I had to consider a paper submitted to the BMJ by Karl Popper. I was unimpressed and thought we should reject the paper. But we could not. The power of the name was too strong. So we published, and time has shown we were right to do so. The paper argued that we should pay much more attention to error in medicine, about 20 years before many papers appeared arguing the same.
The editorial peer review process has been strongly biased against ‘negative studies’, i.e. studies that find an intervention does not work. It is also clear that authors often do not even bother to write up such studies. This matters because it biases the information base of medicine. It is easy to see why journals would be biased against negative studies. Journalistic values come into play. Who wants to read that a new treatment does not work? That’s boring.
We became very conscious of this bias at the BMJ; we always tried to concentrate not on the results of a study we were considering but on the question it was asking. If the question is important and the answer valid, then it must not matter whether the answer is positive or negative. I fear, however, that bias is not so easily abolished and persists.
The Lancet has tried to get round the problem by agreeing to consider the protocols (plans) for studies yet to be done. If it thinks the protocol sound and if the protocol is followed, the Lancet will publish the final results regardless of whether they are positive or negative. Such a system also has the advantage of stopping resources being spent on poor studies. The main disadvantage is that it increases the sum of peer reviewing–because most protocols will need to be reviewed in order to get funding to perform the study.
There are several ways to abuse the process of peer review. You can steal ideas and present them as your own, or produce an unjustly harsh review to block or at least slow down the publication of the ideas of a competitor. These have all happened. Drummond Rennie tells the story of a paper he sent, when deputy editor of the New England Journal of Medicine, for review to Vijay Soman. Having produced a critical review of the paper, Soman copied some of the paragraphs and submitted it to another journal, the American Journal of Medicine. This journal, by coincidence, sent it for review to the boss of the author of the plagiarized paper. She realized that she had been plagiarized and objected strongly. She threatened to denounce Soman but was advised against it. Eventually, however, Soman was discovered to have invented data and patients, and left the country. Rennie learnt a lesson that he never subsequently forgot but which medical authorities seem reluctant to accept: those who behave dishonestly in one way are likely to do so in other ways as well.
TRUST IN SCIENCE AND PEER REVIEW
One difficult question is whether peer review should continue to operate on trust. Some have made small steps beyond trust into the world of audit. The Food and Drug Administration in the USA reserves the right to go and look at the records and raw data of those who produce studies that are used in applications for new drugs to receive licences. Sometimes it does so. Some journals, including the BMJ, make it a condition of submission that the editors can ask for the raw data behind a study. We did so once or twice, only to discover that reviewing raw data is difficult, expensive, and time consuming. I cannot see journals moving beyond trust in any major way unless the whole scientific enterprise moves in that direction.
So peer review is a flawed process, full of easily identified defects with little evidence that it works. Nevertheless, it is likely to remain central to science and journals because there is no obvious alternative, and scientists and editors have a continuing belief in peer review. How odd that science should be rooted in belief.