Science The rise and fall of peer review - Why the greatest scientific experiment in history failed, and why that's a great thing

Adam Mastroianni
Dec 13


For the last 60 years or so, science has been running an experiment on itself. The experimental design wasn’t great; there was no randomization and no control group. Nobody was in charge, exactly, and nobody was really taking consistent measurements. And yet it was the most massive experiment ever run, and it included every scientist on Earth.
Most of those folks didn’t even realize they were in an experiment. Many of them, including me, weren’t born when the experiment started. If we had noticed what was going on, maybe we would have demanded a basic level of scientific rigor. Maybe nobody objected because the hypothesis seemed so obviously true: science will be better off if we have someone check every paper and reject the ones that don’t pass muster. They called it “peer review.”

This was a massive change. From antiquity to modernity, scientists wrote letters and circulated monographs, and the main barriers stopping them from communicating their findings were the cost of paper, postage, or a printing press, or on rare occasions, the cost of a visit from the Catholic Church. Scientific journals appeared in the 1600s, but they operated more like magazines or newsletters, and their processes of picking articles ranged from “we print whatever we get” to “the editor asks his friend what he thinks” to “the whole society votes.” Sometimes journals couldn’t get enough papers to publish, so editors had to go around begging their friends to submit manuscripts, or fill the space themselves. Scientific publishing remained a hodgepodge for centuries.

(Only one of Einstein’s papers was ever peer-reviewed, by the way, and he was so surprised and upset that he published his paper in a different journal instead.)
That all changed after World War II. Governments poured funding into research, and they convened “peer reviewers” to ensure they weren’t wasting their money on foolish proposals. That funding turned into a deluge of papers, and journals that previously struggled to fill their pages now struggled to pick which articles to print. Reviewing papers before publication, which was “quite rare” until the 1960s, became much more common. Then it became universal.

Now pretty much every journal uses outside experts to vet papers, and papers that don’t please reviewers get rejected. You can still write to your friends about your findings, but hiring committees and grant agencies act as if the only science that exists is the stuff published in peer-reviewed journals. This is the grand experiment we’ve been running for six decades.

The results are in. It failed.

A WHOLE LOTTA MONEY FOR NOTHIN’

Peer review was a huge, expensive intervention. By one estimate, scientists collectively spend 15,000 years reviewing papers every year. It can take months or years for a paper to wind its way through the review system, which is a big chunk of time when people are trying to do things like cure cancer and stop climate change. And universities fork over millions for access to peer-reviewed journals, even though much of the research is taxpayer-funded, and none of that money goes to the authors or the reviewers.
Huge interventions should have huge effects. If you drop $100 million on a school system, for instance, hopefully it will be clear in the end that you made students better off. If you show up a few years later and you’re like, “hey so how did my $100 million help this school system” and everybody’s like “uhh well we’re not sure it actually did anything and also we’re all really mad at you now,” you’d be really upset and embarrassed. Similarly, if peer review improved science, that should be pretty obvious, and we should be pretty upset and embarrassed if it didn’t.

It didn’t. In all sorts of different fields, research productivity has been flat or declining for decades, and peer review doesn’t seem to have changed that trend. New ideas are failing to displace older ones. Many peer-reviewed findings don’t replicate, and most of them may be straight-up false. When you ask scientists to rate 20th century discoveries that won Nobel Prizes, they say the ones that came out before peer review are just as good or even better than the ones that came out afterward. In fact, you can’t even ask them to rate the Nobel Prize-winning discoveries from the 1990s and 2000s because pretty much nothing from that period has won a Nobel Prize.

Of course, a lot of other stuff has changed since World War II. We did a terrible job running this experiment, so it’s all confounded. All we can say from these big trends is that we have no idea whether peer review helped, it might have hurt, it cost a ton, and the current state of the scientific literature is pretty abysmal. In this biz, we call this a total flop.

POSTMORTEM

What went wrong?
Here’s a simple question: does peer review actually do the thing it’s supposed to do? Does it catch bad research and prevent it from being published?
It doesn’t. Scientists have run studies where they deliberately add errors to papers, send them out to reviewers, and simply count how many errors the reviewers catch. Reviewers are pretty awful at this. In this study reviewers caught 30% of the major flaws, in this study they caught 25%, and in this study they caught 29%. These were critical issues, like “the paper claims to be a randomized controlled trial but it isn’t” and “when you look at the graphs, it’s pretty clear there’s no effect” and “the authors draw conclusions that are totally unsupported by the data.” Reviewers mostly didn’t notice.

In fact, we’ve got knock-down, real-world data that peer review doesn’t work: fraudulent papers get published all the time. If reviewers were doing their job, we’d hear lots of stories like “Professor Cornelius von Fraud was fired today after trying to submit a fake paper to a scientific journal.” But we never hear stories like that. Instead, pretty much every story about fraud begins with the paper passing review and being published. Only later does some good Samaritan (often someone in the author’s own lab!) notice something weird and decide to investigate. That’s what happened with this paper about dishonesty that clearly has fake data (ironic), these guys who have published dozens or even hundreds of fraudulent papers, and this debacle:

[Images: embedded tweet (archived)]

Why don’t reviewers catch basic errors and blatant fraud? One reason is that they almost never look at the data behind the papers they review, which is exactly where the errors and fraud are most likely to be. In fact, most journals don’t require you to make your data public at all. You’re supposed to provide them “on request,” but most people don’t. That’s how we’ve ended up in sitcom-esque situations like ~20% of genetics papers having totally useless data because Excel autocorrected the names of genes into months and years.

(When one editor started asking authors to add their raw data after they submitted a paper to his journal, half of them declined and retracted their submissions. This suggests, in the editor’s words, “a possibility that the raw data did not exist from the beginning.”)

The invention of peer review may have even encouraged bad research. If you try to publish a paper showing that, say, watching puppy videos makes people donate more to charity, and Reviewer 2 says “I will only be impressed if this works for cat videos as well,” you are under extreme pressure to make a cat video study work. Maybe you fudge the numbers a bit, or toss out a few outliers, or test a bunch of cat videos until you find one that works and then you never mention the ones that didn’t. 🎶 Do a little fraud // get a paper published // get down tonight 🎶

PEER REVIEW, WE HARDLY TOOK YE SERIOUSLY

Here’s another way that we can test whether peer review worked: did it actually earn scientists' trust?

Scientists often say they take peer review very seriously. But people say lots of things they don’t mean, like “It’s great to e-meet you” and “I’ll never leave you, Adam.” If you look at what scientists actually do, it’s clear they don’t think peer review really matters.

First: if scientists cared a lot about peer review, when their papers got reviewed and rejected, they would listen to the feedback, do more experiments, rewrite the paper, etc. Instead, they usually just submit the same paper to another journal. This was one of the first things I learned as a young psychologist, when my undergrad advisor explained there is a “big stochastic element” in publishing (translation: “it’s random, dude”). If the first journal didn’t work out, we’d try the next one. Publishing is like winning the lottery, she told me, and the way to win is to keep stuffing the box with tickets. When very serious and successful scientists proclaim that your supposed system of scientific fact-checking is no better than chance, that’s pretty dismal.

Second: once a paper gets published, we shred the reviews. A few journals publish reviews; most don't. Nobody cares to find out what the reviewers said or how the authors edited their paper in response, which suggests that nobody thinks the reviews actually mattered in the first place.

And third: scientists take unreviewed work seriously without thinking twice. We read “preprints” and working papers and blog posts, none of which have been published in peer-reviewed journals. We use data from Pew and Gallup and the government, also unreviewed. We go to conferences where people give talks about unvetted projects, and we do not turn to each other and say, “So interesting! I can’t wait for it to be peer reviewed so I can find out if it’s true.”

Instead, scientists tacitly agree that peer review adds nothing, and they make up their minds about scientific work by looking at the methods and results. Sometimes people say the quiet part loud, like Nobel laureate Sydney Brenner:
I don’t believe in peer review because I think it’s very distorted and as I’ve said, it’s simply a regression to the mean. I think peer review is hindering science. In fact, I think it has become a completely corrupt system.

CAN WE FIX IT? NO WE CAN'T

I used to think about all the ways we could improve peer review. Reviewers should look at the data! Journals should make sure that papers aren’t fraudulent!

It’s easy to imagine how things could be better—my friend Ethan and I wrote a whole paper on it—but that doesn’t mean it’s easy to make things better. My complaints about peer review were a bit like looking at the ~35,000 Americans who die in car crashes every year and saying “people shouldn’t crash their cars so much.” Okay, but how?

Lack of effort isn’t the problem: remember that our current system requires 15,000 years of labor every year, and it still does a really crappy job. Paying peer reviewers doesn’t seem to make them any better. Neither does training them. Maybe we can fix some things on the margins, but remember that right now we’re publishing papers that use capital T’s instead of error bars, so we’ve got a long, long way to go.

What if we made peer review way stricter? That might sound great, but it would make lots of other problems with peer review way worse.
For example, you used to be able to write a scientific paper with style. Now, in order to please reviewers, you have to write it like a legal contract. Papers used to begin like, “Help! A mysterious number is persecuting me,” and now they begin like, “Humans have been said, at various times and places, to exist, and even to have several qualities, or dimensions, or things that are true about them, but of course this needs further study (Smergdorf & Blugensnout, 1978; Stikkiwikket, 2002; von Fraud et al., 2018b)”.

This blows. And as a result, nobody actually reads these papers. Some of them are like 100 pages long with another 200 pages of supplemental information, and all of it is written like it hates you and wants you to stop reading immediately. Recently, a friend asked me when I last read a paper from beginning to end; I couldn’t remember, and neither could he. “Whenever someone tells me they loved my paper,” he said, “I say thank you, even though I know they didn’t read it.” Stricter peer review would mean even more boring papers, which means even fewer people would read them.

Making peer review harsher would also exacerbate the worst problem of all: just knowing that your ideas won’t count for anything unless peer reviewers like them makes you worse at thinking. It’s like being a teenager again: before you do anything, you ask yourself, “BUT WILL PEOPLE THINK I’M COOL?” When getting and keeping a job depends on producing popular ideas, you can get very good at thought-policing yourself into never entertaining anything weird or unpopular at all. That means we end up with fewer revolutionary ideas, and unless you think everything’s pretty much perfect right now, we need revolutionary ideas real bad.

On the off chance you do figure out a way to improve peer review without also making it worse, you can try convincing the nearly 30,000 scientific journals in existence to apply your magical method to the ~4.7 million articles they publish every year. Good luck!

PEER REVIEW IS WORSE THAN NOTHING; OR, WHY IT AIN’T ENOUGH TO SNIFF THE BEEF

Peer review doesn’t work and there’s probably no way to fix it. But a little bit of vetting is better than none at all, right?

I say: no way.

Imagine you discover that the Food and Drug Administration’s method of “inspecting” beef is just sending some guy (“Gary”) around to sniff the beef and say whether it smells okay or not, and the beef that passes the sniff test gets a sticker that says “INSPECTED BY THE FDA.” You’d be pretty angry. Yes, Gary may find a few batches of bad beef, but obviously he’s going to miss most of the dangerous meat. This extremely bad system is worse than nothing because it fools people into thinking they’re safe when they’re not.
That’s what our current system of peer review does, and it’s dangerous. That debunked theory about vaccines causing autism comes from a peer-reviewed paper in one of the most prestigious journals in the world, and it stayed there for twelve years before it was retracted. How many kids haven’t gotten their shots because one rotten paper made it through peer review and got stamped with the scientific seal of approval?

If you want to sell a bottle of vitamin C pills in America, you have to include a disclaimer that says none of the claims on the bottle have been evaluated by the Food and Drug Administration. Maybe journals should stamp a similar statement on every paper: “NOBODY HAS REALLY CHECKED WHETHER THIS PAPER IS TRUE OR NOT. IT MIGHT BE MADE UP, FOR ALL WE KNOW.” That would at least give people the appropriate level of confidence.

SCIENCE MUST BE FREE

Why did peer review seem so reasonable in the first place?
I think we had the wrong model of how science works. We treated science like it’s a weak-link problem where progress depends on the quality of our worst work. If you believe in weak-link science, you think it’s very important to stamp out untrue ideas—ideally, prevent them from being published in the first place. You don’t mind if you whack a few good ideas in the process, because it’s so important to bury the bad stuff.

But science is a strong-link problem: progress depends on the quality of our best work. Better ideas don’t always triumph immediately, but they do triumph eventually, because they’re more useful. You can’t land on the moon using Aristotle’s physics, you can’t turn mud into frogs using spontaneous generation, and you can’t build bombs out of phlogiston. Newton’s laws of physics stuck around; his recipe for the Philosopher’s Stone didn’t. We didn’t need a scientific establishment to smother the wrong ideas. We needed it to let new ideas challenge old ones, and time did the rest.

If you’ve got weak-link worries, I totally get it. If we let people say whatever they want, they will sometimes say untrue things, and that sounds scary. But we don’t actually prevent people from saying untrue things right now; we just pretend to. In fact, right now we occasionally bless untrue things with big stickers that say “INSPECTED BY A FANCY JOURNAL,” and those stickers are very hard to get off. That’s way scarier.

Weak-link thinking makes scientific censorship seem reasonable, but all censorship does is make old ideas harder to defeat. Remember that it used to be obviously true that the Earth is the center of the universe, and if scientific journals had existed in Copernicus’ time, geocentrist reviewers would have rejected his paper and patted themselves on the back for preventing the spread of misinformation. Eugenics used to be hot stuff in science—do you think a bunch of racists would give the green light to a paper showing that Black people are just as smart as white people? Or any paper at all by a Black author? (And if you think that’s ancient history: this dynamic is still playing out today.) We still don’t understand basic truths about the universe, and many ideas we believe today will one day be debunked. Peer review, like every form of censorship, merely slows down truth.

HOORAY WE FAILED

Nobody was in charge of our peer review experiment, which means nobody has the responsibility of saying when it’s over. Seeing no one else, I guess I’ll do it:
We’re done, everybody! Champagne all around! Great work, and congratulations. We tried peer review and it didn’t work.

Honestly, I’m so relieved. That system sucked! Waiting months just to hear that an editor didn’t think your paper deserved to be reviewed? Reading long walls of text from reviewers who for some reason thought your paper was the source of all evil in the universe? Spending a whole day emailing a journal begging them to let you use the word “years” instead of always abbreviating it to “y” for no reason (this literally happened to me)? We never have to do any of that ever again.

I know we all might be a little disappointed we wasted so much time, but there's no shame in a failed experiment. Yes, we should have taken peer review for a test run before we made it universal. But that’s okay—it seemed like a good idea at the time, and now we know it wasn’t. That’s science! It will always be important for scientists to comment on each other’s ideas, of course. It’s just this particular way of doing it that didn’t work.

What should we do now? Well, last month I published a paper, by which I mean I uploaded a PDF to the internet. I wrote it in normal language so anyone could understand it. I held nothing back—I even admitted that I forgot why I ran one of the studies. I put jokes in it because nobody could tell me not to. I uploaded all the materials, data, and code where everybody could see them. I figured I’d look like a total dummy and nobody would pay any attention, but at least I was having fun and doing what I thought was right.
Then, before I even told anyone about the paper, thousands of people found it, commented on it, and retweeted it.

Total strangers emailed me thoughtful reviews. Tenured professors sent me ideas. NPR asked for an interview. The paper now has more views than the last peer-reviewed paper I published, which was in the prestigious Proceedings of the National Academy of Sciences. And I have a hunch far more people read this new paper all the way to the end, because the final few paragraphs got a lot of comments in particular. So I dunno, I guess that seems like a good way of doing it?

I don’t know what the future of science looks like. Maybe we’ll make interactive papers in the metaverse or we’ll download datasets into our heads or whisper our findings to each other on the dance floor of techno-raves. Whatever it is, it’ll be a lot better than what we’ve been doing for the past sixty years. And to get there, all we have to do is what we do best: experiment.

Source (Archive)
 
On the subject of the Sokal Affair: the editors of the journal did not seek any peer review whatsoever.

Whether Sokal's article would have been declared substandard by a physicist peer reviewer is debatable (it is not, after all, a scholarly contribution to the discipline of physics) but not finally relevant to us--at least not according to the criteria we employed.

Modern academia has been reduced to a self-serving, ass-kissing circus, if anything. Saying the Right Thing will get you published quicker than simply stating facts or observations. The reason social sciences are in such disrepair has less to do with authors lacking anything of value to contribute (this is a financial way of thinking which lacks the vision of permutative discovery) and more to do with the undercurrent of political soapboxing that has been let in through the door for the past three or so decades.

The authors of this post would like to thank I. P. Loudley for his invaluable contribution to the seat-warming sciences of academic departments, B. O. Prablim for his dedicated research into the field of non-discreet status celebration, and Kan Eet Fayil for his steadfast dedication to sitting on as many university oversight committees as physically possible.
 
I used to work with a respected climate scientist. He submitted a paper that almost perfectly modelled the climate data of the last 50 years using solar and orbital effects. That not only didn't get anywhere near a journal but basically got him blacklisted from every journal.

It wasn't because there was any "bad" science in it. It all made sense and all the analysis was right there; it just had conclusions that were seen as heresy by the powers that be. Peer review protects "accepted" ideas and prevents anyone from even challenging them.
 
Peer review is racist anyways.
Funny you should mention that.

TL;DR: Race scholar complains about racism in the peer review process, gets the editor who reviewed his paper fired.

Academia has always been very status-conscious. I've had papers I wrote get rejected because I didn't kiss the ass of one of the more prominent members of the field enough. He was one of my reviewers. Adding in a few references and genuflections to him got the paper published.
 
I used to work with a respected climate scientist. He submitted a paper that almost perfectly modelled the climate data of the last 50 years using solar and orbital effects. That not only didn't get anywhere near a journal but basically got him blacklisted from every journal.

It wasn't because there was any "bad" science in it. It all made sense and all the analysis was right there; it just had conclusions that were seen as heresy by the powers that be. Peer review protects "accepted" ideas and prevents anyone from even challenging them.
What were the conclusions?
 
Anyone reading the actual studies into controversial subjects should instantly realize why studies can be bullshit.
And conferences now often have activists mansplaining to the attending researchers how harmful the wrong conclusions can be.

As someone who used to be in academia I don't trust academia anymore.
 
I'm a bit torn. On one hand, peer reviews are (were) a way to keep the dross from being published. On the other hand, people have corrupted the system for their own ends, as they have with regular academia: the results must match the hypothesis (if a company or particular sponsor is funding the project), or the results will be widely disregarded by the community at large and decried as false information.

Hell, the peer review system didn't stop a guy from submitting excerpts of Mein Kampf (but with a few phrasings cleverly edited to appeal to the journal's sociology lean).
 
What were the conclusions?

Basically, it was that most of the changes to climate in the last 50 years could be attributed to a combination of the earth's orbital precession (summer in the northern hemisphere coming closer to the point where the earth is at its closest to the sun) and the more active parts of the 22-year solar activity cycle happening to coincide. There was some effect due to greenhouse gases, but nowhere near as much as generally thought.

Like I say, none of it was bad data or bad analysis, and none of it was done with ideology. I just think the guy was too naive about what trying to report those sorts of findings, questioning climate dogma, would do to his career these days. Luckily he was like 5 years from retirement, so he just took early retirement and left all the bullshit behind.
 
Peer review is racist anyways.
It's so racist that it's the reason a traditional Polynesian drink wasn't banned in the US while Europe banned it left and right. Turned out all the liver issues being blamed on kava as a whole were actually due to bad products that used parts of the root no one in the Pacific ever used, using an actually toxic wild variant of the domesticated plant, and people being dumb at raves.
 
There's a reason it's often nicknamed "pal review". Getting published in a journal is a lot easier when you're part of the clique that gets picked to review all the papers.

I uploaded all the materials, data, and code where everybody could see them.

This is the most important part. So many papers have all of their data and materials hidden away. They present a bunch of graphs and a proclamation that "this is truth", but if you ask to examine the data they used, or the code of any models used to analyse the data, you get told to fuck off. This is why the replication crisis is tearing the foundations out of entire fields of research: people refuse to share raw data. They might, if you force the issue, share a "pre-prepared" dataset that has been manipulated to hell and back, to make it conform to a priori assumptions and give the results they want. The entirety of climate science is riddled with this shit. Medical research is riddled with this shit. Anything related to psychology might as well be drug-addled, voodoo mumble rap, committed to paper, because its results are utterly divorced from reality and impossible to reproduce, or even to understand.
 
Without peer review in science-based journals, you’ll have nothing but yes-men who will agree with and control fake narratives that have nothing to do with scientific output. The CDC was the perfect example of how money and politics can override objective conclusions and compromise public health and discourse.
 