I have had ESP on my mind lately (see this post and this post). A potentially controversial paper on ESP authored by Daryl Bem and colleagues, which is forthcoming in the Journal of Personality and Social Psychology, got me wondering about an important question about the peer-review process:
In spite of the lofty standards of top academic journals, could a paper be published that the editor, reviewers, and authors all believed was wrong?
First, for those of you who are not familiar with it, here is the essence of the peer review process:
Image used with permission from www.UnderstandingScience.org and the University of California Museum of Paleontology.
My question, whether a traditional peer-review process could fail to reject a paper that all parties believed was false, was also prompted by a colleague’s email:
Let’s say some nobody had submitted this paper and the editor hears that it took 8 years to get these effects. At that point I would ask: 8 years? why?
– Does it mean that there is a desk drawer full of failed studies that weren’t reported? Show me that data
– Did the person make some bad choices on how data was dropped? I would ask for a report of all data that was collected
– Did he just keep running these experiments enough times until you got a subset that worked (and then stopped running studies)?
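To put a rough number on my colleague's desk-drawer worry: if there is truly no effect and each study is tested at the conventional 0.05 level, the chance that at least one of k independent studies comes out "significant" is 1 - (1 - 0.05)^k. Here is a minimal Python sketch of that arithmetic. It assumes the studies are independent and the null is true, and the function name is my own invention for illustration; it says nothing about how any actual set of studies was run.

```python
def chance_of_false_positive(num_studies: int, alpha: float = 0.05) -> float:
    """Probability that at least one of `num_studies` independent tests
    of a true null hypothesis is significant at level `alpha`."""
    if num_studies < 0:
        raise ValueError("num_studies must be non-negative")
    # Each study avoids a false positive with probability (1 - alpha),
    # so all of them avoid one with probability (1 - alpha) ** num_studies.
    return 1.0 - (1.0 - alpha) ** num_studies


if __name__ == "__main__":
    for k in (1, 5, 14, 50):
        print(f"{k:>2} studies -> {chance_of_false_positive(k):.3f}")
```

Under these assumptions, 14 studies are already enough to make a spurious "hit" more likely than not, which is why asking about the unreported studies matters.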
Putting aside the ethical considerations for a moment, suppose that Bem and colleagues wanted to highlight the problems with the reliance on null hypothesis testing in the publication process (some of which are detailed here in a NYT article), and to do so, they submitted a paper that they knew was false but that met the common standards of the peer-review process. Would the paper still be published?
(NOTE: I am using this as a thought experiment and not suggesting that this is the case.)
Assume that the paper is competently executed with regard to theoretical development and design of the studies. How could the paper be rejected?
– The reviewers and editors could believe that the effects are simply due to chance, and they could ask for the number of “failed” studies. (Or, they could ask the authors to conduct a Bayesian analysis with a strong prior in favor of the null hypothesis.) Problem: This is a non-standard practice and imposes an uncommon set of constraints that other submissions are not subject to.
– The reviewers and editors could say that the effect is too small to be of practical significance. Problem: The literature is filled with studies that report small effects but that authors, reviewers, and editors nonetheless believe inform important theoretical questions about human behavior.
– The reviewers and editors could say that they don’t believe the results. Problem: Cutting-edge scientific developments often go against intuition and previous research.
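The Bayesian request mentioned in the first option can be made concrete with Bayes' rule in odds form: posterior odds = prior odds × Bayes factor. The sketch below is only an illustration. The prior probabilities and the Bayes factor are numbers I made up, not values from the paper or from any reanalysis, and the function name is hypothetical.

```python
def posterior_probability(prior: float, bayes_factor: float) -> float:
    """Posterior probability that the effect is real.

    prior        -- prior probability that the effect is real (0 < prior < 1)
    bayes_factor -- how much more likely the data are if the effect is real
                    than if the null is true
    """
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must be strictly between 0 and 1")
    prior_odds = prior / (1.0 - prior)
    posterior_odds = prior_odds * bayes_factor
    return posterior_odds / (1.0 + posterior_odds)


if __name__ == "__main__":
    # Made-up numbers: an open-minded reader versus a skeptical one,
    # both looking at the same hypothetical evidence (Bayes factor of 20).
    for prior in (0.5, 0.001):
        print(f"prior {prior} -> posterior {posterior_probability(prior, 20.0):.4f}")
```

The point of the exercise is that the same evidence leaves a strongly skeptical reader still doubtful, which is exactly the kind of extra hurdle that other submissions do not face.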
Of course, publication is not the final step in the development of new scientific knowledge. If a phenomenon is not meaningful or robust, subsequent research will fail to replicate it or will not cite the original research (and citations are an important measure of influence). Nonetheless, failures to replicate are rarely published and the citation process takes years to play out, so publishing false papers wastes a lot of people’s time.