Is there a crisis of reproducibility in science?

A recent article in the Proceedings of the National Academy of Sciences by Redish et al. argues that ‘reproducibility failures are essential to scientific inquiry.’ The authors remind readers that progress in many fields often proceeds haltingly, with successes, retrenchments, reconsiderations, and revisions. As examples, Redish et al. summarize the history of the Four Color Theorem, the claim that Fourier series can characterize any function, and the conditions under which neural networks can carry out certain computations. These reminders from mathematics and computer science that progress and regress are essential parts of the trajectory of scientific discovery are useful, but I am not convinced by the argument that ‘many of the current concerns about reproducibility overlook the dynamic, iterative nature of the process of discovery.’

According to a 2016 survey in Nature, some 90% of scientists surveyed agreed that ‘there is a reproducibility crisis.’ Sixty to eighty percent of scientists in fields from chemistry to earth and environmental sciences reported failing to reproduce someone else’s experiment, and 40-60% reported failing to reproduce their own experiments. Respondents expressed wide-ranging opinions when asked ‘how much published work in your field is reproducible?’ Curiously, physicists and chemists were both the most confident in the literature and the most likely to report having failed to reproduce a study. Perhaps chemists and physicists are more likely to attempt to reproduce published results than scientists in other fields.

Redish et al. rightly point out that no single study should be viewed as definitive, and that discovering robust findings takes time. But I think the authors miss an opportunity to talk about which specific practices accelerate the process of discovery and which retard it. Open sharing of results (both positive and negative), data, experimental procedures, analysis code, and materials accelerates progress. In the fields I am most familiar with (psychology, vision science, and neuroscience), failures to replicate are almost impossible to publish. If the failures aren’t published, the reasons for them will be difficult to discern. Sharing of open data, materials, analysis code, and procedures is slowly growing in popularity, but is still not commonplace. Without ready access to these elements, it is almost certain that attempts to replicate will differ from the original study in ways that may or may not be readily apparent.

Redish et al. state:

‘A failure to reproduce is only the first step in scientific inquiry. In many ways, how science responds to these failures is what determines whether it succeeds.’

I agree. And the response of many open science advocates is to encourage our colleagues to adopt more transparent practices in reporting and sharing results, data, analyses, and materials, and to build tools for making this easy. If we want to build ‘a genuinely cumulative science’, to borrow a phrase from Walter Mischel, we need to attend to improving the processes of self-correction that Redish et al. rightly point out are endemic to science and essential to its power. Right now, it’s far too difficult for researchers to learn about failures to replicate or generalize and far too difficult to explore the reasons why. So perhaps the crisis of reproducibility is more usefully thought of as a crisis of transparency, and if this is so, there are ready solutions at hand.