In their recent work, IJzerman et al. argue that the social and behavioural sciences are not mature enough to offer sound solutions for policy makers. They introduce a NASA-inspired classification of the strength of evidence and claim that behavioural science sits at the very bottom of the reliability ladder. Here, we defend the view that, despite suffering from many problems, behavioural science has a lot to offer to both the public and policy makers. We argue that, unlike rockets, which can but do not have to be launched, some behavioural interventions in a crisis have to be implemented immediately, and it is better to base these interventions on scientific evidence than to rely on lay intuitions.
Keywords: Social Science, Behavioural Science, Evidence-Based Interventions, Evidence Readiness Levels, Replicability.
It’s going to happen anyway
IJzerman et al. correctly noted that the evidence provided by behavioural science is nowhere close to the robustness of rocket science. We agree, but we find the comparison of behavioural science to rocket science misleading. We highlight critical differences between the two sciences, showing why the analogy is doomed to fail. First, launching a rocket requires absolute precision, whereas behavioural interventions, such as getting people to wear face masks, often require only a small increase in prevalence to make a huge impact on mortality. Second, launching a rocket is a zero-one game: the rocket either flies or it does not. Behavioural interventions have an entire spectrum of outcomes, from completely harmful to completely beneficial, with all intermediate states. Hence, suboptimal solutions can work in behavioural science (e.g., they will affect some people), but would be disastrous in rocket science. Finally, when a crisis happens, we cannot postpone informing the public about the threat or about how best to mitigate it. Consider a much-needed COVID-19 vaccine. It must be developed with the highest rigor, work effectively for the majority of people, and have no large negative consequences. Any vaccine not meeting these criteria should not be accepted. This is the standard of evidence that IJzerman and colleagues set as the target for science. It is achievable because, with no vaccine available, people will not inject random products they intuitively think will work, which gives scientists the much-needed time to science their way to the vaccine. However, once a vaccine is developed, a policy must be implemented to facilitate its uptake. Vaccination can be made mandatory, voluntary, or the default unless people opt out, but some method must be implemented. And this selection cannot be delayed until the best method has been found, because people will discuss the vaccination problem anyway. This is what a crisis intervention is about: the requirement for immediate action.
This is where IJzerman and colleagues’ expectations fail to face the real world; this is where expecting (sometimes unreachable) perfect evidence is too stringent.
Interventions may not work, but they are unlikely to cause harm
Behavioural scientific evidence is weak: in many cases we cannot even agree on the existence of some phenomena[3–5]; observed effects are limited to a narrow set of stimuli and often do not generalize; and reported effect sizes of experimental effects are inflated. All of that is an obstacle to building theories that explain human behaviour. IJzerman et al. seem to imply that these problems preclude a direct application of psychological findings in practice. Although it indeed seems bad to use weakly supported evidence to inform large-scale interventions, the problem with such thinking is that it imposes the wrong reference point. When evaluating a crisis intervention, we should not ask whether the intervention worked, but how well it worked compared to an alternative intervention. In other words, if an intervention failed, there is no opportunity cost if the alternative intervention was equally (in)effective. Most of the failed replications in behavioural science show null effects, and almost none show the opposite pattern to what was originally reported. To illustrate, there is a 25:1 ratio of significant replications in the same vs. the opposite direction in a corpus of 194 replications of psychological research. So, for example, a failed intervention aimed at reducing the spread of fake news is highly unlikely to increase its spread. This is completely different from a space shuttle that does not fly. Finally, even in the unlikely case that a science-informed intervention backfires, intuition-based interventions are at least as likely to cause harm. For example, although intuitively compelling, fighting fake news with fact-checking can backfire, because fake news not yet marked as fake can gain credibility. Hence, it is wise to implement crisis interventions even when they are based on insufficient evidence.
What is crucial to our reasoning here is that people will direct ‘psychological’ actions and interventions against a crisis even if they are told that psychology is not crisis-ready. This differentiates behavioural science from rocket science, but also from medicine or vaccinology. Undoubtedly, it is disastrous when an ineffective or harmful medicine or vaccine is released. But while policy makers cannot recommend medical measures based on their intuition, they likely do base social policies on it. If at least some science-informed interventions are accurate, implementing them will make a small positive change, which, at a societal level, may translate into huge gains. If the conclusions are wrong, the outcome is unlikely to be worse than that of the alternative interventions.
How much better can (behavioural) science be?
The replicability crisis, widely recognized in psychology, is also present in other scientific fields. For instance, only six out of 53 milestone studies on cancer and six out of 18 experimental studies from top-tier economics journals could be replicated. The replicability problem extends to other fields too, e.g., to physics, medicine or biology. The reasons for this are honest and dishonest errors and the external incentives to publish surprising and novel results. Science is a process, and scientists are mostly wrong most of the time. Counterintuitively, this is a good thing, because having been wrong in the past suggests that science has progressed toward the truth. Thus, while fully appreciating that science is likely wrong in answering many questions, we cannot refrain from engaging in the public discourse until we reach the ‘final’ confidence in applied interventions. Blank spots in scientific knowledge are likely to be filled with questionable pseudoscience and conspiracy theories. For example, if we fail to offer evidence-based speculation on the origins of COVID-19, people can generate irrational explanations and blame, e.g., 5G technology.
We do appreciate the proposal to improve the scientific pursuit of sound knowledge by using Evidence Readiness Levels, which provide a benchmark of how far scientists are from understanding human behaviour. What strikes us most as an accurate diagnosis of the field is that behavioural science often fails to reach even Level 1, “basic principles observed and reported”. We can and should do better by simply appreciating basic research and replications, building solid foundations for the subsequent development of theories and their applications. We should also seek cross-cultural validation of behavioural findings beyond western, educated, industrialized and democratic samples, build international research consortia, such as the Psychological Science Accelerator, and invite diverse teams to develop their own ways to test new hypotheses. All these things are happening right now, making us optimistic about the future of behavioural science.
To sum up, the social and behavioural sciences are far from flawless, but the knowledge we currently have is better than no knowledge at all. Caution is, indeed, much needed, and jumping straight to conclusions from correlational studies or single experiments is, indeed, worth damning. However, we believe that we are on the right path by accelerating progress on open research practices (e.g., pre-registrations, publicly available datasets, international collaboration). As scientists, we have a moral obligation not only to seek knowledge, but also to be as useful to people as possible. And neither silence nor ignoring evidence until it reaches the highest degree of certainty seems to be a very useful contribution.
Author Note: Michał Białek and Marta Kowal contributed equally and share first authorship. Order was determined upon mutual agreement. The preparation of this work was funded by the National Science Centre, Poland (NCN) under Grant SONATA 2017/26/D/HS6/01159 assigned to Michał Białek.
Conflicts of Interest Statement: No conflict to declare.
Author Contributions Statement: MB conceptualized the project, MK drafted, and MB and AG-B critically revised the manuscript. All authors approved the submitted version.
Acknowledgments: Thank you to Hans IJzerman, Neil Lewis, Jay van Bavel and Gordon Pennycook for helpful comments on the draft of this manuscript. We also thank members of Journal Club Zięby Darwina at the University of Wroclaw (especially the moderator – Michał Misiak) for fruitful discussion.
Data availability statement: Data sharing not applicable to this article as no datasets were generated or analysed during the current study.
- IJzerman H, Lewis NA, Weinstein N, DeBruine LM, Ritchie SJ, Vazire S, Forscher P, Morey RD, Ivory J, Anvari F, et al.: Is Social and Behavioural Science Evidence Ready for Application and Dissemination? 2020, doi:10.31234/osf.io/whds4.
- Bavel JJV, Baicker K, Boggio PS, Capraro V, Cichocka A, Cikara M, Crockett MJ, Crum AJ, Douglas KM, Druckman JN, et al.: Using social and behavioural science to support COVID-19 pandemic response. Nat Hum Behav 2020, 4:460–471.
- Camerer CF, Dreber A, Forsell E, Ho TH, Huber J, Johannesson M, Kirchler M, Almenberg J, Altmejd A, Chan T, et al.: Evaluating replicability of laboratory experiments in economics. Science 2016, 351:1433–1436.
- Ioannidis JPA: Why Most Published Research Findings Are False. PLoS Med 2005, 2:e124.
- Open Science Collaboration: Estimating the reproducibility of psychological science. Science 2015, 349:aac4716.
- Yarkoni T: The Generalizability Crisis. 2019, doi:10.31234/osf.io/jqw35.
- Ioannidis JPA: Why Most Discovered True Associations Are Inflated. Epidemiology 2008, 19:640–648.
- Oberauer K, Lewandowsky S: Addressing the theory crisis in psychology. Psychon Bull Rev 2019, 26:1596–1618.
- Reinero DA, Wills JA, Brady WJ, Mende-Siedlecki P, Crawford JT, Van Bavel JJ: Is the political slant of psychology research related to scientific replicability? 2019.
- Pennycook G, Bear A, Collins ET, Rand DG: The Implied Truth Effect: Attaching Warnings to a Subset of Fake News Headlines Increases Perceived Accuracy of Headlines Without Warnings. Manage Sci 2020, doi:10.1287/mnsc.2019.3478.
- Begley CG, Ellis LM: Raise standards for preclinical cancer research: C. Glenn Begley and Lee M. Ellis propose how methods, publications and incentives must change if patients are to benefit. Nature 2012, 483:531–534.
- Schroeder MJ: Crisis in science: In search for new theoretical foundations. Prog Biophys Mol Biol 2013, 113:25–32.
- Stupple A, Singerman D, Celi LA: The reproducibility crisis in the age of digital medicine. npj Digit Med 2019, 2:1–3.
- Chan KMA: Value and advocacy in conservation biology: Crisis discipline or discipline in crisis? Conserv Biol 2008, 22:1–3.
- Shrout PE, Rodgers JL: Psychology, Science, and Knowledge Construction: Broadening Perspectives from the Replication Crisis. Annu Rev Psychol 2018, 69:487–510.
- Firestein S: Failure: Why Science is so Successful. 2015.
- Kofta M, Soral W, Bilewicz M: What Breeds Conspiracy Antisemitism? The Role of Political Uncontrollability and Uncertainty in the Belief in Jewish Conspiracy. J Pers Soc Psychol 2020, doi:10.1037/pspa0000183.
- Destiny T: Conspiracy theories about 5G networks have skyrocketed since COVID-19. Conversat 2020.
- Zwaan RA, Etz A, Lucas RE, Donnellan MB: Making Replication Mainstream. Behav Brain Sci 2017, 41:1–50.
- Henrich J, Heine SJ, Norenzayan A: Beyond WEIRD: Towards a broad-based behavioral science. Behav Brain Sci 2010, 33:111–135.
- Moshontz H, Campbell L, Ebersole CR, IJzerman H, Urry HL, Forscher PS, Grahe JE, McCarthy RJ, Musser ED, Antfolk J, et al.: The Psychological Science Accelerator: Advancing Psychology Through a Distributed Collaborative Network. Adv Methods Pract Psychol Sci 2018, 1:501–515.
- Landy JF, Jia ML, Ding IL, Viganola D, Tierney W, Dreber A, Johannesson M, Pfeiffer T, Ebersole CR, Gronau QF, et al.: Crowdsourcing Hypothesis Tests: Making Transparent How Design Choices Shape Research Results. Psychol Bull 2020, doi:10.1037/bul0000220.
- Vazire S: Implications of the Credibility Revolution for Productivity, Creativity, and Progress. Perspect Psychol Sci 2018, 13:411–417.