Robust Replication Science: One in the hand or two (dozen) in the bush?
Kai Ruggeri & Sander van der Linden | The choice to go wide or deep in replications is no small (effect) matter. Using our recent work, we explain our position and put a marker down for what we think should come next.
We had two motivations for attempting to replicate the original 1979 paper on Prospect Theory by Daniel Kahneman and Amos Tversky. The first was recognizing that our own work relied heavily on assumptions from the original study, its subsequent revisions, and many of the applications from that work that we all took as given. The second was to verify that a study of such magnitude held up to current standards in terms of sample, power, and replicability.
There was no deliberation over the decision to focus on a single study rather than to conduct multiple experiments (from related topics or otherwise), so we do not claim to have chosen it over an a la carte approach. The only content discussions involved whether using all of the 20 items from 1979 was necessary (we decided on 17), whether to include any new items that might fill gaps (we decided against that on grounds of integrity and fidelity), and whether any additional measures would help with generalization or links to other work (we decided only on standard socioeconomic indicators). Yet the choice in replication approaches between singular focus and a la carte has now become a topic of debate, and our experience in the process is therefore highly relevant.
It is a great sign that the debate is now about how to best conduct replications rather than whether replicability is worth testing. However, we share the concerns from many corners of these debates: low-quality or low-investment replications, flagrant reactions from the original authors* or the replication team, opaque processes in journal policy and peer review for attempts at replication, and so on. Successful replications should not become the new failure-to-reject-the-null manuscripts, and our data show that robust re-testing is critical. Replication studies should offer value as part of fostering a cumulative behavioral science, not just repetition for the sake of repetition or as a “gotcha” approach. They also must be clear on what value there is in the possible outcomes such as successful, failed, and partial replications.
Contrary to the usual formula, we did not divide our attention over fundamentally different findings in psychology, each with its own set of theories, experimental stimuli, and predictions. Those a la carte approaches offer value in bringing attention to replicability and sampling threshold standards, though they have met understandable pushback. Most critically, they highlight the value of generating multiple entirely independent pools of data, which illustrate the variability in results expected for a single intervention.
In our case, each individual measure from the same instrument, same discipline, and same template required a major resource commitment – multiplied by the number of researchers involved such that it at least reasonably reflected the investment by the authors of the original study. All labor and resources were pooled into one single replication done the right way: high-powered, cross-cultural, direct, and true to the original study. We felt it deserved such focus, particularly as the theory has been widely generalized and not assumed to be fundamentally contextual. Although there was some attenuation, the reproducibility of the main tenets of Prospect Theory is astounding, perhaps even a reason for celebration.
Tests of replication are important regardless of the outcome. Had some subset of items failed to produce results similar to the original, we would have focused on where the theory may have had weaknesses. We constantly hear about the unreliability of our science, but not in this case: our large-scale replication points to areas where our science is highly robust, replicable, and stable over time and place. We are glad Nature Human Behaviour is leading the way in publishing replications – failed and successful. Both are valuable to our science and newsworthy.
*We want to state clearly that we did not receive any such negativity in our case. In fact, we received only positive (indirect) feedback from Prof Kahneman and a very encouraging response from Prof Hugo Sonnenschein (editor of Econometrica in 1979).