...In that Empire, the Art of Cartography attained such Perfection that the map of a single Province occupied the entirety of a City, and the map of the Empire, the entirety of a Province. In time, those Unconscionable Maps no longer satisfied, and the Cartographers Guilds struck a Map of the Empire whose size was that of the Empire, and which coincided point for point with it. The following Generations, who were not so fond of the Study of Cartography as their Forebears had been, saw that that vast Map was Useless, and not without some Pitilessness was it, that they delivered it up to the Inclemencies of Sun and Winters. In the Deserts of the West, still today, there are Tattered Ruins of that Map, inhabited by Animals and Beggars; in all the Land there is no other Relic of the Disciplines of Geography.
J. A. Suárez Miranda, Viajes de varones prudentes, Libro IV, Cap. XLV, Lérida, 1658 [1]
In our recent paper we modeled the role of models in science. Working on this slightly mad topic forced us to articulate for ourselves what the measure of a useful model really is.
Models are different things to different people. A model can be a system of equations describing mechanistic processes, a computer simulation, a summary of statistical relationships, a verbal description, or even a cartoon. Depending on the norms of a field and the scientific question at hand, we may prefer one type of model over another. But the purpose of a model is always to clarify our understanding of what we are studying. That sounds straightforward, but it’s not, because clarity of understanding can only be assessed in hindsight. A model must help us formulate testable hypotheses in a way that moves a field forward, perhaps by pointing us to a new idea that finds empirical support, by allowing us to throw out an old idea, or by making a more precise quantitative prediction.
The Natural Selection of Bad Science [2] by Paul Smaldino and Richard McElreath has been influential in shaping the discussion about what causes bad scientific practice. Part of that paper is an agent-based model, which illustrates how an incentive structure that rewards novel publications can lead to the cultural evolution of “bad science”, i.e., a situation where many novel but false results are published, and scientists spend little effort figuring out what’s actually true. The idea of inevitable decline in the culture of science strikes a chord with many researchers, especially in fields grappling with a replication crisis. In their model, Smaldino and McElreath allow researchers to expend effort (or not) when testing hypotheses, but the hypotheses themselves are chosen at random. This was striking to us, because much of the effort we expend in our own research goes into constructing mathematical models to generate the hypotheses we test. If effort can be expended to improve hypothesis selection, we thought, that might change the dynamics of cultural evolution. But it wasn’t clear exactly how this would play out.
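A back-of-the-envelope calculation gives some intuition for why hypothesis selection could matter. The sketch below is not the agent-based model from either paper. It only applies the standard positive predictive value formula, and the function name and every number in it (base rates, power, false positive rate) are illustrative assumptions.

```python
def positive_predictive_value(base_rate: float, power: float,
                              false_positive_rate: float) -> float:
    """Fraction of positive results that come from true hypotheses.

    base_rate: probability that a tested hypothesis is true.
    power: probability that a true hypothesis yields a positive result.
    false_positive_rate: probability that a false hypothesis does.
    """
    true_positives = power * base_rate
    false_positives = false_positive_rate * (1.0 - base_rate)
    return true_positives / (true_positives + false_positives)


# Hypothetical values, chosen only for illustration.
POWER = 0.8
FALSE_POSITIVE_RATE = 0.05

# Assumed base rate when hypotheses are picked at random.
ppv_random = positive_predictive_value(0.1, POWER, FALSE_POSITIVE_RATE)

# Assumed base rate when a model helps pick more plausible hypotheses.
ppv_model_guided = positive_predictive_value(0.5, POWER, FALSE_POSITIVE_RATE)

print(f"random hypotheses:       {ppv_random:.2f}")        # 0.64
print(f"model-guided hypotheses: {ppv_model_guided:.2f}")  # 0.94
```

With the testing procedure held fixed, raising the share of true hypotheses among those tested raises the share of positive results that are true. Whether that is enough to change the evolutionary dynamics is a separate question, which this calculation does not address.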
Bad models aren’t wrong, they’re tautological.
In our recent article [3] we show that the mere ability to expend effort at selecting better hypotheses can prevent the cultural evolution of bad science. We predict a tipping point beyond which high effort and good scientific practice can be sustained by using models for hypothesis selection. This result was a surprise, because the existence of such a tipping point was not evident from the construction of our model. And so our analysis of cultural evolution instantiates the conclusion that we draw from it: models are beneficial when they add qualitatively to our understanding, by generating hypotheses.
Of course our model of scientific culture could be “wrong” in the sense that its assumptions may not usefully capture reality. While a wrong model would be a failure in hindsight, that would not necessarily make it “bad”. A bad model is one that is merely tautological. A tautological model may have value as a tool for communication, but it cannot help us select better hypotheses, and so it plays no role in the natural selection of good science.
All models are wrong, but some are useful.
A model that is insufficiently plastic is likely to be useless, because its predictions are evident from its construction. This is an argument against overly simplistic “toy” models (although many seemingly simple models are not in the least tautological). A model that is made too complicated in pursuit of realism can be equally problematic, as Borges illustrates [1]. Modelers often quote George Box’s assertion that “all models are wrong, but some are useful”. All simplifications are inexact (i.e., wrong), and models are useful only if they are simplifications. And so developing a useful model presents a kind of Goldilocks problem: it must be neither so simple nor so complicated that it becomes contentless. The Goldilocks zone will shift over time as a field’s methodology and knowledge advance. The measure of a model is its ability to inform us which hypotheses are worth testing, compared to what we knew without the model.
1. Jorge Luis Borges. On Exactitude in Science. A Universal History of Infamy (1946). Translation from Collected Fictions (New York: Viking Penguin, 1998, p. 325) by Andrew Hurley.
2. Smaldino, P. E., and McElreath, R. 2016. The natural selection of bad science. Royal Society Open Science.
3. Stewart, A. J. and Plotkin, J. B. 2021. The natural selection of good science. Nature Human Behaviour.
Image: Drawing of an elephant in the style of a child. An inexact representation, still able to convey plenty of important information.
Image credit: Sheila Singhal.
Acknowledgment: Dan Graur, who inspired the image choice.