Hunters and busybodies: Quantifying ancient archetypes of curiosity

A behind-the-paper look at our manuscript quantifying between-person differences in information seeking on Wikipedia.

Annie Dillard nicely captures one of the main motivations for studying curiosity in our paper out in Nature Human Behaviour: “How we spend our days is, of course, how we spend our lives. What we do with this hour, and that one, is what we are doing.” Filling up the hours of our days, we attend school, we go to work, we eat, and we sleep. But in addition to the time we spend meeting our obligations at work, school, and elsewhere, there are moments when we seek out interactions with the world that are not prompted by external obligations. Instead, we are free to act on our own curiosity, and to pursue the almost limitless range of activities available in this world. For one author on this manuscript, those moments are filled with reading about unusual words that have all but disappeared from common usage. For another author, these moments are often filled with listening to interviews of artists to learn the process behind the creation of artworks. Even from this small sampling of how people choose to spend their free time, it is clear that people direct their intrinsically motivated information seeking in idiosyncratic ways. These complex, between-person differences in curious practice, and the unique resources collected during this practice, might seem inconsequential given their seemingly pedestrian, everyday nature. Yet these resources accumulate over time, become the content we carry with us as we go through life, and serve as the grist for the mill of our everyday conversations with others.

The importance of curious practice, as well as the difficulty of capturing it, is what led us to study curiosity in daily life. To begin, we required a framework capable of capturing distinct styles of curious practice. This was achieved by turning to recent historicophilosophical work that has traced the use of the word curiosity (and its non-English counterparts) across the last two millennia. That work identified a few consistent styles of curious information seeking that span times, cultures, and languages. These styles include that of the busybody and that of the hunter. The information seeking of the busybody is marked by a preference for sampling diverse concepts. The information seeking of the hunter, by contrast, is characterized by sampling closely connected concepts. One can imagine that consistently tending to practice one style over the other would lead to the accumulation of very different types of informational resources over time. The busybody’s store of information would be more diverse relative to that of the hunter, and the hunter’s information store would contain greater depth on fewer subjects.

But how can a scientist measure or model this intuitive difference between diversity and depth in information stores? For this, we turned to network science. Network science focuses on the strength of interconnections between units, which made it a natural fit for the notion that some networks (i.e., those of the hunter) would contain concepts more strongly related to one another than other networks (i.e., those of the busybody). Another motivation for turning to network science was the availability of the mathematical language of graph theory, which can readily capture general notions of tight versus loose networks of informational resources.

With these conceptual and analytic frameworks in hand, we next needed a data collection paradigm that would lend itself to people creating networks of informational resources while seeking out information with minimal external constraints. A useful data collection paradigm would accentuate between-person differences in curious practice. This goal led us to turn to Wikipedia, an online encyclopedia that is often visited when people engage in intrinsically motivated learning. We asked participants to spend 15 minutes on Wikipedia each evening for 21 days, reading about whatever they wanted. Through this process, we collected up to about 5 hours of Wikipedia browsing for each participant.

Hunters and busybodies. Participants in our study explored Wikipedia for 15 minutes every day for 21 days. We represent participants' information seeking as knowledge networks. Nodes represent the unique Wikipedia pages visited and edges represent the similarity between the text content of each page. We use a historicophilosophical taxonomy of curious information seeking to examine between-person differences in the resulting networks. The busybody samples diverse concepts and creates loose knowledge networks of sparsely connected concepts. In contrast, the hunter creates tight knowledge networks characterized by sampling related concepts. We operationalize notions of network tightness using graph theoretical indices. Intuitively, the characteristic path length assesses the average distance between all pairs of nodes in a network. When path length is short, the network is easily traversed and representative of the hunter's tight networks. The clustering coefficient indicates the extent to which a node's neighbors are connected. A high average clustering coefficient indicates a tight network of closely connected concepts, which is the kind we expect of the hunter.
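To make the two indices concrete, here is a minimal sketch in Python. It uses two invented, unweighted toy graphs (a fully connected "tight" graph and a chain-like "loose" graph); the function names and the graphs are our own illustrations, and the networks in the study carried similarity-weighted edges, which this sketch does not model.

```python
from collections import deque
from itertools import combinations


def make_graph(edges):
    """Build an undirected adjacency map {node: set(neighbors)} from edge pairs."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def characteristic_path_length(adj):
    """Mean shortest-path length (in hops) over all reachable node pairs."""
    total, pairs = 0, 0
    for source in adj:
        # Breadth-first search yields hop distances from `source`.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbor in adj[node]:
                if neighbor not in dist:
                    dist[neighbor] = dist[node] + 1
                    queue.append(neighbor)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else float("inf")


def average_clustering(adj):
    """Mean, over nodes, of the fraction of a node's neighbor pairs that are linked."""
    scores = []
    for neighbors in adj.values():
        k = len(neighbors)
        if k < 2:
            scores.append(0.0)  # clustering is taken as 0 with fewer than two neighbors
            continue
        links = sum(1 for a, b in combinations(neighbors, 2) if b in adj[a])
        scores.append(links / (k * (k - 1) / 2))
    return sum(scores) / len(scores)


# Hypothetical toy graphs, not study data.
tight = make_graph(combinations("ABCDE", 2))  # every concept linked to every other
loose = make_graph([("A", "B"), ("B", "C"), ("C", "D"), ("D", "E")])  # a simple chain

print(characteristic_path_length(tight), average_clustering(tight))
print(characteristic_path_length(loose), average_clustering(loose))
```

In this toy setting the fully connected graph has the shorter path length and the higher clustering, which is the direction we associate with the hunter; the chain has longer paths and no clustering, closer to the busybody's loose structure.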

Upon looking at the Wikipedia data, we found an immense richness in people’s information seeking. We constructed a network for each individual, consisting of all Wikipedia pages they visited as well as the strength of similarity between each pair of pages, based on the similarity of the text across the two pages. We then applied common graph theory metrics to capture the tightness of each person’s network. When we examined the distributions of the graph metrics for the entire sample, and looked at the actual paths participants took through Wikipedia, the metrics lined up well with what we saw in the browsing itself. Participants with values indicating loose networks visited diverse, seemingly random concepts. For example, one ordered sequence of pages on the loose side of the distribution included the following pages: “Physical Chemistry”, “Me Too Movement”, “The Partridge Family”, “Harborne Primary School”, “HIP 79431”, and “Tom Bigelow”. On the other hand, a participant with values indicating tight networks visited many pages on the topic of Jewish history: “History of the Jews in Germany”, “Hep-Hep Riots”, “Zionism”, “Nathan Birnbaum”, and “Theodor Herzl”. The most striking example for us was a clear outlier, many standard deviations away from the sample mean, indicating a very tight network of highly related concepts. This participant was a clear illustration that our method was working. Almost every page visited by this participant was a page devoted to a member of the British Royal Family. The only exception was a page about a naval ship named “Queen Elizabeth 2.”
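The network construction described above can be sketched roughly as follows. This post does not specify which text-similarity measure was used, so the cosine similarity of raw word counts below is an assumption chosen for illustration; the page titles and texts are invented placeholders, and `word_counts`, `cosine_similarity`, and `build_network` are hypothetical helper names.

```python
import math
import re
from collections import Counter
from itertools import combinations


def word_counts(text):
    """Bag-of-words vector: lowercase alphabetic tokens and their counts."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine of the angle between two word-count vectors (0.0 if either is empty)."""
    dot = sum(a[word] * b[word] for word in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_network(pages):
    """Map each unordered pair of page titles to an edge weight (text similarity)."""
    vectors = {title: word_counts(text) for title, text in pages.items()}
    return {
        (s, t): cosine_similarity(vectors[s], vectors[t])
        for s, t in combinations(vectors, 2)
    }


# Invented placeholder pages, not study data.
pages = {
    "page_a": "royal family queen king palace",
    "page_b": "queen king crown royal wedding",
    "page_c": "chemistry molecule reaction energy",
}
network = build_network(pages)
for pair, weight in network.items():
    print(pair, round(weight, 2))
```

Here the two pages that share vocabulary receive a strong edge, while the unrelated page is connected to both with weight zero. A participant whose pages mostly resemble one another would therefore end up with a tighter network than one whose pages share little text.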

We additionally examined the extent to which the patterns of browsing data represented in networks were related to curiosity as it is typically operationalized in psychology and other social sciences. We did this by collecting information about people’s deprivation sensitivity when they first visited the laboratory. People high in deprivation sensitivity tend to seek information in order to escape the tension of not knowing something. Deprivation sensitivity is expressed as a persistent and effortful form of specific exploration and, as you might anticipate, participants high in deprivation sensitivity had tighter networks relative to participants with lower deprivation sensitivity scores.

Now that we have a way to collect and quantify between-person differences in how people seek out information under minimal external constraints, our work is really just beginning. We are now eager to expand our focus beyond Wikipedia to the vastness of the internet in order to capture naturalistic explorations of new media content while people commute, take their lunch breaks, and browse their smartphones right before they fall asleep. How will these complex data lead us to expand the conceptual and analytic framework of hunters and busybodies, and the network notions of tight and loose structure, to capture what will likely be even more marked between-person differences in curious practice? Further, how does the customization of our media content diet by behind-the-scenes algorithms designed to activate our buying impulses affect our agency as the captains of our new media curiosity ships? We are curious to find out!