Social relationships are probably the most important things we have in our lives. They help us get new jobs, live longer, and be happier. At the scale of cities, networks of diverse social connections determine the economic prospects of a population. The strength of social ties is believed to be one of the key factors that regulate these outcomes. According to Granovetter’s classic theory of tie strength, information flows through social ties of two strengths: weak ties, which are used infrequently but bridge distant groups that tend to possess diverse knowledge; and strong ties, which are used frequently, knit communities together, and provide dependable sources of support.
For decades, tie strength has been quantified using the frequency of interaction. Yet frequency does not reflect Granovetter's initial conception of strength, which he viewed as multidimensional: a "combination of the amount of time, the emotional intensity, intimacy, and services which characterize the tie." Frequency of interaction is traditionally used as a proxy for these more complex social processes mostly because it is relatively easy to measure (e.g., the number of calls in phone records). But what if we had a way to measure these social processes directly?
We used advanced techniques in Natural Language Processing (NLP) to quantify whether the text of a message conveys knowledge (whether the message provides information about a specific domain) or support (expressions of emotional or practical help), and applied them to a large conversation network from Reddit composed of 630K users residing in the United States, linked by 12.8M ties. Our hypothesis was that the resulting knowledge and support networks would fare better at predicting social outcomes than a traditional social network weighted by interaction frequency. In particular, borrowing a classic experimental setup, we tested whether the diversity of social connections of Reddit users residing in a specific US state would correlate with the economic opportunities in that state (estimated with GDP per capita).
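To make the notion of "diversity of social connections" concrete, here is a minimal sketch of one common way to compute it. It assumes diversity is measured as the normalized Shannon entropy of a user's tie weights and that a state's score is the average over its users; the text above does not specify the exact measure, so treat this as an illustration rather than the study's method. The function names and the toy input are hypothetical.

```python
import math
from collections import defaultdict


def normalized_entropy(weights):
    """Shannon entropy of a user's tie weights, scaled to the range [0, 1].

    ASSUMPTION: this is one common diversity measure, not necessarily the
    one used in the study. 1.0 means weight is spread evenly over all ties.
    """
    positive = [w for w in weights if w > 0]
    if len(positive) < 2:
        return 0.0  # a single tie (or none) carries no diversity
    total = sum(positive)
    entropy = -sum((w / total) * math.log(w / total) for w in positive)
    return entropy / math.log(len(positive))


def state_diversity(ties, user_state):
    """Average per-user diversity for each state.

    ties: iterable of (user_a, user_b, weight) for an undirected network.
    user_state: dict mapping each user to a US state code.
    ASSUMPTION: averaging user scores per state is an illustrative choice.
    """
    weights_by_user = defaultdict(list)
    for a, b, weight in ties:
        weights_by_user[a].append(weight)
        weights_by_user[b].append(weight)

    scores_by_state = defaultdict(list)
    for user, weights in weights_by_user.items():
        if user in user_state:
            scores_by_state[user_state[user]].append(normalized_entropy(weights))

    return {state: sum(s) / len(s) for state, s in scores_by_state.items()}


# Toy input, invented purely for illustration (not data from the study).
toy_ties = [("a", "b", 1.0), ("a", "c", 1.0), ("b", "c", 3.0)]
toy_states = {"a": "CA", "b": "CA", "c": "NY"}
toy_result = state_diversity(toy_ties, toy_states)
```

The same functions could be run three times, with the weight of each tie set to its interaction frequency, its knowledge score, or its support score, which would yield one diversity value per state for each network.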
We found that the combination of diversity calculated on the knowledge and support networks correlates much more strongly with GDP than diversity calculated on a network weighted with interaction frequency (R² = 0.62 vs. R² = 0.30).
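For readers who want to see what such a comparison involves, the sketch below computes R² for a least-squares fit with one predictor (e.g., frequency-based diversity) and with two predictors (e.g., knowledge and support diversity together). It assumes ordinary linear regression with an intercept, which the text does not confirm; the function names are hypothetical and the numbers are made up to exercise the code, not taken from the study.

```python
def _centered(values):
    """Subtract the mean so the intercept drops out of the fit."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def _dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def r2_one(x, y):
    """R^2 of a least-squares line y ~ x (squared Pearson correlation)."""
    xc, yc = _centered(x), _centered(y)
    return _dot(xc, yc) ** 2 / (_dot(xc, xc) * _dot(yc, yc))


def r2_two(x1, x2, y):
    """R^2 of a least-squares fit y ~ x1 + x2 with an intercept."""
    a, b, yc = _centered(x1), _centered(x2), _centered(y)
    s11, s22, s12 = _dot(a, a), _dot(b, b), _dot(a, b)
    s1y, s2y = _dot(a, yc), _dot(b, yc)
    det = s11 * s22 - s12 * s12
    if det == 0:
        raise ValueError("predictors are collinear")
    # Solve the 2x2 normal equations with Cramer's rule.
    coef1 = (s1y * s22 - s2y * s12) / det
    coef2 = (s2y * s11 - s1y * s12) / det
    ss_res = sum((t - coef1 * p - coef2 * q) ** 2 for p, q, t in zip(a, b, yc))
    return 1 - ss_res / _dot(yc, yc)


# Made-up values for illustration only (NOT results from the study).
toy_x1 = [1, 2, 3, 4]
toy_x2 = [1, 0, 2, 1]
toy_y = [6, 5, 13, 12]  # constructed as 2*x1 + 3*x2 + 1

single = r2_one(toy_x1, toy_y)
combined = r2_two(toy_x1, toy_x2, toy_y)
```

On the same data, adding a second predictor can never lower R², so a fair comparison between a one-predictor and a two-predictor model would normally also report an adjusted measure.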
In contrast with assumptions frequently adopted in network science research, we found that pairs of Reddit users who exchange knowledge have a typical frequency of interaction that is indistinguishable from the frequency between those who exchange support. The two types of ties differ instead in their geographical span. Knowledge ties are more likely to be long-distance (i.e., connecting people living in far-away states), whereas support ties are mostly found locally, between people living close by (see Figure). In agreement with Granovetter’s theory, we found that economic development at the level of US states is associated with the abundance of global (highly diverse) ties that carry factual knowledge and with the abundance of local (not highly diverse) ties providing social support.
Our results confirm once more the power that modern NLP tools have to quantify complex social and psychological signals from conversations (in recent work, we showed how social dimensions extracted from text can predict opinion change). When applied to the study of social networks, these tools give us an unprecedented ability to decompose relationship data into interpretable social constituents, thus opening up ample avenues of exploration in social network analysis and Computational Social Science.