At the Grupo de Sistemas Complejos of the Universidad Politécnica de Madrid we have been interested in studying political polarization from a quantitative point of view for quite a long time. We had already explored the phenomenon in Venezuela and in Chile, and we are always looking for interesting contexts to analyse. That is why, when the regional government of Catalonia announced that it would hold a referendum on independence, against strong opposition from the Spanish central government, we started preparing to collect Twitter data and designing a research plan.
We could have adopted the same approach that we used in previous works and just looked at the ideological leaning of the people, but the Catalan context has some unique features that demanded a new perspective. In particular, the nature of the conflict is mainly territorial (there have been episodes of secessionism in Catalonia since the 17th century) and language plays a central role in defining the opposing groups. This is a common trait of territorial conflicts in many different places around the world; some examples can be found in Canada, Belgium, Cameroon and India. Considering all of the above, we thought that focusing on the users' use of language could give us some interesting insights. We studied the use of Catalan and Spanish by Twitter users during 50 days around the referendum day, but we did not see any correlation with the off-line political events that were taking place, nor did we find any other relevant information.
Then we decided to try to infer the opinion of the users from who they retweet, as we had done in earlier projects (we do not look at the content of the tweets, just at the interactions between users). The opinion distribution that we obtained from this analysis turned out to be much more interesting, because it shows not only the two opposing groups that we expected to find (for and against independence), but also a third pole in an intermediate position that emerged spontaneously from the application of our opinion inference methodology to the retweet network. This third pole can be explained by the existence of political parties that are not completely aligned with either side of the conflict and that instead look for a middle ground.
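To give a flavour of what inferring opinions from retweets can look like, here is a minimal sketch. It is not the exact methodology of the paper: it assumes a simple propagation scheme in which a handful of hand-labelled "seed" accounts keep a fixed opinion (+1 or -1) and every other user repeatedly takes the weighted average opinion of the accounts they retweet. The function name `infer_opinions`, the seed labels and the [-1, 1] scale are all illustrative assumptions.

```python
from collections import defaultdict


def infer_opinions(retweets, seeds, n_iter=100):
    """Illustrative opinion propagation on a retweet network.

    retweets: iterable of (retweeter, retweeted_author) pairs.
    seeds: dict mapping a few accounts to a fixed opinion in [-1, 1]
           (hypothetical hand-labelled accounts).
    """
    # For each user, count how many times they retweeted each author.
    sources = defaultdict(lambda: defaultdict(int))
    users = set(seeds)
    for retweeter, author in retweets:
        sources[retweeter][author] += 1
        users.update((retweeter, author))

    # Seeds keep their opinion; everyone else starts neutral.
    opinion = {u: seeds.get(u, 0.0) for u in users}
    for _ in range(n_iter):
        updated = {}
        for u in users:
            if u in seeds or not sources[u]:
                # Seeds and users who never retweet keep their value.
                updated[u] = opinion[u]
                continue
            total = sum(sources[u].values())
            # Weighted average of the opinions of the retweeted accounts.
            updated[u] = sum(opinion[a] * w for a, w in sources[u].items()) / total
        opinion = updated
    return opinion


if __name__ == "__main__":
    toy_retweets = [("a", "pro"), ("a", "pro"), ("b", "anti"),
                    ("c", "pro"), ("c", "anti")]
    print(infer_opinions(toy_retweets, {"pro": 1.0, "anti": -1.0}))
```

In this toy network, user `c` retweets both seeds equally and therefore lands in the middle of the scale, which is the kind of intermediate position the third pole occupies.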
At that point we thought that with the analysis of the ideological spectrum we already had enough results to write a paper of substance, but we wondered what we should do with the language use analysis. Were we supposed to throw those results away? Finally, we realised that we actually had one index for use of language and another index for opinion, so we could combine both to reveal the relationship between them.
And so we found out that users who are against independence tend to speak mainly Spanish; in fact, they practically do not speak Catalan at all. This was very intriguing for us because people in Catalonia tend to be bilingual, so it is reasonable to think that there are people out there who can speak Catalan and are against independence. If those people exist, do they choose not to speak Catalan when they are talking about this controversial topic? Or maybe what really happens is that almost every person who can speak Catalan is in favour of independence. The causal direction here is anything but clear to us (are Catalan speakers secessionists, or do unionists avoid speaking Catalan?), but the result is nevertheless fascinating. In contrast, we found that users who are closer to the secessionist pole show a very wide range of language use.
The third, intermediate pole was also clearly distinguishable in the opinion-language use map. It sits in a middle/pro-independence position in terms of ideological leaning, but its users show a clear preference for Spanish, a behaviour that is closer to the unionist pole.
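For readers who want to picture how an opinion-language use map can be built, here is a rough sketch under assumptions of our own choosing rather than the definitions used in the paper. It assumes a per-user opinion score in [-1, 1], a language index defined as (Catalan tweets - Spanish tweets) / (Catalan tweets + Spanish tweets), and a coarse grid that counts users per cell. The names `language_index` and `opinion_language_map` are hypothetical.

```python
def language_index(n_catalan, n_spanish):
    """Assumed convention: -1.0 = only Spanish, +1.0 = only Catalan."""
    total = n_catalan + n_spanish
    if total == 0:
        return None  # no tweets in either language, nothing to measure
    return (n_catalan - n_spanish) / total


def opinion_language_map(opinion, tweet_counts, n_bins=4):
    """Count users per (opinion bin, language bin) cell.

    opinion: dict user -> opinion score in [-1, 1].
    tweet_counts: dict user -> (n_catalan, n_spanish).
    Returns grid[opinion_bin][language_bin], both axes spanning [-1, 1].
    """
    def to_bin(x):
        # Map [-1, 1] onto 0..n_bins-1; the value 1.0 goes in the last bin.
        return min(int((x + 1.0) / 2.0 * n_bins), n_bins - 1)

    grid = [[0] * n_bins for _ in range(n_bins)]
    for user, (n_cat, n_spa) in tweet_counts.items():
        lang = language_index(n_cat, n_spa)
        if user not in opinion or lang is None:
            continue  # skip users missing either index
        grid[to_bin(opinion[user])][to_bin(lang)] += 1
    return grid


if __name__ == "__main__":
    toy_opinion = {"u": -1.0, "v": 1.0}
    toy_counts = {"u": (0, 5), "v": (3, 1)}
    for row in opinion_language_map(toy_opinion, toy_counts):
        print(row)
```

Each cell of the grid then says how many users combine a given ideological leaning with a given language preference, which is where clusters such as the Spanish-speaking intermediate pole would show up.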
The combination of the opinion inference methodology that we propose in this work with other sociological dimensions besides language (for example, age, income or gender) is still unexplored and, given the richness of the patterns that we have found here, we think that it has huge potential.