With more than 300 languages spoken, London is truly a diverse city. But what does that diversity look like on a map? To find out, two researchers from University College London used Twitter to visualize the city’s many languages via its tweets. You can see a screenshot of the result above, but I recommend clicking through to the original on researcher James Cheshire’s blog for a more detailed view.
Diversity aside, 92.5 percent of the tweets on the map were in English. The next most common languages, in order of prominence, were Spanish, French, Turkish, Arabic and Portuguese.
After collecting the tweets, the researchers applied an algorithm derived from the one Google Chrome uses to identify the languages of the websites you encounter while surfing the web. The algorithm made short work of sorting the tweets by language. For some reason, though, it misclassified English-language tweets containing repetitive strings like “lololol” and “hahahaha” as belonging to the Philippine language of Tagalog. So, all “Tagalog” tweets had to be discarded.
Once that was done, the researchers were left with a set of geolocated tweets in 66 different languages. Color-coded and placed on a map, they create what James Cheshire called a “paint-speckled effect” that showcases London’s linguistic diversity.
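For the curious, the discard-and-tally step can be sketched in a few lines of Python. This is only an illustration, not the researchers' actual code: the function name `language_shares`, the sample tweets, and the hand-assigned language codes are all made up here, and the language detection itself is assumed to have already happened.

```python
from collections import Counter


def language_shares(labelled_tweets, discard=("tl",)):
    """Return each language's share of the kept tweets, in percent.

    labelled_tweets: iterable of (text, language_code) pairs.
    discard: language codes to drop entirely ("tl" is the ISO 639-1
             code for Tagalog, the label the study threw away).
    """
    kept = [lang for _text, lang in labelled_tweets if lang not in discard]
    total = len(kept)
    if total == 0:
        return {}
    counts = Counter(kept)
    # most_common() orders languages from most to least frequent.
    return {lang: 100 * n / total for lang, n in counts.most_common()}


# Hypothetical sample; the labels are assigned by hand for illustration.
sample = [
    ("Lovely morning in Hackney", "en"),
    ("lololol", "tl"),       # the kind of tweet mislabelled as Tagalog
    ("Buenos días Londres", "es"),
    ("Stuck on the Northern line again", "en"),
    ("hahahaha", "tl"),      # same problem
    ("Bonjour de Soho", "fr"),
]

shares = language_shares(sample)
for lang, pct in shares.items():
    print(f"{lang}: {pct:.1f}%")
```

Dropping a whole label is a blunt fix, since any genuine Tagalog tweets disappear along with the false positives, but it is the approach the article describes.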
However, as researcher Ed Manley noted on his blog, London is actually even more diverse than the map indicates:
“In total, 92.5% of tweets are detected as English, far above existing estimations (60%) of English speakers in London. While languages you’d expect to score highly – such as Bengali and Somali – barely feature at all. Either people only tweet in English, or usage of Twitter varies significantly among language groups in London.”
Of course, a city’s Twitter users are by no means a representative sample of the population. On his blog, Cheshire explains that the tweets they were able to map represent an even more selective data set, as “they only include people who have a good location (through GPS) and those who are connected to the internet.”
Even with those limitations in mind, the map is still quite fascinating. What do you think of it?