The popularity of microblogging service Twitter has spread across the globe, and the US government has taken notice. Twitter was one of the earliest means of communication for the Egyptian protestors, and it could be used to gain insight into what ordinary people in the Middle East and South Asia are thinking and feeling.
Of course, many people who live in these countries don’t tweet in English, so first you have to translate what they are saying.
The problem: Flesh-and-blood translators cost money, and there’s a shortage of qualified translators for certain languages. People don’t always or even primarily use Twitter to talk about politics or other weighty topics – there’s also a lot of mundane chatter on the network. There’s no point in having professionals spend their time translating what someone in Pakistan ate for breakfast this morning.
However, machine translation has its own problems, especially for languages like Arabic and Urdu, which haven’t yet been indexed thoroughly enough to provide good translations.
Funded by the Pentagon, University at Buffalo computer scientist Rohini Srihari is trying to build a program that will automatically translate tweets, separate out the important tweets and analyze the sentiment behind them. It’s quite an ambitious undertaking. She explained to NPR:
“What I want is to determine who are the people, places and things being talked about. Is there an opinion being expressed? Is it a positive or negative opinion being expressed? And when you are able to figure out what the topic of the conversation is, what kind of sentiment is being expressed around that, that’s the goal of what we are trying to do.”
Accurate sentiment analysis is difficult enough without adding in the hurdle of automatic translation.
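To give a rough sense of what "positive or negative" scoring involves, here is a minimal sketch of a word-list approach in Python. It is not Srihari's method, which the article does not describe; the word lists and the `polarity` function are made up for illustration, and the sketch assumes the tweet has already been translated into English.

```python
import re

# Hypothetical toy word lists, for illustration only; a real system
# would need far larger, language-specific resources.
POSITIVE = {"good", "great", "hope", "free", "happy"}
NEGATIVE = {"bad", "angry", "fear", "corrupt", "sad"}


def polarity(text: str) -> str:
    """Label already-translated English text as positive, negative or neutral."""
    words = re.findall(r"[a-z']+", text.lower())
    # Count matches against each list and compare the totals.
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

A tweet about breakfast would come back as "neutral" here, which hints at how mundane chatter might be filtered out. Word counting of this kind misses sarcasm, negation and context, which is part of why accurate sentiment analysis is hard.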
Another obstacle is that many tweets are written in “Urdish” and “Arabish,” versions of Urdu and Arabic used for electronic communications on Latin-alphabet keyboards and keypads.
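One early step a system like this might take is checking which alphabet a tweet is written in, so that romanized text can be handled separately. The Python sketch below is an assumption about how that could look, not a description of the actual project; `dominant_script` is a hypothetical helper that relies only on the standard `unicodedata` module.

```python
import unicodedata


def dominant_script(text: str) -> str:
    """Return 'arabic', 'latin' or 'unknown' based on the letters in text."""
    counts = {"ARABIC": 0, "LATIN": 0}
    for ch in text:
        if not ch.isalpha():
            continue  # skip digits, punctuation, spaces and emoji
        # Unicode character names begin with the script, e.g. "ARABIC LETTER ALEF".
        name = unicodedata.name(ch, "")
        for script in counts:
            if name.startswith(script):
                counts[script] += 1
    if counts["ARABIC"] == 0 and counts["LATIN"] == 0:
        return "unknown"
    return "arabic" if counts["ARABIC"] >= counts["LATIN"] else "latin"
```

Urdu is written with letters from the same Unicode Arabic block, so it is also reported as "arabic". A "latin" result only says which alphabet was used: it cannot tell "Arabish" from English, so a further language-identification step would still be needed.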
If she’s successful, however, her work won’t benefit the US government exclusively. History professor Ernest Tucker told NPR that the ability to record, compile and analyze relevant tweets will also help historians create a more democratic recounting of major events:
“That’s the goal of all historians anywhere, to try to get the voices of more and more people into the conversation, and anything that can do that, particularly this kind of thing, is a wonderful gift.”