Think Google Translate can handle all of your translation needs? Think again! There are around 3,570 written languages in the world. Google can only translate 103 of them. What’s missing? Popular languages with millions of speakers.
The gaps in Google Translate’s coverage of the world are most glaring in Africa, Asia and South America. Here are 8 surprising languages that Google can’t translate.
With around 60 million native speakers, Cantonese is the official language of Hong Kong and Macau. It is the 24th most commonly spoken language in the world. It has more native speakers than Dutch, Swedish and Greek put together. And it’s not included in Google Translate. At the moment, Google only supports Mandarin Chinese, though that will likely change in the future.
Odia or Oriya
What language:
- has 33 million native speakers,
- is an official language of India and the Indian states of Odisha and Jharkhand,
- is designated as a “Classical Language” in India, and
- is not covered by Google Translate?
The answer is Odia, also known as Oriya. This is another language that the Google Translate team is working on. It hasn’t been a high priority because “The online presence of Odia is quite insignificant,” as Subhashish Panigrahi, programme officer at Centre for Internet and Society, explained to the Telegraph of India.
Bhojpuri is spoken in India, Nepal, Guyana, Fiji, Mauritius and Suriname. It has approximately 40 million native speakers. However, many Bhojpuri speakers lack internet access. But considering India is expected to have 500 million Internet users by next year, Google had better get on the ball.
Maithili is an official language in India. It’s also one of the most commonly spoken, with 30 million native speakers. Additionally, it is the second most common language in neighboring Nepal, where it has official status under the Interim Constitution.
Looking toward the future, significant numbers of Maithili speakers will come online in the next few years, along with the rest of India.
Next, we turn to Africa, another emerging market that is underserved by Google Translate at the moment. With 38 million native speakers, Oromo is one of the most widely spoken languages on the continent. It is spoken by the Oromo people in Ethiopia and Kenya, as well as in other African countries like Somalia, Tanzania, South Africa, Libya and Eritrea.
One of the major problems with Oromo when it comes to machine translation is that it’s actually a dialect continuum. People on one side of the continuum can’t necessarily understand people on the opposite side, even though they are technically speaking the same language.
Currently, Ethiopia has an Internet penetration rate of between 1.9% and 3.7%, depending on the source. However, efforts are in progress to get the country online, and the share of Ethiopians with web access is well up from 0.4% in 2008.
Another African language, Fula or Fulani, is spoken across West and Central Africa. It has approximately 24 million native speakers, mostly from the Fulani people, and is spoken as a second language by other regional tribes.
Fula is an official language in Senegal and Nigeria, and a national language in Mali and Niger.
When you think of South America, what language do you think of? Probably Spanish, perhaps Portuguese. But many South Americans are more comfortable speaking the indigenous languages they grew up with. With 8.9 million native speakers, Quechua, the language of the Incas, is the most widely spoken indigenous language in the Americas. It has official status in Peru, Bolivia and Ecuador.
Like Oromo, Quechua has many different dialects, and not all of them are mutually intelligible (it’s sometimes listed as a language family instead of a language). That’s challenging, but including even the most commonly spoken dialects would be a boost for indigenous people across South America. While Quechua is not currently endangered, there is a trend of Quechua speakers switching to Spanish because they feel it offers them more opportunities and social status. The ability to access services (like the Internet) in a given language is an important part of language preservation.
More than 6 million people in Mesoamerica and Central America speak a Mayan language as their first language. K’iche’, the most widely spoken Mayan language, has an estimated 2.3 million native speakers, mostly in Guatemala. Currently, no Mayan languages are available in Google Translate.
Why not offer K’iche’? After all, Google Translate is available in Frisian, with only 480,000 native speakers. Part of the issue is that the Mayan languages are mostly oral. Machine translation works from written content.
Many of these languages are in emerging economies where Internet connectivity is not something you can take for granted. But various initiatives are in place to help bring these parts of the world online. As more and more people get connected, demand for previously overlooked languages will increase.
What languages do you think Google Translate should add next, and why? Let us know in the comments!
Photo credits: By LiliCharlie – Own work, CC BY-SA 4.0, https://commons.wikimedia.org/w/index.php?curid=38770607, CC BY 2.0, https://commons.wikimedia.org/w/index.php?curid=218481, By Huhsunqu – self-made, from LocationPeru.svg, CC BY 3.0, https://commons.wikimedia.org/w/index.php?curid=3039360