Google Plots Universal Translator

Google is known for taking on big, ambitious projects. So it should come as no surprise that its latest “moonshot” comes right out of science fiction: a “universal translator” that translates between any two languages, instantaneously.

Der Spiegel recently profiled Franz Josef Och, the computer scientist behind Google Translate. In the interview, he described his ambition to create a universal translator that seamlessly integrates itself into users’ conversations, without the need to press buttons.

As Och explained to Der Spiegel, Google Translate does not require the in-depth knowledge of languages and cultures that the translation industry takes for granted:

“I have trouble learning languages, and that’s precisely the beauty of machine translation: The most important thing is to be good at math and statistics, and to be able to program…So what the system is basically doing (is) correlating existing translations and learning more or less on its own how to do that with billions and billions of words of text. In the end, we compute probabilities of translation.”
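
To make “compute probabilities of translation” a little more concrete, here is a toy sketch in Python. It is not Google's system: the three sentence pairs, the `PAIRS` name and the `translation_probabilities` function are all invented for illustration, and it only counts which words appear together in paired sentences.

```python
from collections import Counter, defaultdict

# Toy parallel corpus, invented for illustration: English / Spanish pairs.
PAIRS = [
    ("the house", "la casa"),
    ("the car", "el coche"),
    ("a house", "una casa"),
]


def translation_probabilities(pairs):
    """Estimate P(target word | source word) from raw co-occurrence counts."""
    counts = defaultdict(Counter)
    for source, target in pairs:
        for s in source.split():
            for t in target.split():
                # Every target word in the paired sentence is a candidate.
                counts[s][t] += 1

    probs = {}
    for s, counter in counts.items():
        total = sum(counter.values())
        # Normalise the counts so each source word's candidates sum to 1.
        probs[s] = {t: c / total for t, c in counter.items()}
    return probs


probs = translation_probabilities(PAIRS)
# Pick the most probable Spanish word for "house".
best = max(probs["house"], key=probs["house"].get)
print(best, probs["house"][best])
```

With only three sentences the estimates are crude, which is the point Och is making: the approach depends on having billions of words of existing translations, not on knowing the languages.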

The Google Code

Last week, the Internet lit up with the news that a “secret code” had been discovered in Google Translate. But was it really a secret message, or just another bad translation?

Much of the time, Google Translate will provide an imperfect but serviceable translation. However, sometimes it comes up with automatically generated translations that are so bad, they seem uncanny.

The story of the “secret code” was originally published on the Krebs On Security blog. A few months back, researchers from a couple of different security firms approached computer security reporter Brian Krebs with an intriguing discovery: putting the traditional “Lorem Ipsum” placeholder text into Google Translate yielded some very strange, politically tinged results. For example, Google translated “lorem ipsum” without capital letters as “China.” “Lorem Ipsum”, capitalized, produced “NATO.” Check out his blog post for the entire list of seemingly-not-quite-random translations.

The researchers wondered if, perhaps, they had stumbled upon a secret code. Was it used by spies? Activists? Hackers? Perhaps it was meant to be a tunnel through China’s “Great Firewall.”  The truth is out there…but it will be a lot more difficult to uncover it now that Google has fixed the translations, which it did almost immediately after being notified of the issue.

Unfortunately, the most likely explanation is also the most mundane…it’s simply a bad machine translation caused by inadequate, poor-quality data.

As ZDNet explained, because lorem ipsum is used as a placeholder,

“[T]here are millions of examples but very few actual translations of them; instead, the placeholder text will get matched up with documents that just look similar to the algorithm but aren’t actually connected. That would explain why you got different translations if you capitalised the words differently or duplicated them, resulting in translations like China, the Internet, NATO, the Company, China’s Internet, Business on the Internet, Home Business, Russia might be suffering, he is a smart consumer, the main focus of China, department and exam. Those are all common phrases – and you might recognise some of them from spammy web sites promising thousands of dollars for working from home or offering you answers to exam questions.”

Additionally, the standard lorem ipsum text is only one step above gibberish, anyway.

For her part, Kraeh3n, the researcher who discovered the “code,” told Krebs that she doesn’t believe it’s random:

“Translate [is] designed to be able to evolve and to learn from crowd-sourced input to reflect adaptations in language use over time,” Kraeh3n said. “Someone out there learned to game that ability and use an obscure piece of text no one in their right mind would ever type in to create totally random alternate meanings that could, potentially, be used to transmit messages covertly.”

Meanwhile, TechCrunch is reporting that the odd translations were part of 1o57’s Defcon Badge puzzle.

What do you think?

Photo Credit: Attribution Some rights reserved by pkwahme

Newspaper Discovers Limits of Google Translate

In the United States, Spanish-speaking Latinos are a rapidly growing demographic. Naturally, some news organizations cater to them with Spanish-language editions, especially online.

However, according to Fox News, when the Hartford Courant decided to follow suit, they did not hire a translator, choosing instead to run all of their articles through Google Translate.

The results were about what you’d expect: embarrassing.

Former Hartford Courant columnist Bessy Reyna collected some of the most ridiculous examples of poor translation on her blog. Here are a couple of the juiciest nuggets of failure on display:

  • “El hombre florero Over Head Smashed novia, policía dice” Literal translation: “The man flower vase Over Head Smashed Girlfriend, police said”
  • “Este mujer Hartford acusado de apuñalar con el hombre pelador de patatas” which literally reads: “This woman Hartford Accused of stabbing the man with potato peeler.”

To address the criticism, the paper issued the following disclaimer:

“However, readers should be aware that due to limitations in the Google software some of the translations of the English headlines and articles don’t always translate accurately word-for-word into Spanish.”

Duh. On one level, it’s understandable that a local paper might not have the resources to devote to hiring a full-time Spanish translator. However, simply plugging all of their content into Google Translate appears to be counterproductive. According to Bessy Reyna, Latinos perceived the error-ridden translations as insulting, even offensive:

“Their reactions ranged from “This isn’t even Spanglish” to “Did you see the one today about Norwich? It’s to laugh and cry at the same time.” Others thought it was simply lack of respect and yet another way to humiliate the Latino community.”

The truth is, no matter what business you’re in, if you’re trying to communicate with customers in another language, there’s no substitute for a translator who knows both languages inside and out. It’s impossible to put your best foot forward using Google Translate, or any other machine translation program for that matter!

Do you think newspapers should rely on Google Translate?

A Translation Showdown: Man vs Machine Translation

Computer scientists began trying to solve the problem of machine translation in the 1950s.  Since then, both the availability and quality of machine translation have improved tremendously. But in the battle of human translation vs machine translation, are humans now expendable?

Some scientists working on machine translation claim that with recent improvements, algorithms are almost as good at translation as humans.  And when the subject of “jobs that will soon be taken over by robots” comes up, futurists almost always put “translation” in the crosshairs.

But what happens when machines take on human translators? Earlier this month, Sejong Cyber University and the International Interpretation and Translation Association of Korea decided to find out. Three machine translation programs went up against a group of human translators. It was a translation showdown: human translation vs machine translation.

Man versus machine, the translation industry’s version of the famous contest between John Henry and the steam-powered hammer. Guess who won?