The Google Code

Last week, the Internet lit up with the news that a “secret code” had been discovered in Google Translate. But was it really a secret message, or just another bad translation?

Much of the time, Google Translate provides an imperfect but serviceable translation. Sometimes, however, its automatically generated translations are so bad they seem uncanny.

The story of the “secret code” was originally published on the Krebs on Security blog. A few months back, researchers from a couple of different security firms approached computer security reporter Brian Krebs with an intriguing discovery: putting the traditional “Lorem Ipsum” placeholder text into Google Translate yielded some very strange, politically tinged results. For example, Google translated “lorem ipsum,” without capital letters, as “China.” “Lorem Ipsum,” capitalized, produced “NATO.” Check out his blog post for the entire list of seemingly-not-quite-random translations.

The researchers wondered if, perhaps, they had stumbled upon a secret code. Was it used by spies? Activists? Hackers? Perhaps it was meant to be a tunnel through China’s “Great Firewall.” The truth is out there…but it will be a lot harder to uncover now that Google has fixed the translations, which it did almost immediately after being notified of the issue.

Unfortunately, the most likely explanation is also the most mundane: it’s simply a bad machine translation caused by inadequate, poor-quality data.

As ZDNet explained, because lorem ipsum is used as a placeholder,

“[T]here are millions of examples but very few actual translations of them; instead, the placeholder text will get matched up with documents that just look similar to the algorithm but aren’t actually connected. That would explain why you got different translations if you capitalised the words differently or duplicated them, resulting in translations like China, the Internet, NATO, the Company, China’s Internet, Business on the Internet, Home Business, Russia might be suffering, he is a smart consumer, the main focus of China, department and exam. Those are all common phrases – and you might recognise some of them from spammy web sites promising thousands of dollars for working from home or offering you answers to exam questions.”

Additionally, the standard lorem ipsum text is only one step above gibberish, anyway.
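
To make the mechanism ZDNet describes a little more concrete, here is a toy sketch in Python. It is not how Google Translate actually works, and every "aligned" pair in it is invented for illustration. It only shows how a system that learns translations by counting matched text, and that treats differently capitalized phrases as different entries, could end up giving “lorem ipsum” and “Lorem Ipsum” unrelated translations once placeholder text gets paired with unrelated documents.

```python
from collections import Counter, defaultdict

# Hypothetical "aligned" pairs, invented for this example (not real Google data).
# Each pair stands for placeholder text that was wrongly matched to an
# unrelated English document during training.
NOISY_PAIRS = [
    ("lorem ipsum", "China"),
    ("lorem ipsum", "China"),
    ("lorem ipsum", "the Internet"),
    ("Lorem Ipsum", "NATO"),
    ("Lorem Ipsum", "NATO"),
    ("Lorem Ipsum", "the Company"),
]


def build_phrase_table(pairs):
    """Count how often each source phrase was paired with each target phrase."""
    table = defaultdict(Counter)
    for source, target in pairs:
        # Keys are case-sensitive, so differently capitalized phrases
        # are counted as separate entries.
        table[source][target] += 1
    return table


def translate(table, phrase):
    """Return the most frequently paired target, or the input if it is unknown."""
    candidates = table.get(phrase)
    if not candidates:
        return phrase
    return candidates.most_common(1)[0][0]


table = build_phrase_table(NOISY_PAIRS)
print(translate(table, "lorem ipsum"))  # China
print(translate(table, "Lorem Ipsum"))  # NATO
```

With so few genuine translations to outweigh them, a handful of bad matches is enough to decide the result, and changing the capitalization simply looks up a different handful.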

For her part, Kraeh3n, the researcher who discovered the “code,” told Krebs that she doesn’t believe it’s random:

“Translate [is] designed to be able to evolve and to learn from crowd-sourced input to reflect adaptations in language use over time,” Kraeh3n said. “Someone out there learned to game that ability and use an obscure piece of text no one in their right mind would ever type in to create totally random alternate meanings that could, potentially, be used to transmit messages covertly.”

Meanwhile, TechCrunch is reporting that the odd translations were part of 1o57’s Defcon Badge puzzle.