At a presentation in China, Microsoft recently demonstrated an improved machine translation technology that allows for real-time translation in your own voice. Using the system, Microsoft’s Chief Research Officer Rick Rashid was able to give a presentation in Chinese in his own voice, even though he doesn’t speak the language.
How does it work? Prompted by the attention his presentation generated, Rashid wrote a blog post to explain the technology behind the system:
“In my presentation, I showed how we take the text that represents my speech and run it through translation, in this case turning my English into Chinese in two steps. The first takes my words and finds the Chinese equivalents, and while non-trivial, this is the easy part. The second reorders the words to be appropriate for Chinese, an important step for correct translation between languages.”
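The two steps Rashid describes can be sketched as a toy pipeline. Everything below is invented for illustration, the tiny pinyin lexicon, the sample sentence, and the fixed permutation standing in for a reordering model; real systems learn both steps statistically from large parallel corpora.

```python
# Toy sketch of the two-step translation pipeline described above.
# Vocabulary and the reordering rule are invented for demonstration only.

# Step 1: find the target-language equivalent of each source word.
LEXICON = {"where": "nali", "is": "zai", "he": "ta"}

def translate_words(sentence):
    """Map each English word to a (romanized) Chinese equivalent."""
    return [LEXICON.get(w, w) for w in sentence.lower().split()]

# Step 2: reorder the words for the target language. Here the "model"
# is just an explicit permutation of positions; a real system would
# choose the reordering per sentence.
def reorder(words, permutation):
    return [words[i] for i in permutation]

stage1 = translate_words("Where is he")   # word-by-word equivalents
stage2 = reorder(stage1, [2, 1, 0])       # natural Chinese word order
print(" ".join(stage2))                   # "ta zai nali" ("he is-at where")
```

Step 1 alone produces the right words in the wrong order; step 2 supplies the grammar of the target language, which is why Rashid calls it the important part.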
To more accurately perform the first step of the process, Microsoft is using a technique called Deep Neural Networks, which it says mimics the patterns of the human brain to make speech recognition more accurate. One caveat: the Deep Neural Networks technology may be better at recognizing words, but it’s still no substitute for the brain of a skilled interpreter. As Rashid wrote:
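The basic building block of such a network can be shown in a few lines. This is a generic single-layer forward pass, not Microsoft's actual model; the weights and input features below are made up, and in practice both are learned from thousands of hours of speech.

```python
import math

# One fully connected layer with a sigmoid nonlinearity, the repeated
# unit that "deep" networks stack many times. Weights here are invented;
# real acoustic models learn them from training data.
def layer(inputs, weights, biases):
    outputs = []
    for w_row, b in zip(weights, biases):
        activation = sum(x * w for x, w in zip(inputs, w_row)) + b
        outputs.append(1.0 / (1.0 + math.exp(-activation)))  # sigmoid
    return outputs

# A hypothetical frame of acoustic features passed through one layer.
frame = [0.2, -1.3, 0.7]
hidden = layer(frame, [[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]], [0.0, 0.1])
```

Stacking several such layers lets the network model the complex mapping from audio features to sound-unit probabilities, which is where the recognition accuracy gains come from.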
“We have been able to reduce the word error rate for speech by over 30% compared to previous methods. This means that rather than having one word in 4 or 5 incorrect, now the error rate is one word in 7 or 8… Of course, there are still likely to be errors in both the English text and the translation into Chinese, and the results can sometimes be humorous. Still, the technology has developed to be quite useful.”
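The quoted figures are easy to sanity-check: going from one error in N words to one in M words is a relative reduction of 1 − N/M. A quick calculation confirms the numbers are roughly consistent with an "over 30%" improvement.

```python
# Relative word-error-rate reduction when errors drop from
# 1-in-old_n words to 1-in-new_n words: 1 - (1/new_n)/(1/old_n).
def relative_reduction(old_n, new_n):
    return 1 - old_n / new_n

print(f"{relative_reduction(5, 7):.0%}")  # 1-in-5 -> 1-in-7: 29%
print(f"{relative_reduction(5, 8):.0%}")  # 1-in-5 -> 1-in-8: 38%
print(f"{relative_reduction(4, 8):.0%}")  # 1-in-4 -> 1-in-8: 50%
```

So the "one in 4 or 5" to "one in 7 or 8" shift corresponds to anywhere from roughly 29% to 50% fewer errors, matching Rashid's "over 30%" claim at the middle of that range.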
While this technology is impressive and will certainly find uses in the future, I doubt it will replace the knowledge and understanding that a trained interpreter brings to the job any time soon. What do you think?