A Reality-Hype Distinction? Honest people working hard to catch fake news with hands in the cookie jar

Fake news tells us that neural machine translation (NMT) is “Nearly Indistinguishable From Human Translation,” but honest people are working hard to distinguish reality from the hype.

Terence Lewis is a veteran translator, lexicographer, author, and creator of the Trasy Dutch-English machine translation program and the Dutch-to-English NMT service MyDutchPal. He openly shares NMT’s pros and cons in an article written by Gábor Ugray, “We wanted a Frankenstein translator and ended up with a bilingual chatbot.” Gábor is Head of Innovation at Kilgray Translation Technologies and also works to spread the truth about NMT. Kirti Vashee, a long-time machine translation pundit, likewise does his best to share a balanced perspective on his blog eMpTy Pages.

Arle Lommel, a senior analyst for Common Sense Advisory (CSA Research), wrote a great article about NMT published on pages 52 to 54 of Multilingual Magazine. His article, “Why zero-shot translation may be the most important MT development in localization,” reviews an unexpected anomaly in neural network technology that Google calls zero-shot translation. It promises to extend language technology to under-served minority language pairings.

Zero-shot Translation Anomaly

Digging deeper into Arle’s article, I agree that zero-shot translation is a really promising capability. It is also one of several neural network anomalies that researchers have discovered by surprise. Here’s a consolidated list of the anomalies I’ve found in the media:

Maybe these anomalies all trace to the same thing in neural networks’ design; maybe not. Each of them holds a promise of something really cool to come. For today, NMT’s results are great for tourists visiting Cambodia when they order a coffee, and NMT’s anomalies pose serious problems for professionals who try to use NMT for their work. Researchers don’t yet understand these anomalies and it will take some time for them to control this technology for predictable professional results.

Digging Deeper

So, I decided to rewind the clock to academic research published before the hype started. That took me to this Workshop on Learning Technologies (European Committee for Standardization) evaluation paper published in October 2016: Neural versus Phrase-Based Machine Translation Quality: a Case Study. Note that the phrase-based machine translation (PBMT) in their tests includes several varieties of statistical machine translation (SMT) methods for the English-German language pair. I’ll quote the article’s summary of findings:

  1. NMT generates outputs that considerably lower the overall post-edit effort with respect to the best PBMT system (-26%);
  2. NMT outperforms PBMT systems on all sentence lengths, although its performance degrades faster with the input length than its competitors;
  3. NMT seems to have an edge especially on lexically rich texts;
  4. NMT output contains less morphology errors (-19%), less lexical errors (-17%), and substantially less word order errors (-50%) than its closest competitor for each error type;
  5. concerning word order, NMT shows an impressive improvement in the placement of verbs (-70% errors)

NMT Alternative

The workshop’s NMT started with a 26% improvement in “overall post-edit effort.” There’s no doubt in my mind that it will continue to improve. The question is, why does SMT as a personalized translation engine outperform NMT, while NMT outperforms big-data SMT?

A football player learns new playing rules and achieves different results as he transitions from little league to professional, but every league has football games. Likewise, SMT’s playing rules and the results it achieves change as we apply the technology in different use cases. Here’s an eye-opening demonstration.

From the Workshop on Learning Technologies:

System         BLEU   Evaluation set (en-de)
Standard PBMT  25.8   Collection of TED Talks short speeches
NMT            31.1   Collection of TED Talks short speeches

The table above shows the roughly 21% improvement (31.1 versus 25.8) of the NMT system’s BLEU score over the big-data PBMT baseline on the workshop’s evaluation set. Compare that to the 109% improvement from Google’s NMT to a Slate Desktop user’s SMT engine in the table below.

System          BLEU   Evaluation set (en-it)
Google NMT      33.1   Representative sample of Isabella’s work
Isabella’s TMs  69.3   Representative sample of Isabella’s work

These are Isabella Massardo’s Slate Desktop results she reported in her blog article [Review] Who Is A Translator’s New Best Friend?. For her review, she converted her translation memories to create a personalized translation engine and used it with the same client’s work.
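
The percentages can be recomputed from the BLEU scores in the two tables. Here is a minimal sketch, assuming each improvement is measured relative to the lower-scoring baseline in its own table. The function name relative_improvement is my own, chosen for illustration, and does not come from either source.

```python
def relative_improvement(baseline: float, improved: float) -> float:
    """Return the percentage gain of `improved` over `baseline`."""
    return (improved - baseline) / baseline * 100.0


# BLEU scores quoted in the two tables above.
ted_pbmt, ted_nmt = 25.8, 31.1          # en-de, TED Talks evaluation set
google_nmt, personal_smt = 33.1, 69.3   # en-it, Isabella's sample

# Assumption: the baseline is the lower-scoring system in each table.
ted_gain = relative_improvement(ted_pbmt, ted_nmt)
personal_gain = relative_improvement(google_nmt, personal_smt)

print(f"NMT over PBMT (en-de): {ted_gain:.1f}%")                          # 20.5%
print(f"Personalized SMT over Google NMT (en-it): {personal_gain:.1f}%")  # 109.4%
```

Keep in mind that the two tables use different language pairs and different evaluation sets, so the two percentages show the scale of each gain and are not a like-for-like benchmark.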

In January 2017, Google CEO Sundar Pichai was referring to NMT when he told investors, “We have improved our translation ability more in one single year than all our improvements over the last 10 years combined.” Clearly, big-data NMT is a significant improvement over online big-data SMT services but it is not a revolutionary improvement for our language industry.

When a translator properly uses a personalized translation engine as intended, the engine will out-perform big-data. We need to continue to explore these new approaches to augment and enhance human translation.

Are there any ideas that you agree with? Any that you would add? Please add them to the comments box below!


About the author: Tom Hoar is the Founder and Owner of Slate Rocks, LLC, a pioneer changing the translation ecosystem with software that empowers professionals to make quality translations more easily. With many years of technology leadership and a tenacious passion for providing technical support to professional translators, he’s become a true industry resource. Tom writes regular posts and blogs on translation technology. Tom is available for technology coaching, training, and keynote speaking. Check out his profile for more information.


My apologies to Gábor Ugray for not acknowledging his voice distinguishing reality from the hype. Gábor, thanks for the private message. — Tom
