65 - Machine Translation

Translation by machines, with the aim of perfectly transforming text into another language.

Rob van Zoest
Founder @ innerdoc.com | NLP Expert-Engineer-Enthusiast | Writes about how to get value from textual data | Lives in the Netherlands | Loves to travel around the globe | Dutchman | rob@innerdoc.com


05 Nov 2020 • 1 min read

Language translation by machines has been one of the most important NLP tasks for decades, because everything starts with understanding each other without barriers. Google Translate is the absolute leader, although it still has shortcomings, and Facebook is in the race with its multilingual machine translation model M2M-100.

However, custom-built models are within reach thanks to the arrival of Neural Machine Translation implementations, which provide sequence-to-sequence models, and of parallel corpora like ParaCrawl and OPUS, which provide the aligned sentence pairs to train them on.
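To make the idea of a parallel corpus concrete, here is a minimal sketch of how aligned sentence pairs could be prepared before training a custom model. It assumes the corpus comes as two line-aligned lists (line i of the source matches line i of the target). The function names, the length-ratio threshold and the example sentences are made up for illustration and are not taken from ParaCrawl or OPUS.

```python
from typing import Iterable, List, Tuple

Pair = Tuple[str, str]


def pair_lines(source_lines: Iterable[str], target_lines: Iterable[str]) -> List[Pair]:
    """Pair line i of the source side with line i of the target side."""
    pairs = []
    for src, tgt in zip(source_lines, target_lines):
        src, tgt = src.strip(), tgt.strip()
        if src and tgt:  # drop pairs where either side is empty
            pairs.append((src, tgt))
    return pairs


def filter_by_length_ratio(pairs: List[Pair], max_ratio: float = 2.0) -> List[Pair]:
    """Keep pairs whose whitespace token counts differ by at most max_ratio."""
    kept = []
    for src, tgt in pairs:
        n_src, n_tgt = len(src.split()), len(tgt.split())
        if max(n_src, n_tgt) / min(n_src, n_tgt) <= max_ratio:
            kept.append((src, tgt))
    return kept


# Hypothetical English-Dutch lines; the third pair is deliberately misaligned.
english = ["The cat sleeps.", "Good morning.", "Yes."]
dutch = ["De kat slaapt.", "Goedemorgen.", "Dit is een veel te lange zin voor dit paar."]

pairs = filter_by_length_ratio(pair_lines(english, dutch))
for src, tgt in pairs:
    print(f"{src}\t{tgt}")
```

The length-ratio check is only a crude stand-in for real corpus cleaning, but it shows why crawled sentence pairs usually need filtering before they are fed to a sequence-to-sequence model.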

With the growing quality of Machine Translation models, there is also an opportunity to translate training datasets into another language with better results. English is almost always the language used for NLP blogs, model demos and SOTA leaderboards. Translating these superior resources might let you benefit from them in other languages.
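A minimal sketch of that idea: translate only the text field of a labeled dataset and keep the labels as they are. The `translate` argument is a placeholder for whatever Machine Translation model or service you use. The toy word-by-word lexicon below is purely hypothetical and only there to make the example runnable.

```python
from typing import Callable, Dict, List

Example = Dict[str, str]


def translate_dataset(
    examples: List[Example],
    translate: Callable[[str], str],
    text_field: str = "text",
) -> List[Example]:
    """Return a copy of the dataset with only the text field translated."""
    translated = []
    for example in examples:
        new_example = dict(example)  # copy, so labels and other fields are kept
        new_example[text_field] = translate(example[text_field])
        translated.append(new_example)
    return translated


# Hypothetical stand-in for a real MT model: a word-by-word English-Dutch lookup.
TOY_LEXICON = {"great": "geweldig", "movie": "film", "bad": "slecht", "book": "boek"}


def toy_translate(text: str) -> str:
    """Lowercase the text and replace each known word; unknown words pass through."""
    return " ".join(TOY_LEXICON.get(word, word) for word in text.lower().split())


english_data = [
    {"text": "Great movie", "label": "positive"},
    {"text": "Bad book", "label": "negative"},
]
dutch_data = translate_dataset(english_data, toy_translate)
print(dutch_data)
```

Passing the translator in as a function keeps the dataset logic independent of the model, so the toy lookup can be swapped for a real translation call without touching the rest. Labels that depend on token positions, such as entity spans, would need extra alignment work that this sketch does not cover.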

The figure below shows a Word Alignment matrix from a Neural Machine Translation task. Each pixel shows the weight of the annotation and indicates which positions in the source sentence were considered more important when generating the target word.

Figure: Word Alignment matrix for a translated sentence (source)
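How such a matrix comes about can be sketched in a few lines. The example assumes that the model produces one raw score per (target word, source word) combination, and that a softmax over the source positions turns each row of scores into weights that sum to one. The scores and the sentence pair below are made up for illustration and do not come from a trained model.

```python
import math
from typing import List


def softmax(scores: List[float]) -> List[float]:
    """Turn raw scores into positive weights that sum to 1."""
    highest = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def alignment_matrix(scores: List[List[float]]) -> List[List[float]]:
    """One row per target word, one column per source word."""
    return [softmax(row) for row in scores]


def hard_alignment(weights: List[List[float]]) -> List[int]:
    """For each target word, the index of the source word with the largest weight."""
    return [max(range(len(row)), key=row.__getitem__) for row in weights]


source = ["the", "black", "cat"]
target = ["de", "zwarte", "kat"]

# Hypothetical raw scores: rows follow the target words, columns the source words.
scores = [
    [4.0, 1.0, 0.5],
    [0.5, 3.5, 1.0],
    [0.2, 1.0, 4.2],
]

weights = alignment_matrix(scores)
for word, row, j in zip(target, weights, hard_alignment(weights)):
    print(word, [round(w, 2) for w in row], "->", source[j])
```

Each row of `weights` corresponds to one row of pixels in an alignment plot: the brighter the pixel, the more that source position contributed to the target word.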

This article is part of the project Periodic Table of NLP Tasks. Click to read more about the making of the Periodic Table and the project to systemize NLP tasks.
