Memory-augmented Neural Machine Translation

Yang Feng, Shiyue Zhang, Andi Zhang, Dong Wang*, Andrew Abel

*Corresponding author for this work

Research output: Chapter in Book or Report/Conference proceedingConference Proceedingpeer-review

38 Citations (Scopus)

Abstract

Neural machine translation (NMT) has achieved notable success in recent times, however it is also widely recognized that this approach has limitations with handling infrequent words and word pairs. This paper presents a novel memory-augmented NMT (M-NMT) architecture, which stores knowledge about how words (usually infrequently encountered ones) should be translated in a memory and then utilizes them to assist the neural model. We use this memory mechanism to combine the knowledge learned from a conventional statistical machine translation system and the rules learned by an NMT system, and also propose a solution for out-of-vocabulary (OOV) words based on this framework. Our experiments on two Chinese-English translation tasks demonstrated that the M-NMT architecture outperformed the NMT baseline by 9.0 and 2.7 BLEU points on the two tasks, respectively. Additionally, we found this architecture resulted in a much more effective OOV treatment compared to competitive methods.

Original languageEnglish
Title of host publicationEMNLP 2017 - Conference on Empirical Methods in Natural Language Processing, Proceedings
PublisherAssociation for Computational Linguistics (ACL)
Pages1390-1399
Number of pages10
ISBN (Electronic)9781945626838
DOIs
Publication statusPublished - 2017
Event2017 Conference on Empirical Methods in Natural Language Processing, EMNLP 2017 - Copenhagen, Denmark
Duration: 9 Sept 201711 Sept 2017

Publication series

NameEMNLP 2017 - Conference on Empirical Methods in Natural Language Processing, Proceedings

Conference

Conference2017 Conference on Empirical Methods in Natural Language Processing, EMNLP 2017
Country/TerritoryDenmark
CityCopenhagen
Period9/09/1711/09/17

Fingerprint

Dive into the research topics of 'Memory-augmented Neural Machine Translation'. Together they form a unique fingerprint.

Cite this