TY - GEN
T1 - Memory-augmented Neural Machine Translation
AU - Feng, Yang
AU - Zhang, Shiyue
AU - Zhang, Andi
AU - Wang, Dong
AU - Abel, Andrew
N1 - Publisher Copyright: © 2017 Association for Computational Linguistics.
PY - 2017
Y1 - 2017
AB - Neural machine translation (NMT) has achieved notable success in recent times; however, it is also widely recognized that this approach has limitations in handling infrequent words and word pairs. This paper presents a novel memory-augmented NMT (M-NMT) architecture, which stores knowledge about how words (usually infrequently encountered ones) should be translated in a memory and then utilizes this knowledge to assist the neural model. We use this memory mechanism to combine the knowledge learned from a conventional statistical machine translation system and the rules learned by an NMT system, and also propose a solution for out-of-vocabulary (OOV) words based on this framework. Our experiments on two Chinese-English translation tasks demonstrated that the M-NMT architecture outperformed the NMT baseline by 9.0 and 2.7 BLEU points on the two tasks, respectively. Additionally, we found that this architecture resulted in a much more effective OOV treatment compared to competitive methods.
UR - http://www.scopus.com/inward/record.url?scp=85073170444&partnerID=8YFLogxK
U2 - 10.18653/v1/d17-1146
DO - 10.18653/v1/d17-1146
M3 - Conference Proceeding
AN - SCOPUS:85073170444
T3 - EMNLP 2017 - Conference on Empirical Methods in Natural Language Processing, Proceedings
SP - 1390
EP - 1399
BT - EMNLP 2017 - Conference on Empirical Methods in Natural Language Processing, Proceedings
PB - Association for Computational Linguistics (ACL)
T2 - 2017 Conference on Empirical Methods in Natural Language Processing, EMNLP 2017
Y2 - 9 September 2017 through 11 September 2017
ER -