Dropped pronoun generation for dialogue machine translation

Longyue Wang, Xiaojun Zhang, Zhaopeng Tu, Hang Li, Qun Liu

Research output: Chapter in Book or Report/Conference proceeding > Conference Proceeding > peer-review

11 Citations (Scopus)

Abstract

Dropped pronouns (DPs) are a common problem in dialogue machine translation: pronouns that are frequently omitted in the source sentence are consequently missing from its translation. To address this problem, we propose a novel approach to improving the translation of DPs in dialogue machine translation. First, we build training data for DP generation, in which the DPs are automatically added according to alignment information from a parallel corpus. We then model DP generation as a sequence labelling task and develop a generation model based on recurrent neural networks and language models. Finally, we apply the DP generator to the machine translation task by completing the source sentences with the missing pronouns. Experimental results show that our approach achieves a significant improvement of 1.7 BLEU points by recalling possible DPs in the source sentences.
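
The sequence labelling formulation in the abstract can be illustrated with a minimal sketch. It assumes one label per source token naming the pronoun dropped immediately before that token. The label `O`, the function names, and the example sentence are hypothetical choices for illustration, not the authors' implementation. The recurrent neural network and language model that would predict the labels are not shown.

```python
from typing import Dict, List

NO_DP = "O"  # hypothetical label: no pronoun was dropped before this token


def make_labels(tokens: List[str], dropped: Dict[int, str]) -> List[str]:
    """Turn known DP positions into one label per token.

    `dropped` maps a token index to the pronoun missing just before it.
    In the paper such positions come from word alignments in a parallel
    corpus; here they are simply given as input (an assumption).
    """
    for index in dropped:
        if not 0 <= index < len(tokens):
            raise ValueError(f"position {index} is outside the sentence")
    return [dropped.get(i, NO_DP) for i in range(len(tokens))]


def insert_dropped_pronouns(tokens: List[str], labels: List[str]) -> List[str]:
    """Complete a source sentence by placing each labelled pronoun before its token."""
    if len(tokens) != len(labels):
        raise ValueError("tokens and labels must have the same length")
    completed: List[str] = []
    for token, label in zip(tokens, labels):
        if label != NO_DP:
            completed.append(label)  # recover the missing pronoun
        completed.append(token)
    return completed


# Illustrative Chinese sentence with the subject pronoun omitted.
tokens = ["喜欢", "这", "份", "工作", "吗", "?"]
labels = make_labels(tokens, {0: "你"})
completed = insert_dropped_pronouns(tokens, labels)
```

In this sketch the completed sentence, rather than the original, would be passed to the translation system, which is the role the abstract gives to the DP generator.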

Original language: English
Title of host publication: 2016 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2016 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 6110-6114
Number of pages: 5
ISBN (Electronic): 9781479999880
Publication status: Published - 18 May 2016
Externally published: Yes
Event: 41st IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2016 - Shanghai, China
Duration: 20 Mar 2016 to 25 Mar 2016

Publication series

Name: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Volume: 2016-May
ISSN (Print): 1520-6149

Conference

Conference: 41st IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2016
Country/Territory: China
City: Shanghai
Period: 20/03/16 to 25/03/16

Keywords

  • Dialogue
  • Dropped Pronoun
  • Machine Translation
