A novel approach to dropped pronoun translation

Longyue Wang; Zhaopeng Tu; Xiaojun Zhang; Hang Li; Andy Way; Qun Liu

doi:10.18653/v1/n16-1113

Research output: Chapter in Book or Report/Conference proceeding › Conference Proceeding › peer-review

37 Citations (Scopus)

Abstract

Dropped pronouns (DPs), which are frequently omitted in the source language but should be retained in the target language, are a challenge in machine translation. In response to this problem, we propose a semi-supervised approach to recall possibly missing pronouns in the translation. Firstly, we build training data for DP generation in which the DPs are automatically labelled according to the alignment information from a parallel corpus. Secondly, we build a deep learning-based DP generator for input sentences in decoding when no corresponding references exist. More specifically, the generation is two-phase: (1) DP position detection, which is modeled as a sequential labelling task with recurrent neural networks; and (2) DP prediction, which employs a multilayer perceptron with rich features. Finally, we integrate the above outputs into our translation system to recall missing pronouns by both extracting rules from the DP-labelled training data and translating the DP-generated input sentences. Experimental results show that our approach achieves a significant improvement of 1.58 BLEU points in translation performance with 66% F-score for DP generation accuracy.
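
The first step in the abstract, labelling DPs automatically from alignment information, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the pronoun list, the `<DP:...>` tag format, the function name `label_dropped_pronouns`, and the placement heuristic (put the tag before the source word aligned to the next aligned target word) are all assumptions made for illustration, and the sentence pair is invented.

```python
from typing import List, Set, Tuple

# Assumption: a small closed set of target-side (English) pronouns.
PRONOUNS: Set[str] = {
    "i", "you", "he", "she", "it", "we", "they",
    "me", "him", "her", "us", "them",
}


def label_dropped_pronouns(
    src: List[str],
    tgt: List[str],
    alignment: Set[Tuple[int, int]],
) -> List[str]:
    """Return src with a <DP:pronoun> tag inserted for each unaligned target pronoun.

    alignment holds (source_index, target_index) pairs from a word aligner.
    Hypothetical heuristic: a target pronoun with no aligned source word is
    treated as dropped on the source side.
    """
    aligned_tgt = {t for _, t in alignment}
    insertions: List[Tuple[int, str]] = []  # (source position, tag)

    for t_idx, word in enumerate(tgt):
        if word.lower() not in PRONOUNS or t_idx in aligned_tgt:
            continue
        # Find the next aligned target word and reuse its source position.
        later = sorted(t for _, t in alignment if t > t_idx)
        if later:
            pos = min(s for s, t in alignment if t == later[0])
        else:
            pos = len(src)  # nothing follows: append at the end
        insertions.append((pos, f"<DP:{word.lower()}>"))

    # Rebuild the source sentence with the tags placed before their positions.
    labelled: List[str] = []
    for i in range(len(src) + 1):
        labelled.extend(tag for p, tag in insertions if p == i)
        if i < len(src):
            labelled.append(src[i])
    return labelled


# Invented example: the source omits the subject that the target expresses.
src = ["喜欢", "这", "本", "书", "吗"]
tgt = ["Do", "you", "like", "this", "book", "?"]
alignment = {(0, 2), (1, 3), (3, 4), (4, 5)}
print(label_dropped_pronouns(src, tgt, alignment))
```

Sentences labelled this way could serve as training data for the two-phase generator the abstract describes (position detection, then pronoun prediction); real alignments are noisy, so a practical version would need filtering that this sketch omits.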

Original language	English
Title of host publication	2016 Conference of the North American Chapter of the Association for Computational Linguistics
Subtitle of host publication	Human Language Technologies, NAACL HLT 2016 - Proceedings of the Conference
Publisher	Association for Computational Linguistics (ACL)
Pages	983-993
Number of pages	11
ISBN (Electronic)	9781941643914
DOIs	https://doi.org/10.18653/v1/n16-1113
Publication status	Published - 2016
Externally published	Yes
Event	15th Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL HLT 2016 - San Diego, United States (12 Jun 2016 → 17 Jun 2016)

Publication series

Name	2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL HLT 2016 - Proceedings of the Conference

Conference

Conference	15th Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL HLT 2016
Country/Territory	United States
City	San Diego
Period	12/06/16 → 17/06/16

Access to Document

10.18653/v1/n16-1113

Cite this

Wang, L., Tu, Z., Zhang, X., Li, H., Way, A., & Liu, Q. (2016). A novel approach to dropped pronoun translation. In 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL HLT 2016 - Proceedings of the Conference (pp. 983-993). (2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL HLT 2016 - Proceedings of the Conference). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/n16-1113

Wang, Longyue ; Tu, Zhaopeng ; Zhang, Xiaojun et al. / A novel approach to dropped pronoun translation. 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL HLT 2016 - Proceedings of the Conference. Association for Computational Linguistics (ACL), 2016. pp. 983-993 (2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL HLT 2016 - Proceedings of the Conference).

@inproceedings{ea55cb7ea5a3435da5bb9a61f8857099,

title = "A novel approach to dropped pronoun translation",

abstract = "Dropped pronouns (DPs), which are frequently omitted in the source language but should be retained in the target language, are a challenge in machine translation. In response to this problem, we propose a semi-supervised approach to recall possibly missing pronouns in the translation. Firstly, we build training data for DP generation in which the DPs are automatically labelled according to the alignment information from a parallel corpus. Secondly, we build a deep learning-based DP generator for input sentences in decoding when no corresponding references exist. More specifically, the generation is two-phase: (1) DP position detection, which is modeled as a sequential labelling task with recurrent neural networks; and (2) DP prediction, which employs a multilayer perceptron with rich features. Finally, we integrate the above outputs into our translation system to recall missing pronouns by both extracting rules from the DP-labelled training data and translating the DP-generated input sentences. Experimental results show that our approach achieves a significant improvement of 1.58 BLEU points in translation performance with 66% F-score for DP generation accuracy.",

author = "Longyue Wang and Zhaopeng Tu and Xiaojun Zhang and Hang Li and Andy Way and Qun Liu",

note = "Publisher Copyright: {\textcopyright}2016 Association for Computational Linguistics.; 15th Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL HLT 2016 ; Conference date: 12-06-2016 Through 17-06-2016",

year = "2016",

doi = "10.18653/v1/n16-1113",

language = "English",

series = "2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL HLT 2016 - Proceedings of the Conference",

publisher = "Association for Computational Linguistics (ACL)",

pages = "983--993",

booktitle = "2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL HLT 2016 - Proceedings of the Conference",

}

Wang, L, Tu, Z, Zhang, X, Li, H, Way, A & Liu, Q 2016, A novel approach to dropped pronoun translation. in 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL HLT 2016 - Proceedings of the Conference. 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL HLT 2016 - Proceedings of the Conference, Association for Computational Linguistics (ACL), pp. 983-993, 15th Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL HLT 2016, San Diego, United States, 12/06/16. https://doi.org/10.18653/v1/n16-1113

A novel approach to dropped pronoun translation. / Wang, Longyue; Tu, Zhaopeng; Zhang, Xiaojun et al.
2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL HLT 2016 - Proceedings of the Conference. Association for Computational Linguistics (ACL), 2016. p. 983-993 (2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL HLT 2016 - Proceedings of the Conference).

TY - GEN

T1 - A novel approach to dropped pronoun translation

AU - Wang, Longyue

AU - Tu, Zhaopeng

AU - Zhang, Xiaojun

AU - Li, Hang

AU - Way, Andy

AU - Liu, Qun

PY - 2016

Y1 - 2016

AB - Dropped pronouns (DPs), which are frequently omitted in the source language but should be retained in the target language, are a challenge in machine translation. In response to this problem, we propose a semi-supervised approach to recall possibly missing pronouns in the translation. Firstly, we build training data for DP generation in which the DPs are automatically labelled according to the alignment information from a parallel corpus. Secondly, we build a deep learning-based DP generator for input sentences in decoding when no corresponding references exist. More specifically, the generation is two-phase: (1) DP position detection, which is modeled as a sequential labelling task with recurrent neural networks; and (2) DP prediction, which employs a multilayer perceptron with rich features. Finally, we integrate the above outputs into our translation system to recall missing pronouns by both extracting rules from the DP-labelled training data and translating the DP-generated input sentences. Experimental results show that our approach achieves a significant improvement of 1.58 BLEU points in translation performance with 66% F-score for DP generation accuracy.

UR - http://www.scopus.com/inward/record.url?scp=84994155827&partnerID=8YFLogxK

U2 - 10.18653/v1/n16-1113

DO - 10.18653/v1/n16-1113

M3 - Conference Proceeding

AN - SCOPUS:84994155827

T3 - 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL HLT 2016 - Proceedings of the Conference

SP - 983

EP - 993

BT - 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL HLT 2016 - Proceedings of the Conference

PB - Association for Computational Linguistics (ACL)

T2 - 15th Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL HLT 2016

Y2 - 12 June 2016 through 17 June 2016

ER -

Wang L, Tu Z, Zhang X, Li H, Way A, Liu Q. A novel approach to dropped pronoun translation. In 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL HLT 2016 - Proceedings of the Conference. Association for Computational Linguistics (ACL). 2016. p. 983-993. (2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL HLT 2016 - Proceedings of the Conference). doi: 10.18653/v1/n16-1113
