Collaborative learning for language and speaker recognition

Lantian Li, Zhiyuan Tang, Dong Wang*, Andrew Abel, Yang Feng, Shiyue Zhang

*Corresponding author for this work

Research output: Chapter in Book or Report/Conference proceeding › Conference Proceeding › peer-review


Abstract

This paper presents a unified model that performs language and speaker recognition simultaneously. The model is based on a multi-task recurrent neural network in which the output of one task is fed in as input to the other, leading to a collaborative learning framework that can improve both language and speaker recognition by sharing information between the tasks. The preliminary experiments presented in this paper demonstrate that the multi-task model outperforms comparable task-specific models on both the language and the speaker task. The improvement in language recognition is especially remarkable, which we attribute to the speaker normalization effect of the information supplied by the speaker recognition component.
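The collaborative feedback loop described in the abstract — each task's recurrent layer receiving the other task's previous state alongside the shared acoustic features — can be sketched as follows. This is an illustrative NumPy toy with made-up dimensions and plain tanh recurrent cells, not the authors' actual architecture; the cell structure, feature dimension, and initialization are all assumptions for demonstration only:

```python
import numpy as np

rng = np.random.default_rng(0)

def init_cell(in_dim, hid_dim):
    # parameters of a simple tanh recurrent cell
    return {
        "Wx": rng.standard_normal((hid_dim, in_dim)) * 0.1,
        "Wh": rng.standard_normal((hid_dim, hid_dim)) * 0.1,
        "b": np.zeros(hid_dim),
    }

def step(cell, x, h):
    # one recurrent update: new hidden state from input x and previous state h
    return np.tanh(cell["Wx"] @ x + cell["Wh"] @ h + cell["b"])

feat_dim, hid = 40, 16
# each task's cell sees the acoustic features concatenated with the
# other task's previous hidden state -- the "collaborative" feedback link
lang_cell = init_cell(feat_dim + hid, hid)
spk_cell = init_cell(feat_dim + hid, hid)

T = 5  # number of frames in this toy utterance
feats = rng.standard_normal((T, feat_dim))
h_lang = np.zeros(hid)
h_spk = np.zeros(hid)
for x in feats:
    h_lang_new = step(lang_cell, np.concatenate([x, h_spk]), h_lang)
    h_spk_new = step(spk_cell, np.concatenate([x, h_lang]), h_spk)
    h_lang, h_spk = h_lang_new, h_spk_new

print(h_lang.shape, h_spk.shape)  # → (16,) (16,)
```

In a trained system each hidden state would feed a task-specific classifier (language ID vs. speaker ID), and the cross-task connections would let speaker information normalize the language branch, as the abstract suggests.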

Original language: English
Title of host publication: Man-Machine Speech Communication - 14th National Conference, NCMMSC 2017, Revised Selected Papers
Editors: Ya Li, Thomas Fang Zheng, Changchun Bao, Dong Wang, Jianhua Tao
Publisher: Springer Verlag
Pages: 58-69
Number of pages: 12
ISBN (Print): 9789811081101
Publication status: Published - 2018
Event: 14th National Conference on Man-Machine Speech Communication, NCMMSC 2017 - Lianyungang, China
Duration: 11 Oct 2017 to 13 Oct 2017

Publication series

Name: Communications in Computer and Information Science
Volume: 807
ISSN (Print): 1865-0929

