Collaborative learning for language and speaker recognition

Lantian Li, Zhiyuan Tang, Dong Wang*, Andrew Abel, Yang Feng, Shiyue Zhang

*Corresponding author for this work

Research output: Chapter in Book or Report/Conference proceeding › Conference Proceeding › peer-review


Abstract

This paper presents a unified model that performs language and speaker recognition simultaneously. The model is based on a multi-task recurrent neural network in which the output of one task is fed in as input to the other, leading to a collaborative learning framework that can improve both language and speaker recognition by sharing information between the tasks. The preliminary experiments presented in this paper demonstrate that the multi-task model outperforms comparable task-specific models on both the language and the speaker task. The improvement in language recognition is especially remarkable, which we attribute to the speaker normalization effect of the information supplied by the speaker recognition component.
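The collaborative feedback loop described in the abstract — each task's recurrent layer receiving the other task's previous state alongside the shared acoustic features — can be sketched as follows. This is an illustrative NumPy toy with made-up dimensions and plain tanh recurrent cells, not the authors' actual architecture; the cell structure, feature dimension, and initialization are all assumptions for demonstration only:

```python
import numpy as np

rng = np.random.default_rng(0)

def init_cell(in_dim, hid_dim):
    # parameters of a simple tanh recurrent cell
    return {
        "Wx": rng.standard_normal((hid_dim, in_dim)) * 0.1,
        "Wh": rng.standard_normal((hid_dim, hid_dim)) * 0.1,
        "b": np.zeros(hid_dim),
    }

def step(cell, x, h):
    # one recurrent update: new hidden state from input x and previous state h
    return np.tanh(cell["Wx"] @ x + cell["Wh"] @ h + cell["b"])

feat_dim, hid = 40, 16
# each task's cell sees the acoustic features concatenated with the
# other task's previous hidden state -- the "collaborative" feedback link
lang_cell = init_cell(feat_dim + hid, hid)
spk_cell = init_cell(feat_dim + hid, hid)

T = 5  # number of frames in this toy utterance
feats = rng.standard_normal((T, feat_dim))
h_lang = np.zeros(hid)
h_spk = np.zeros(hid)
for x in feats:
    h_lang_new = step(lang_cell, np.concatenate([x, h_spk]), h_lang)
    h_spk_new = step(spk_cell, np.concatenate([x, h_lang]), h_spk)
    h_lang, h_spk = h_lang_new, h_spk_new

print(h_lang.shape, h_spk.shape)  # → (16,) (16,)
```

In a trained system each hidden state would feed a task-specific classifier (language ID vs. speaker ID), and the cross-task connections would let speaker information normalize the language branch, as the abstract suggests.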

Original language: English
Title of host publication: Man-Machine Speech Communication - 14th National Conference, NCMMSC 2017, Revised Selected Papers
Editors: Ya Li, Thomas Fang Zheng, Changchun Bao, Dong Wang, Jianhua Tao
Publisher: Springer Verlag
Pages: 58-69
Number of pages: 12
ISBN (Print): 9789811081101
Publication status: Published - 2018
Event: 14th National Conference on Man-Machine Speech Communication, NCMMSC 2017 - Lianyungang, China
Duration: 11 Oct 2017 to 13 Oct 2017

Publication series

Name: Communications in Computer and Information Science
Volume: 807
ISSN (Print): 1865-0929

