Prototypical networks for small footprint text-independent speaker verification

Tom Ko; Yangbin Chen; Jianping Wang

doi:10.1109/ICASSP40776.2020.9054471

Research output: Chapter in Book or Report/Conference proceeding › Conference Proceeding › peer-review

19 Citations (Scopus)

Abstract

Speaker verification aims to recognize target speakers with very few enrollment utterances. Conventional approaches learn a representation model to extract the speaker embeddings for verification. Recently, several meta-learning approaches have been proposed that try to learn a shared metric space. Among these approaches, prototypical networks aim to learn a non-linear mapping from the input space to an embedding space with a predefined distance metric. In this paper, we investigate the use of prototypical networks in a small footprint text-independent speaker verification task. Our work is evaluated on the SRE10 evaluation set. Experiments show that prototypical networks outperform the conventional method when the amount of data per training speaker is limited.
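The scoring idea behind prototypical networks can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the distance metric or the network, so squared Euclidean distance is assumed here, the embeddings are hard-coded stand-ins for the output of a trained network, and the function names and the threshold value are hypothetical.

```python
from statistics import fmean


def prototype(embeddings):
    """Average a speaker's enrollment embeddings, dimension by dimension."""
    return [fmean(dim) for dim in zip(*embeddings)]


def squared_euclidean(a, b):
    """Assumed distance metric; the abstract only says 'predefined'."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def verify(enrollment, test_embedding, threshold):
    """Accept the claimed speaker if the test embedding lies close to the prototype."""
    proto = prototype(enrollment)
    return squared_euclidean(proto, test_embedding) <= threshold


# Toy 2-dimensional embeddings standing in for network outputs (assumption).
enrollment = [[1.0, 0.0], [0.8, 0.2], [1.2, -0.2]]

print(verify(enrollment, [0.9, 0.1], threshold=0.5))   # close to the prototype
print(verify(enrollment, [-1.0, 1.0], threshold=0.5))  # far from the prototype
```

In practice the threshold would be tuned on held-out trials, and the embeddings would come from the learned non-linear mapping rather than fixed lists.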

Original language	English
Title of host publication	IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Pages	6804-6808
DOIs	https://doi.org/10.1109/ICASSP40776.2020.9054471
Publication status	Published - 2020
Externally published	Yes

Cite this

@inproceedings{9220d5e7133a40dea5157b42a79c3800,
  title = "Prototypical networks for small footprint text-independent speaker verification",
  abstract = "Speaker verification aims to recognize target speakers with very few enrollment utterances. Conventional approaches learn a representation model to extract the speaker embeddings for verification. Recently, there are several new approaches in meta-learning which try to learn a shared metric space. Among these approaches, prototypical networks aim at learning a non-linear mapping from the input space to an embedding space with a predefined distance metric. In this paper, we investigate the use of prototypical networks in a small footprint text-independent speaker verification task. Our work is evaluated on SRE10 evaluation set. Experiments show that prototypical networks outperform the conventional method when the amount of data per training speaker is limited.",
  author = "Tom Ko and Yangbin Chen and Jianping Wang",
  year = "2020",
  doi = "10.1109/ICASSP40776.2020.9054471",
  language = "English",
  pages = "6804--6808",
  booktitle = "IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)",
}

TY  - GEN
T1  - Prototypical networks for small footprint text-independent speaker verification
AU  - Ko, Tom
AU  - Chen, Yangbin
AU  - Wang, Jianping
PY  - 2020
Y1  - 2020
N2  - Speaker verification aims to recognize target speakers with very few enrollment utterances. Conventional approaches learn a representation model to extract the speaker embeddings for verification. Recently, there are several new approaches in meta-learning which try to learn a shared metric space. Among these approaches, prototypical networks aim at learning a non-linear mapping from the input space to an embedding space with a predefined distance metric. In this paper, we investigate the use of prototypical networks in a small footprint text-independent speaker verification task. Our work is evaluated on SRE10 evaluation set. Experiments show that prototypical networks outperform the conventional method when the amount of data per training speaker is limited.
AB  - Speaker verification aims to recognize target speakers with very few enrollment utterances. Conventional approaches learn a representation model to extract the speaker embeddings for verification. Recently, there are several new approaches in meta-learning which try to learn a shared metric space. Among these approaches, prototypical networks aim at learning a non-linear mapping from the input space to an embedding space with a predefined distance metric. In this paper, we investigate the use of prototypical networks in a small footprint text-independent speaker verification task. Our work is evaluated on SRE10 evaluation set. Experiments show that prototypical networks outperform the conventional method when the amount of data per training speaker is limited.
U2  - 10.1109/ICASSP40776.2020.9054471
DO  - 10.1109/ICASSP40776.2020.9054471
M3  - Conference Proceeding
SP  - 6804
EP  - 6808
BT  - IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
ER  -
