
Optimizing Multi-Taper Features for Deep Speaker Verification

  • Université de Lorraine
  • University of Eastern Finland

Research output: Contribution to journal › Article › peer-review

3 Citations (Scopus)

Abstract

Multi-taper estimators provide low-variance power spectrum estimates that can be used in place of the windowed discrete Fourier transform (DFT) to extract speech features such as mel-frequency cepstral coefficients (MFCCs). Although past work has reported promising automatic speaker verification (ASV) results with Gaussian mixture model-based classifiers, the performance of multi-taper MFCCs with deep ASV systems remains an open question. Instead of a static-taper design, we propose to optimize the multi-taper estimator jointly with a deep neural network trained for ASV tasks. With a maximum improvement on the SITW corpus of 25.8% in terms of equal error rate over the static-taper design, our method helps preserve a balanced level of leakage and variance, providing more robustness.
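
For readers unfamiliar with the technique, a static multi-taper power spectrum estimate can be sketched as follows. This is only an illustration under stated assumptions, not the method proposed in the article: it uses fixed sine tapers with uniform weights (an assumed choice, since the abstract does not name the taper family), and the function names, the 400-sample frame, and the taper count of 6 are hypothetical. The learned, jointly optimized estimator described above is not reproduced here.

```python
import numpy as np


def sine_tapers(frame_len, num_tapers):
    """Return a (num_tapers, frame_len) array of orthonormal sine tapers.

    Assumption: sine tapers are used purely for illustration; the article
    may rely on a different taper family.
    """
    n = np.arange(1, frame_len + 1)
    k = np.arange(1, num_tapers + 1)[:, None]
    return np.sqrt(2.0 / (frame_len + 1)) * np.sin(np.pi * k * n / (frame_len + 1))


def multitaper_power_spectrum(frame, tapers, weights=None):
    """Weighted average of the power spectra of the tapered copies of a frame."""
    num_tapers = tapers.shape[0]
    if weights is None:
        # Uniform weights: a plain average over the tapered periodograms.
        weights = np.full(num_tapers, 1.0 / num_tapers)
    # One windowed DFT per taper, then squared magnitude.
    spectra = np.abs(np.fft.rfft(tapers * frame, axis=1)) ** 2
    return weights @ spectra


# Hypothetical input: one 400-sample frame of white noise.
rng = np.random.default_rng(0)
frame = rng.standard_normal(400)
tapers = sine_tapers(400, 6)
spectrum = multitaper_power_spectrum(frame, tapers)
```

Averaging over several orthogonal tapers is what lowers the variance relative to a single windowed DFT. In an MFCC pipeline, `spectrum` would take the place of the single-window power spectrum before the mel filterbank is applied.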

Original language: English
Pages (from-to): 2187-2191
Number of pages: 5
Journal: IEEE Signal Processing Letters
Volume: 28
DOIs
Publication status: Published - 2021
Externally published: Yes

Keywords

  • Multi-Taper spectrum
  • speaker verification
