TY - GEN
T1 - Learnable Nonlinear Compression for Robust Speaker Verification
AU - Liu, Xuechen
AU - Sahidullah, Md
AU - Kinnunen, Tomi
N1 - Publisher Copyright: © 2022 IEEE
PY - 2022
Y1 - 2022
N2 - In this study, we focus on nonlinear compression methods for spectral features in speaker verification based on deep neural networks. We consider different kinds of channel-dependent (CD) nonlinear compression methods optimized in a data-driven manner. Our methods are based on power nonlinearities and dynamic range compression (DRC). We also propose a multi-regime (MR) design for the nonlinearities, aimed at improving robustness. Results on VoxCeleb1 and VoxMovies data demonstrate improvements brought by the proposed compression methods over both the commonly used logarithm and their static counterparts, especially for those based on the power function. While CD generalization improves performance on VoxCeleb1, MR provides more robustness on VoxMovies, with a maximum relative equal error rate reduction of 21.6%.
KW - Multi-Regime Compression
KW - Nonlinear Compression
KW - Speaker Verification
UR - https://www.scopus.com/pages/publications/85131255463
U2 - 10.1109/ICASSP43922.2022.9747185
DO - 10.1109/ICASSP43922.2022.9747185
M3 - Conference Proceeding
AN - SCOPUS:85131255463
T3 - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
SP - 7962
EP - 7966
BT - 2022 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2022 - Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2022
Y2 - 22 May 2022 through 27 May 2022
ER -