TY - GEN
T1 - Echo Cancelation and Noise Suppression by Training a Dual-Stream Recurrent Network with a Mixture of Training Targets
AU - Alishahi, Fatemeh
AU - Cao, Yin
AU - Kim, Youngkoen
AU - Mohammad, Asif
N1 - Publisher Copyright: © 2022 IEEE.
PY - 2022
Y1 - 2022
N2 - Nonlinear echo in the presence of background noise can degrade the performance of digital signal processing algorithms. Deep neural networks, with their ability to model complex nonlinear functions, can potentially address this issue. In this paper, a deep and causal neural network based on dual streaming of the near-end microphone and far-end speech signals is employed for real-time nonlinear echo cancellation and noise suppression. The extracted features of the two streams are coupled into a shared neural network for joint echo and noise cancellation. The training target is a mixture of spectral mapping and masking-based targets, which are gated through a feedforward neural network. The model is evaluated in terms of both signal-level and perception-level metrics for different scenarios with SI-SDR values as low as -25 dB. Furthermore, the effect of mixing training targets is assessed by evaluating different models.
AB - Nonlinear echo in the presence of background noise can degrade the performance of digital signal processing algorithms. Deep neural networks, with their ability to model complex nonlinear functions, can potentially address this issue. In this paper, a deep and causal neural network based on dual streaming of the near-end microphone and far-end speech signals is employed for real-time nonlinear echo cancellation and noise suppression. The extracted features of the two streams are coupled into a shared neural network for joint echo and noise cancellation. The training target is a mixture of spectral mapping and masking-based targets, which are gated through a feedforward neural network. The model is evaluated in terms of both signal-level and perception-level metrics for different scenarios with SI-SDR values as low as -25 dB. Furthermore, the effect of mixing training targets is assessed by evaluating different models.
KW - Supervised speech enhancement
KW - deep neural network
KW - recurrent neural networks
KW - training targets
UR - http://www.scopus.com/inward/record.url?scp=85141356722&partnerID=8YFLogxK
U2 - 10.1109/IWAENC53105.2022.9914701
DO - 10.1109/IWAENC53105.2022.9914701
M3 - Conference Proceeding
AN - SCOPUS:85141356722
T3 - International Workshop on Acoustic Signal Enhancement, IWAENC 2022 - Proceedings
BT - International Workshop on Acoustic Signal Enhancement, IWAENC 2022 - Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 17th International Workshop on Acoustic Signal Enhancement, IWAENC 2022
Y2 - 5 September 2022 through 8 September 2022
ER -