基于优化浮值掩蔽的监督性语音分离

Translated title of the contribution: Supervised Speech Separation Using Optimal Ratio Mask

Sha Sha Xia, Xue Liang Zhang*, Shan Liang

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

1 Citation (Scopus)

Abstract

Supervised speech separation uses a supervised learning algorithm to learn a mapping from an input noisy signal to an output target signal. In recent years, with the development of deep learning, supervised separation algorithms have become the most important research direction in speech separation, and the choice of training target has a significant impact on separation performance. The ideal ratio mask is a commonly used training target that can improve the intelligibility and quality of the separated speech. However, it does not take into account the correlation between noise and clean speech. In this paper, we use an optimal ratio mask as the training target and a deep neural network (DNN) as the separation model. Experiments are carried out under various noise environments and signal-to-noise ratio conditions, and the results show that the optimal ratio mask generally outperforms the other training targets.
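
The difference between the two masks can be illustrated on a single time-frequency bin. The abstract does not state the formulas, so the sketch below assumes the definitions commonly given in the literature: the ideal ratio mask uses only the speech and noise powers, while the optimal ratio mask adds the cross term Re(S N*) that captures their correlation. The function names, the exponent beta, and the example values are illustrative assumptions, not the authors' implementation.

```python
def ideal_ratio_mask(s: complex, n: complex, beta: float = 0.5) -> float:
    """Assumed IRM definition for one bin: (|S|^2 / (|S|^2 + |N|^2)) ** beta."""
    ps, pn = abs(s) ** 2, abs(n) ** 2
    if ps + pn == 0.0:
        return 0.0
    return (ps / (ps + pn)) ** beta


def optimal_ratio_mask(s: complex, n: complex) -> float:
    """Assumed ORM definition for one bin, including the cross term Re(S N*)."""
    ps, pn = abs(s) ** 2, abs(n) ** 2
    cross = (s * n.conjugate()).real   # speech/noise correlation term
    denom = ps + pn + 2.0 * cross      # equals |S + N|^2
    if denom == 0.0:
        return 0.0
    return (ps + cross) / denom


# Hypothetical bin in which speech and noise are in phase (correlated).
s, n = 1.0 + 0.0j, 0.5 + 0.0j
y = s + n                              # noisy mixture

irm = ideal_ratio_mask(s, n, beta=1.0)
orm = optimal_ratio_mask(s, n)

irm_error = abs(irm * y - s)           # error left by the IRM estimate
orm_error = abs(orm * y - s)           # error left by the ORM estimate
```

In this example the speech and noise are correlated, so the ORM estimate is closer to the clean bin than the IRM estimate. When the cross term is zero, the two definitions coincide for beta = 1. Under this definition the ORM can fall outside [0, 1] when the cross term is negative, so a practical training target may need to bound or compress it.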

Original language: Chinese (Traditional)
Pages (from-to): 1876-1887
Number of pages: 12
Journal: Zidonghua Xuebao/Acta Automatica Sinica
Volume: 44
Issue number: 10
Publication status: Published - Oct 2018
Externally published: Yes

Keywords

  • Deep neural network (DNN)
  • Speech separation
  • Supervised learning
  • Training targets
