Deep learning based speech separation technology and its developments

Wen Ju Liu*, Shuai Nie, Shan Liang, Xue Liang Zhang

*Corresponding author for this work

Research output: Contribution to journal › Review article › peer-review

32 Citations (Scopus)

Abstract

Speech interaction technology is now widely used in daily life. However, because of interference, the performance of speech interaction systems in real-world environments is still far from satisfactory. Speech separation has proven to be an effective way to improve the performance of speech interaction in noisy environments, and decades of effort have been devoted to it, yielding many methods and considerable success. With the rise of deep learning in particular, deep learning-based speech separation has been proposed and extensively studied; it has shown considerable promise and has become a main research line. Although many deep learning-based speech separation methods have been proposed, there has been little systematic analysis and summary of the technology. We give a detailed analysis and summary of the general procedures and components of deep learning-based speech separation. Moreover, we survey a wide range of supervised speech separation techniques from three aspects: 1) features, 2) targets, and 3) models. Finally, we give some views on its development.

Original language: English
Pages (from-to): 819-833
Number of pages: 15
Journal: Zidonghua Xuebao/Acta Automatica Sinica
Volume: 42
Issue number: 6
Publication status: Published - 1 Jun 2016
Externally published: Yes

Keywords

  • Computational auditory scene analysis
  • Machine learning
  • Neural network
  • Speech separation
