Deep learning based speech separation technology and its developments

Wen Ju Liu; Shuai Nie; Shan Liang; Xue Liang Zhang

doi:10.16383/j.aas.2016.c150734

Wen Ju Liu^*, Shuai Nie, Shan Liang, Xue Liang Zhang

^*Corresponding author for this work

Research output: Contribution to journal › Review article › peer-review

38 Citations (Scopus)

Abstract

Nowadays, speech interaction technology is widely used in daily life. However, due to interference, the performance of speech interaction systems in real-world environments is far from satisfactory. Speech separation technology has proven to be an effective way to improve the performance of speech interaction in noisy environments. To this end, decades of effort have been devoted to speech separation, and many methods have been proposed with considerable success. In particular, with the rise of deep learning, deep learning-based speech separation has been proposed and extensively studied; it has shown considerable promise and become a main line of research. Although many deep learning-based speech separation methods have been proposed so far, there has been little systematic analysis and summary of the technology. We give a detailed analysis and summary of the general procedures and components of speech separation in this regard. Moreover, we survey a wide range of supervised speech separation techniques from three aspects: 1) features, 2) targets, and 3) models. Finally, we give some views on its developments.

Original language	English
Pages (from-to)	819-833
Number of pages	15
Journal	Zidonghua Xuebao/Acta Automatica Sinica
Volume	42
Issue number	6
DOIs	https://doi.org/10.16383/j.aas.2016.c150734
Publication status	Published - 1 Jun 2016
Externally published	Yes

Keywords

Computational auditory scene analysis
Machine learning
Neural network
Speech separation

Access to Document

10.16383/j.aas.2016.c150734

Cite this

@article{da05740f10a44c9681efe5205aa05cf2,

title = "Deep learning based speech separation technology and its developments",

abstract = "Nowadays, speech interaction technology is widely used in daily life. However, due to interference, the performance of speech interaction systems in real-world environments is far from satisfactory. Speech separation technology has proven to be an effective way to improve the performance of speech interaction in noisy environments. To this end, decades of effort have been devoted to speech separation, and many methods have been proposed with considerable success. In particular, with the rise of deep learning, deep learning-based speech separation has been proposed and extensively studied; it has shown considerable promise and become a main line of research. Although many deep learning-based speech separation methods have been proposed so far, there has been little systematic analysis and summary of the technology. We give a detailed analysis and summary of the general procedures and components of speech separation in this regard. Moreover, we survey a wide range of supervised speech separation techniques from three aspects: 1) features, 2) targets, and 3) models. Finally, we give some views on its developments.",

keywords = "Computational auditory scene analysis, Machine learning, Neural network, Speech separation",

author = "Liu, {Wen Ju} and Shuai Nie and Shan Liang and Zhang, {Xue Liang}",

year = "2016",

month = jun,

day = "1",

doi = "10.16383/j.aas.2016.c150734",

language = "English",

volume = "42",

pages = "819--833",

journal = "Zidonghua Xuebao/Acta Automatica Sinica",

issn = "0254-4156",

number = "6",

}

TY - JOUR

T1 - Deep learning based speech separation technology and its developments

AU - Liu, Wen Ju

AU - Nie, Shuai

AU - Liang, Shan

AU - Zhang, Xue Liang

PY - 2016/6/1

Y1 - 2016/6/1

N2 - Nowadays, speech interaction technology is widely used in daily life. However, due to interference, the performance of speech interaction systems in real-world environments is far from satisfactory. Speech separation technology has proven to be an effective way to improve the performance of speech interaction in noisy environments. To this end, decades of effort have been devoted to speech separation, and many methods have been proposed with considerable success. In particular, with the rise of deep learning, deep learning-based speech separation has been proposed and extensively studied; it has shown considerable promise and become a main line of research. Although many deep learning-based speech separation methods have been proposed so far, there has been little systematic analysis and summary of the technology. We give a detailed analysis and summary of the general procedures and components of speech separation in this regard. Moreover, we survey a wide range of supervised speech separation techniques from three aspects: 1) features, 2) targets, and 3) models. Finally, we give some views on its developments.

AB - Nowadays, speech interaction technology is widely used in daily life. However, due to interference, the performance of speech interaction systems in real-world environments is far from satisfactory. Speech separation technology has proven to be an effective way to improve the performance of speech interaction in noisy environments. To this end, decades of effort have been devoted to speech separation, and many methods have been proposed with considerable success. In particular, with the rise of deep learning, deep learning-based speech separation has been proposed and extensively studied; it has shown considerable promise and become a main line of research. Although many deep learning-based speech separation methods have been proposed so far, there has been little systematic analysis and summary of the technology. We give a detailed analysis and summary of the general procedures and components of speech separation in this regard. Moreover, we survey a wide range of supervised speech separation techniques from three aspects: 1) features, 2) targets, and 3) models. Finally, we give some views on its developments.

KW - Computational auditory scene analysis

KW - Machine learning

KW - Neural network

KW - Speech separation

UR - http://www.scopus.com/inward/record.url?scp=84977275703&partnerID=8YFLogxK

U2 - 10.16383/j.aas.2016.c150734

DO - 10.16383/j.aas.2016.c150734

M3 - Review article

AN - SCOPUS:84977275703

SN - 0254-4156

VL - 42

SP - 819

EP - 833

JO - Zidonghua Xuebao/Acta Automatica Sinica

JF - Zidonghua Xuebao/Acta Automatica Sinica

IS - 6

ER -
