Two-Stage Multi-Target Joint Learning for Monaural Speech Separation

Shuai Nie, Shan Liang, Wei Xue, Xueliang Zhang, Wenju Liu, Like Dong, Hong Yang

Research output: Chapter in Book or Report/Conference proceeding; Conference Proceeding; peer-review

7 Citations (Scopus)

Abstract

Recently, supervised speech separation has been extensively studied and has shown considerable promise. Due to the temporal continuity of speech, speech auditory features and separation targets present prominent spectro-temporal structures and strong correlations over the time-frequency (T-F) domain, which can be exploited for speech separation. However, many supervised speech separation methods model each T-F unit independently with only one target and largely ignore this useful information. In this paper, we propose a two-stage multi-target joint learning method to jointly model the related speech separation targets at the frame level. Systematic experiments show that the proposed approach consistently achieves better separation and generalization performance in low signal-to-noise ratio (SNR) conditions.

Original language: English
Title of host publication: 16th Annual Conference of the International Speech Communication Association, INTERSPEECH 2015
Pages: 1503-1507
Number of pages: 5
Volume: 2015-January
Publication status: Published - 2015
Externally published: Yes
Event: 16th Annual Conference of the International Speech Communication Association, INTERSPEECH 2015 - Dresden, Germany
Duration: 6 Sept 2015 → 10 Sept 2015

Publication series

Name: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
ISSN (Print): 2308-457X

Conference

Conference: 16th Annual Conference of the International Speech Communication Association, INTERSPEECH 2015
Country/Territory: Germany
City: Dresden
Period: 6/09/15 → 10/09/15

Keywords

  • Computational auditory scene analysis (CASA)
  • Multi-target learning
  • Speech separation
