Joint Optimization of Recurrent Networks Exploiting Source Auto-regression for Source Separation

Shuai Nie, Wei Xue, Shan Liang, Xueliang Zhang, Wenju Liu, Liwei Qiao, Jianping Li

Research output: Chapter in Book or Report/Conference proceeding; Conference Proceeding; peer-review

4 Citations (Scopus)

Abstract

Source separation is very difficult when speech is mixed with music interference. In this paper, we propose a novel recurrent network that exploits the auto-regressions of speech and music interference for source separation. An auto-regression can capture the short-term temporal dependencies in the data, which helps the separation. We separate the magnitude spectra of speech and interference independently from the mixture spectra by including an extra masking layer in the recurrent network. Compared to directly estimating the ideal mask, the extra masking layer relaxes the assumption of independence between speech and interference, which makes it more suitable for real-world environments. Using the separated spectra of speech and interference, we further explore a discriminative training objective and a joint optimization framework for the proposed network, which incorporate the correlations and spectral dependencies of speech and interference into the separation. Systematic experiments show that the proposed model is competitive with the state-of-the-art method in singing-voice separation.
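As a rough illustration of the two ideas named in the abstract, the sketch below shows a soft time-frequency masking layer and a discriminative training objective in NumPy. It is not taken from the paper: the function names, the `eps` stabilizer, the weight `gamma`, and the exact form of both formulas are assumptions based on formulations commonly used in masking-based separation, and the paper's own layer and objective may differ.

```python
import numpy as np


def soft_mask_layer(speech_est, interf_est, mixture_mag, eps=1e-8):
    """Hypothetical masking layer: split the mixture magnitude spectrum
    between two sources in proportion to the network's raw estimates."""
    speech_est = np.abs(speech_est)
    interf_est = np.abs(interf_est)
    # Soft mask in [0, 1]; eps (an assumption) avoids division by zero.
    mask = speech_est / (speech_est + interf_est + eps)
    # The two outputs always sum to the mixture magnitude.
    return mask * mixture_mag, (1.0 - mask) * mixture_mag


def discriminative_loss(speech_hat, interf_hat, speech_ref, interf_ref, gamma=0.05):
    """Hypothetical discriminative objective: reward closeness to the own
    target and penalize closeness to the competing source."""
    fit = np.sum((speech_hat - speech_ref) ** 2) + np.sum((interf_hat - interf_ref) ** 2)
    cross = np.sum((speech_hat - interf_ref) ** 2) + np.sum((interf_hat - speech_ref) ** 2)
    return fit - gamma * cross
```

Because the mask and its complement sum to one, the two separated spectra always add back up to the mixture spectrum, which is the constraint a masking layer of this kind enforces during joint training.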

Original language: English
Title of host publication: 16th Annual Conference of the International Speech Communication Association, INTERSPEECH 2015
Pages: 3307-3311
Number of pages: 5
Volume: 2015-January
Publication status: Published - 2015
Externally published: Yes
Event: 16th Annual Conference of the International Speech Communication Association, INTERSPEECH 2015 - Dresden, Germany
Duration: 6 Sept 2015 – 10 Sept 2015

Publication series

Name: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
ISSN (Print): 2308-457X

Conference

Conference: 16th Annual Conference of the International Speech Communication Association, INTERSPEECH 2015
Country/Territory: Germany
City: Dresden
Period: 6/09/15 – 10/09/15

Keywords

  • Autoregressive models
  • Deep recurrent neural networks
  • Discriminative training objective
  • Source separation
