Source Separation with Weakly Labelled Data: An Approach to Computational Auditory Scene Analysis

Qiuqiang Kong, Yuxuan Wang, Xuchen Song, Yin Cao, Wenwu Wang, Mark D. Plumbley

Research output: Chapter in Book or Report/Conference proceedingConference Proceedingpeer-review

37 Citations (Scopus)

Abstract

Source separation is the task of separating an audio recording into individual sound sources. Source separation is fundamental for computational auditory scene analysis. Previous work on source separation has focused on separating particular sound classes such as speech and music. Much previous work requires mixtures and clean source pairs for training. In this work, we propose a source separation framework trained with weakly labelled data. Weakly labelled data only contains the tags of an audio clip, without the occurrence time of sound events. We first train a sound event detection system with AudioSet. The trained sound event detection system is used to detect segments that are most likely to contain a target sound event. Then a regression is learnt from a mixture of two randomly selected segments to a target segment conditioned on the audio tagging prediction of the target segment. Our proposed system can separate 527 kinds of sound classes from AudioSet within a single system. A U-Net is adopted for the separation system and achieves an average SDR of 5.67 dB over 527 sound classes in AudioSet.

Original languageEnglish
Title of host publication2020 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2020 - Proceedings
PublisherInstitute of Electrical and Electronics Engineers Inc.
Pages101-105
Number of pages5
ISBN (Electronic)9781509066315
DOIs
Publication statusPublished - May 2020
Externally publishedYes
Event2020 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2020 - Barcelona, Spain
Duration: 4 May 20208 May 2020

Publication series

NameICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Volume2020-May
ISSN (Print)1520-6149

Conference

Conference2020 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2020
Country/TerritorySpain
CityBarcelona
Period4/05/208/05/20

Keywords

  • AudioSet
  • Source separation
  • computational auditory scene analysis
  • weakly labelled data

Fingerprint

Dive into the research topics of 'Source Separation with Weakly Labelled Data: An Approach to Computational Auditory Scene Analysis'. Together they form a unique fingerprint.

Cite this