A TRACK-WISE ENSEMBLE EVENT INDEPENDENT NETWORK FOR POLYPHONIC SOUND EVENT LOCALIZATION AND DETECTION

Jinbo Hu, Yin Cao, Ming Wu, Qiuqiang Kong, Feiran Yang, Mark D. Plumbley, Jun Yang

Research output: Chapter in Book or Report/Conference proceedingConference Proceedingpeer-review

4 Citations (Scopus)

Abstract

Polyphonic sound event localization and detection (SELD) aims at detecting types of sound events with corresponding temporal activities and spatial locations. In this paper, a trackwise ensemble event independent network with a novel data augmentation method is proposed. The proposed model is based on our previous proposed Event-Independent Network V2 and is extended by conformer blocks and dense blocks. The track-wise ensemble model with track-wise output format is proposed to solve an ensemble model problem for track-wise output format that track permutation may occur among different models. The data augmentation approach contains several data augmentation chains, which are composed of random combinations of several data augmentation operations. The method also utilizes log-mel spectrograms, intensity vectors, and Spatial Cues-Augmented Log-Spectrogram (SALSA) for different models. We evaluate our proposed method in the Task of the L3DAS22 challenge and obtain the top ranking solution with a location-dependent F-score to be 0.699. Source code is released.

Original languageEnglish
Title of host publication2022 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2022 - Proceedings
PublisherInstitute of Electrical and Electronics Engineers Inc.
Pages9196-9200
Number of pages5
ISBN (Electronic)9781665405409
DOIs
Publication statusPublished - 2022
Externally publishedYes
Event47th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2022 - Virtual, Online, Singapore
Duration: 23 May 202227 May 2022

Publication series

NameICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Volume2022-May
ISSN (Print)1520-6149

Conference

Conference47th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2022
Country/TerritorySingapore
CityVirtual, Online
Period23/05/2227/05/22

Keywords

  • Sound event localization and detection
  • data augmentation chains
  • event-independent network
  • track-wise ensemble model

Fingerprint

Dive into the research topics of 'A TRACK-WISE ENSEMBLE EVENT INDEPENDENT NETWORK FOR POLYPHONIC SOUND EVENT LOCALIZATION AND DETECTION'. Together they form a unique fingerprint.

Cite this