Abstract
Polyphonic sound event localization and detection (SELD) jointly performs sound event detection (SED) and direction-of-arrival (DoA) estimation: it simultaneously detects the type and occurrence time of sound events and their corresponding DoA angles. We study the SELD task from a multi-task learning perspective and address two open problems. First, to detect overlapping sound events of the same type but with different DoAs, we propose a trackwise output format and solve the accompanying track permutation problem with permutation-invariant training. Multi-head self-attention is further used to separate tracks. Second, previous work found that, with hard parameter sharing, SELD suffers a performance loss compared with learning the subtasks separately. We address this with a soft parameter-sharing scheme. We term the proposed method Event Independent Network V2 (EINV2); it is an improved version of our previously proposed method and an end-to-end network for SELD. We show that the proposed EINV2 for joint SED and DoA estimation outperforms previous methods by a large margin and has comparable performance to state-of-the-art ensemble models.
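The track permutation problem mentioned in the abstract can be illustrated with a small sketch of permutation-invariant training: the loss is evaluated for every assignment of target tracks to predicted tracks, and the smallest value is kept. This is only an illustration under stated assumptions, not the authors' implementation. The function names `mse` and `pit_loss`, the use of mean squared error, and the toy two-track data are all hypothetical; the abstract does not specify the loss function or the level at which permutations are resolved.

```python
from itertools import permutations


def mse(pred, target):
    """Mean squared error between two equal-length sequences."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def pit_loss(pred_tracks, target_tracks):
    """Return (loss, permutation) minimizing the mean trackwise MSE.

    permutation[i] is the index of the target assigned to predicted track i.
    MSE is an assumption made for this sketch only.
    """
    n = len(pred_tracks)
    best_loss, best_perm = float("inf"), None
    # Try every assignment of targets to predicted tracks and keep the best.
    for perm in permutations(range(n)):
        loss = sum(
            mse(pred_tracks[i], target_tracks[j]) for i, j in enumerate(perm)
        ) / n
        if loss < best_loss:
            best_loss, best_perm = loss, perm
    return best_loss, best_perm


# Toy data: the two predicted tracks are swapped relative to the targets.
preds = [[0.9, 0.1], [0.0, 1.0]]
targets = [[0.0, 1.0], [1.0, 0.0]]
loss, perm = pit_loss(preds, targets)
```

Here the identity assignment gives a large error, while the swapped assignment matches each prediction to its closest target, so the swapped permutation is selected. Enumerating all permutations is practical only for a small number of tracks.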
| Original language | English |
|---|---|
| Pages (from-to) | 885-889 |
| Number of pages | 5 |
| Journal | ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings |
| Volume | 2021-June |
| Publication status | Published - 2021 |
| Externally published | Yes |
| Event | 2021 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2021 - Virtual, Toronto, Canada; Duration: 6 Jun 2021 → 11 Jun 2021 |
Keywords
- Direction of arrival
- Event-independent
- Multitask learning
- Permutation-invariant training
- Sound event localization and detection
Fingerprint
Dive into the research topics of 'An improved event-independent network for polyphonic sound event localization and detection'. Together they form a unique fingerprint.