TY - GEN
T1 - A Multi-Task Learning Method for Weakly Supervised Sound Event Detection
AU - Liu, Sichen
AU - Yang, Feiran
AU - Kang, Fang
AU - Yang, Jun
N1 - Publisher Copyright: © 2022 IEEE
PY - 2022
Y1 - 2022
AB - In weakly supervised sound event detection (SED), only coarse-grained labels are available, so the supervision information is quite limited. To fully utilize prior knowledge of the time-frequency masks of each sound event, we propose a novel multi-task learning (MTL) method that takes SED as the main task and source separation as the auxiliary task. For active events, we minimize the overlap of their masks as the segment loss to learn distinguishing features. For inactive events, the proposed method measures the activity of their masks as the silent loss to reduce the insertion error. The auxiliary source separation task calculates an extra penalty from the shared masks, which further incorporates prior knowledge in the form of regularization constraints. We demonstrate that the proposed method effectively reduces the insertion error and achieves better performance on the SED task than single-task methods.
KW - multi-task learning (MTL)
KW - sound event detection (SED)
KW - source separation (SS)
KW - weakly supervised
UR - http://www.scopus.com/inward/record.url?scp=85131229312&partnerID=8YFLogxK
U2 - 10.1109/ICASSP43922.2022.9746947
DO - 10.1109/ICASSP43922.2022.9746947
M3 - Conference Proceeding
AN - SCOPUS:85131229312
T3 - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
SP - 8802
EP - 8806
BT - 2022 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2022 - Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 47th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2022
Y2 - 23 May 2022 through 27 May 2022
ER -