TY - GEN
T1 - Generalisation in Environmental Sound Classification
T2 - 44th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2019
AU - Kroos, Christian
AU - Bones, Oliver
AU - Cao, Yin
AU - Harris, Lara
AU - Jackson, Philip J.B.
AU - Davies, William J.
AU - Wang, Wenwu
AU - Cox, Trevor J.
AU - Plumbley, Mark D.
N1 - Publisher Copyright:
© 2019 IEEE.
PY - 2019/5
Y1 - 2019/5
N2 - Humans are able to identify a large number of environmental sounds and categorise them according to high-level semantic categories, e.g. urban sounds or music. They are also capable of generalising from past experience to new sounds when applying these categories. In this paper we report on the creation of a data set that is structured according to the top-level of a taxonomy derived from human judgements and the design of an associated machine learning challenge, in which strong generalisation abilities are required to be successful. We introduce a baseline classification system, a deep convolutional network, which showed strong performance with an average accuracy on the evaluation data of 80.8%. The result is discussed in the light of two alternative explanations: An unlikely accidental category bias in the sound recordings or a more plausible true acoustic grounding of the high-level categories.
AB - Humans are able to identify a large number of environmental sounds and categorise them according to high-level semantic categories, e.g. urban sounds or music. They are also capable of generalising from past experience to new sounds when applying these categories. In this paper we report on the creation of a data set that is structured according to the top-level of a taxonomy derived from human judgements and the design of an associated machine learning challenge, in which strong generalisation abilities are required to be successful. We introduce a baseline classification system, a deep convolutional network, which showed strong performance with an average accuracy on the evaluation data of 80.8%. The result is discussed in the light of two alternative explanations: An unlikely accidental category bias in the sound recordings or a more plausible true acoustic grounding of the high-level categories.
KW - Acoustic classification
KW - convolutional neural network
KW - deep learning
KW - machine learning challenge
KW - sound taxonomy
UR - http://www.scopus.com/inward/record.url?scp=85068989594&partnerID=8YFLogxK
U2 - 10.1109/ICASSP.2019.8683292
DO - 10.1109/ICASSP.2019.8683292
M3 - Conference Proceeding
AN - SCOPUS:85068989594
T3 - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
SP - 8082
EP - 8086
BT - 2019 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2019 - Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 12 May 2019 through 17 May 2019
ER -