Deep convolutional neural network with mixup for environmental sound classification

Zhichao Zhang, Shugong Xu*, Shan Cao, Shunqing Zhang

*Corresponding author for this work

Research output: Chapter in Book or Report/Conference proceeding > Conference Proceeding > peer-review

96 Citations (Scopus)

Abstract

Environmental sound classification (ESC) is an important and challenging problem. In contrast to speech, sound events have a noise-like nature and may be produced by a wide variety of sources. In this paper, we propose a novel deep convolutional neural network for ESC tasks. Our network architecture uses stacked convolutional and pooling layers to extract high-level feature representations from spectrogram-like features. Furthermore, we apply mixup to ESC tasks and explore its impact on classification performance and feature distribution. Experiments were conducted on the UrbanSound8K, ESC-50 and ESC-10 datasets. Our experimental results demonstrate that our ESC system achieves state-of-the-art performance (83.7%) on UrbanSound8K and competitive performance on ESC-50 and ESC-10.
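
The abstract names mixup but does not spell out how it is applied. As a rough illustration only, the general mixup idea (blending pairs of training examples and their labels with a weight drawn from a Beta distribution) can be sketched as below. The function name `mixup_batch`, the value `alpha=0.2`, and the array shapes are assumptions made for this sketch and are not taken from the paper.

```python
import numpy as np

def mixup_batch(features, labels_onehot, alpha=0.2, rng=None):
    """Blend each example with a randomly chosen partner from the same batch.

    features:      array of shape (batch, ...), e.g. spectrogram-like inputs
    labels_onehot: array of shape (batch, num_classes)
    alpha:         Beta(alpha, alpha) parameter (assumed value, not from the paper)
    """
    rng = np.random.default_rng() if rng is None else rng
    lam = rng.beta(alpha, alpha)             # one mixing weight in [0, 1] for the batch
    perm = rng.permutation(len(features))    # random partner index for every example
    mixed_x = lam * features + (1.0 - lam) * features[perm]
    mixed_y = lam * labels_onehot + (1.0 - lam) * labels_onehot[perm]
    return mixed_x, mixed_y, lam

# Hypothetical usage: 4 clips, 60x41 feature maps, 10 classes.
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 60, 41))
y = np.eye(10)[[0, 3, 3, 7]]
mixed_x, mixed_y, lam = mixup_batch(x, y, alpha=0.2, rng=rng)
```

Each mixed label row still sums to 1, so it can be used directly as a soft target for a cross-entropy loss.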

Original language: English
Title of host publication: Pattern Recognition and Computer Vision - First Chinese Conference, PRCV 2018, Proceedings
Editors: Cheng-Lin Liu, Tieniu Tan, Jie Zhou, Jian-Huang Lai, Xilin Chen, Nanning Zheng, Hongbin Zha
Publisher: Springer Verlag
Pages: 356-367
Number of pages: 12
ISBN (Print): 9783030033347
Publication status: Published - 2018
Externally published: Yes
Event: 1st Chinese Conference on Pattern Recognition and Computer Vision, PRCV 2018 - Guangzhou, China
Duration: 23 Nov 2018 - 26 Nov 2018

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 11257 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 1st Chinese Conference on Pattern Recognition and Computer Vision, PRCV 2018
Country/Territory: China
City: Guangzhou
Period: 23/11/18 - 26/11/18

Keywords

  • Convolutional neural network
  • Environmental sound classification
  • Mixup
