Pre-trained DenseNet-121 with Multilayer Perceptron for Acoustic Event Classification

Pooi Shiang Tan, Kian Ming Lim*, Cheah Heng Tan, Chin Poo Lee

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

5 Citations (Scopus)

Abstract

Acoustic event classification assigns acoustic events to their correct classes, which is useful in surveillance, multimedia information retrieval, and smart cities. Its main challenges are insufficient data for learning a good model and the varying lengths of acoustic input signals. This paper proposes a deep learning architecture, Pre-trained DenseNet-121 with Multilayer Perceptron, to classify acoustic events. To mitigate data scarcity, two data augmentation techniques, time stretching and pitch shifting, are applied to the training data to increase the number of training samples. A frequency spectrogram technique then converts each augmented acoustic signal into a fixed-size image representation. The resulting spectrogram images capture information about the acoustic signal such as energy levels over time, frequency changes, signal strength, and amplitude. Subsequently, a pre-trained DenseNet-121 model is adopted as a transfer learning technique to extract significant features from the spectrogram images. Doing so greatly reduces the computational resources required and improves the performance of the deep learning-based model. Three benchmark datasets, (1) Soundscapes1, (2) Soundscapes2, and (3) UrbanSound8K, are used to assess the performance of the proposed method. In the experiments, the proposed Pre-trained DenseNet-121 with Multilayer Perceptron outperforms existing works on the Soundscapes1, Soundscapes2, and UrbanSound8K datasets, with F1-scores of 80.7%, 87.3%, and 69.6%, respectively.
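The step that turns variable-length signals into fixed-size spectrogram images can be illustrated with a minimal sketch. It does not reproduce the authors' settings: the FFT size, hop length, target width, and the pad-or-crop strategy are all assumptions chosen for illustration, and the helper names `log_spectrogram` and `to_fixed_width` are hypothetical.

```python
import numpy as np


def log_spectrogram(signal, n_fft=512, hop=256):
    """Log-magnitude spectrogram of shape (n_fft // 2 + 1, n_frames).

    n_fft and hop are illustrative values, not taken from the paper.
    """
    signal = np.asarray(signal, dtype=np.float64)
    if len(signal) < n_fft:
        # Zero-pad very short clips so that at least one frame exists.
        signal = np.pad(signal, (0, n_fft - len(signal)))
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack(
        [signal[i * hop:i * hop + n_fft] * window for i in range(n_frames)]
    )
    magnitude = np.abs(np.fft.rfft(frames, axis=1)).T  # (freq_bins, n_frames)
    return 20.0 * np.log10(magnitude + 1e-10)          # decibel scale


def to_fixed_width(spec, width=64):
    """Crop or pad along the time axis so every clip yields the same shape.

    Padding with the quietest value is an assumption made for this sketch.
    """
    n_frames = spec.shape[1]
    if n_frames >= width:
        return spec[:, :width]
    pad = width - n_frames
    return np.pad(spec, ((0, 0), (0, pad)), mode="constant",
                  constant_values=spec.min())


# Two synthetic clips of different lengths (1 s and 3 s at 8 kHz).
sr = 8000
short_clip = np.sin(2 * np.pi * 440 * np.arange(sr) / sr)
long_clip = np.sin(2 * np.pi * 880 * np.arange(3 * sr) / sr)

img_short = to_fixed_width(log_spectrogram(short_clip))
img_long = to_fixed_width(log_spectrogram(long_clip))
print(img_short.shape, img_long.shape)  # (257, 64) (257, 64)
```

Both clips end up as arrays of the same shape, which is the property a pre-trained image model such as DenseNet-121 needs at its input. In practice the array would still have to be resized and replicated to three channels to match that model's expected input.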

Original language: English
Article number: IJCS_50_1_07
Journal: IAENG International Journal of Computer Science
Volume: 50
Issue number: 1
Publication status: Published - Mar 2023
Externally published: Yes

Keywords

  • acoustic event classification
  • DenseNet
  • frequency spectrogram
  • multilayer perceptron
  • pitch shifting
  • time stretching
