Augmented Audio Data in Improving Speech Emotion Classification Tasks

Nusrat J. Shoumy*, Li Minn Ang, D. M. Motiur Rahaman, Tanveer Zia, Kah Phooi Seng, Sabira Khatun

*Corresponding author for this work

Research output: Chapter in Book or Report/Conference proceeding › Conference Proceeding › peer-review

3 Citations (Scopus)

Abstract

Classifying emotions from audio or speech signals with high accuracy requires large quantities of data. Big datasets, however, are not always readily accessible. A practical solution is to augment the available data to construct a larger dataset for training the classifier. This paper proposes a unimodal approach that focuses on two main concepts: (1) augmenting speech signals to generate additional data samples; and (2) constructing classification models to identify the emotion expressed through speech. Three classifiers (Convolutional Neural Network (CNN), Naïve Bayes (NB) and K-Nearest Neighbor (kNN)) were compared to determine which performed best. In the proposed method, training (50%) used augmented audio data derived from the SAVEE dataset, and testing (50%) used the original data. The best performance, approximately 83%, was obtained with a mixture of augmentation strategies and the CNN classifier. The proposed augmentation approach, together with an appropriate classification model, enhances the efficiency of voice emotion recognition.
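The abstract does not name the individual augmentation strategies or the exact split procedure, so the following is only a minimal sketch under stated assumptions. It assumes three common waveform augmentations (additive Gaussian noise, circular time shift, gain scaling) plus one mixture of them, and it reads the 50%/50% protocol as "augmented copies of one half for training, the untouched other half for testing". All function names and parameter values are hypothetical, and the synthetic tones stand in for real SAVEE recordings.

```python
import math
import random


def add_noise(signal, noise_level=0.005, rng=None):
    """Return a copy of the signal with Gaussian noise added to every sample."""
    rng = rng or random.Random(0)
    return [s + rng.gauss(0.0, noise_level) for s in signal]


def time_shift(signal, shift):
    """Circularly shift the signal by `shift` samples (positive = later)."""
    if not signal:
        return []
    shift %= len(signal)
    return signal[-shift:] + signal[:-shift] if shift else list(signal)


def scale_gain(signal, gain):
    """Multiply every sample by a constant gain factor."""
    return [s * gain for s in signal]


def augment(signal, rng):
    """Produce three variants of one utterance; the last mixes all strategies.

    Assumption: the paper's actual strategies are not listed in the abstract.
    """
    noisy = add_noise(signal, rng=rng)
    shifted = time_shift(signal, shift=len(signal) // 10)
    mixed = scale_gain(add_noise(shifted, rng=rng), gain=0.8)
    return [noisy, shifted, mixed]


def build_splits(utterances, labels, seed=0):
    """Assumed protocol: augmented copies of one half form the training set,
    and the original, unmodified other half forms the test set."""
    rng = random.Random(seed)
    indices = list(range(len(utterances)))
    rng.shuffle(indices)
    half = len(indices) // 2
    train = [
        (variant, labels[i])
        for i in indices[:half]
        for variant in augment(utterances[i], rng)
    ]
    test = [(utterances[i], labels[i]) for i in indices[half:]]
    return train, test


# Synthetic stand-ins for four labelled utterances (0.1 s at 16 kHz).
tone = [math.sin(2 * math.pi * 440 * n / 16000) for n in range(1600)]
utterances = [tone, scale_gain(tone, 0.5), tone[::-1], time_shift(tone, 100)]
labels = ["happy", "sad", "angry", "neutral"]

train, test = build_splits(utterances, labels)
print(len(train), "training samples,", len(test), "test samples")
```

Keeping the test half free of augmented samples means any reported score reflects performance on original recordings only, which is how the abstract describes the evaluation. The resulting training pairs could then be converted to features and passed to any of the three classifiers the abstract compares.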

Original language: English
Title of host publication: Advances and Trends in Artificial Intelligence. From Theory to Practice - 34th International Conference on Industrial, Engineering and Other Applications of Applied Intelligent Systems, IEA/AIE 2021, Proceedings
Editors: Hamido Fujita, Ali Selamat, Jerry Chun-Wei Lin, Moonis Ali
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 360-365
Number of pages: 6
ISBN (Print): 9783030794620
DOIs
Publication status: Published - 2021
Externally published: Yes
Event: 34th International Conference on Industrial, Engineering and Other Applications of Applied Intelligent Systems, IEA/AIE 2021 - Virtual, Online
Duration: 26 Jul 2021 to 29 Jul 2021

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 12799 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 34th International Conference on Industrial, Engineering and Other Applications of Applied Intelligent Systems, IEA/AIE 2021
City: Virtual, Online
Period: 26/07/21 to 29/07/21

Keywords

  • Audio data
  • Data augmentation
  • Data classification
  • Emotion recognition
  • Neural Network
