Cross Languages One-Versus-All Speech Emotion Classifier

Xiangrui Liu, Junchi Bin, Huakang Li*

*Corresponding author for this work

Research output: Chapter in Book or Report/Conference proceeding; Conference Proceeding; peer-review

Abstract

Speech emotion recognition (SER) cannot be accomplished by linguistic models alone, due to the presence of figures of speech. To predict emotions more accurately, researchers have adopted acoustic modelling. The complexity of SER can be attributed to factors such as the variety of acoustic features and the similarities among certain emotions. In this paper, we propose a framework named Cross Languages One-Versus-All Speech Emotion Classifier (CLOVASEC) that identifies the emotion of speech in both Chinese and English. Acoustic features were preprocessed first with the Synthetic Minority Oversampling Technique (SMOTE), to diminish the impact of an imbalanced dataset, and then with principal component analysis (PCA), to reduce their dimensionality. The features were then fed into a classifier made up of eight sub-classifiers, each tasked with differentiating one class from the other seven. The framework significantly outperformed regular classifiers on the Chinese Natural Audio-Visual Emotion Database (CHEAVD) and an English dataset from Deng.
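The oversampling and one-versus-all steps described in the abstract can be illustrated with a minimal, standard-library-only sketch. Everything here is an assumption made for illustration and not the authors' implementation: the toy 2-D features, the generic labels `class_0` to `class_7`, the helper names, the simplified SMOTE-style interpolation (nearest same-class neighbour only), and the nearest-centroid scorer standing in for each sub-classifier. The PCA step is omitted for brevity.

```python
import math
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroid(rows):
    """Component-wise mean of a list of feature vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def smote_like_oversample(samples, target_count, rng):
    """Grow a minority class to target_count by interpolating between a
    sample and its nearest same-class neighbour (simplified SMOTE idea)."""
    grown = list(samples)
    while len(grown) < target_count:
        base = rng.choice(samples)
        neighbour = min((s for s in samples if s is not base),
                        key=lambda s: sq_dist(s, base))
        t = rng.random()  # position along the segment base -> neighbour
        grown.append([b + t * (n - b) for b, n in zip(base, neighbour)])
    return grown


def train_one_vs_all(data):
    """Build one binary scorer per class: (class centroid, rest centroid)."""
    models = {}
    for label, rows in data.items():
        rest = [r for other, rs in data.items() if other != label for r in rs]
        models[label] = (centroid(rows), centroid(rest))
    return models


def predict(models, x):
    """Pick the class whose 'this class vs. the rest' score is highest."""
    def score(label):
        pos, neg = models[label]
        return sq_dist(x, neg) - sq_dist(x, pos)
    return max(models, key=score)


# Toy data (hypothetical): eight clusters on a circle, one under-represented.
rng = random.Random(0)
labels = [f"class_{k}" for k in range(8)]
centres = {lab: (10 * math.cos(2 * math.pi * k / 8),
                 10 * math.sin(2 * math.pi * k / 8))
           for k, lab in enumerate(labels)}


def draw(label, n):
    cx, cy = centres[label]
    return [[rng.gauss(cx, 0.5), rng.gauss(cy, 0.5)] for _ in range(n)]


data = {lab: draw(lab, 3 if lab == "class_7" else 20) for lab in labels}
data["class_7"] = smote_like_oversample(data["class_7"], 20, rng)

models = train_one_vs_all(data)
predictions = {lab: predict(models, centres[lab]) for lab in labels}
```

Each of the eight scorers answers only "this class or one of the other seven?", and the final label is the class whose scorer is most confident. In practice a library implementation of SMOTE and PCA and stronger sub-classifiers would likely replace these stand-ins.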

Original language: English
Title of host publication: Neural Computing for Advanced Applications - Second International Conference, NCAA 2021, Proceedings
Editors: Haijun Zhang, Zhi Yang, Zhao Zhang, Zhou Wu, Tianyong Hao
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 197-210
Number of pages: 14
ISBN (Print): 9789811651878
Publication status: Published - 2021
Event: 2nd International Conference on Neural Computing for Advanced Applications, NCAA 2021 - Guangzhou, China
Duration: 27 Aug 2021 to 30 Aug 2021

Publication series

Name: Communications in Computer and Information Science
Volume: 1449
ISSN (Print): 1865-0929
ISSN (Electronic): 1865-0937

Conference

Conference: 2nd International Conference on Neural Computing for Advanced Applications, NCAA 2021
Country/Territory: China
City: Guangzhou
Period: 27/08/21 to 30/08/21

Keywords

  • Acoustic modelling
  • Deep learning
  • Multi-languages
  • Multiplicative attention
  • Speech emotion recognition
