Continuous Valence-Arousal Space Prediction and Recognition Based on Feature Fusion

Misbah Ayoub; Haiyang Zhang; Andrew Abel

doi:10.1109/ICIT58233.2024.10540915

University of Strathclyde

Research output: Chapter in Book or Report/Conference proceeding › Conference Proceeding › peer-review

Abstract

In emotion recognition, multimodal feature fusion is useful for facial expression recognition because of its versatility and adaptability. It improves model performance by capturing information from different modalities. In this study, we employ feature-level fusion, integrating CNN and HOG features. To predict continuous valence and arousal values, we use a feedforward neural network and gradient boosting. Performance is evaluated using Mean Squared Error (MSE) and Root Mean Squared Error (RMSE). The paper presents experiments on the ADFES dataset, considering low, medium, and high intensities, as well as an augmented video dataset. The results show that, instead of relying on complex models, accuracy can be achieved by combining various types of features with appropriate hyperparameter settings and tuning. This approach is both robust and computationally efficient.
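The feature-level fusion and the two error metrics named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the feature dimensions, and the plain concatenation used as the fusion step are assumptions made for illustration, and the CNN and HOG feature extractors themselves are replaced by placeholder arrays.

```python
import numpy as np


def fuse_features(cnn_feats: np.ndarray, hog_feats: np.ndarray) -> np.ndarray:
    """Feature-level fusion by concatenation (assumed fusion rule).

    Both inputs have shape (n_samples, n_features_i); the result has
    shape (n_samples, n_features_cnn + n_features_hog).
    """
    if cnn_feats.shape[0] != hog_feats.shape[0]:
        raise ValueError("both feature sets must describe the same samples")
    return np.concatenate([cnn_feats, hog_feats], axis=1)


def mse(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """Mean Squared Error over all samples and both outputs (valence, arousal)."""
    return float(np.mean((y_true - y_pred) ** 2))


def rmse(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """Root Mean Squared Error: the square root of the MSE."""
    return float(np.sqrt(mse(y_true, y_pred)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Placeholder features; real ones would come from a CNN and a HOG descriptor.
    cnn = rng.normal(size=(4, 8))   # hypothetical 8-dim CNN embedding
    hog = rng.normal(size=(4, 6))   # hypothetical 6-dim HOG descriptor
    fused = fuse_features(cnn, hog)
    print(fused.shape)

    # Hypothetical (valence, arousal) targets and predictions.
    y_true = np.array([[0.5, -0.2], [0.1, 0.3]])
    y_pred = np.array([[0.4, -0.2], [0.1, 0.5]])
    print(mse(y_true, y_pred), rmse(y_true, y_pred))
```

The fused matrix would then be passed to a regressor, such as the feedforward network or gradient boosting model mentioned in the abstract, which is outside the scope of this sketch.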

Original language	English
Title of host publication	ICIT 2024 - 2024 25th International Conference on Industrial Technology
Publisher	Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic)	9798350340266
DOIs	https://doi.org/10.1109/ICIT58233.2024.10540915
Publication status	Published - 2024
Event	25th IEEE International Conference on Industrial Technology, ICIT 2024 - Bristol, United Kingdom; Duration: 25 Mar 2024 → 27 Mar 2024

Publication series

Name	Proceedings of the IEEE International Conference on Industrial Technology
ISSN (Print)	2641-0184
ISSN (Electronic)	2643-2978

Conference

Conference	25th IEEE International Conference on Industrial Technology, ICIT 2024
Country/Territory	United Kingdom
City	Bristol
Period	25/03/24 → 27/03/24

Keywords

Emotion recognition
Feature Fusion
CNN-HOG
Valence-Arousal Space
CNN
HOG
Feature-Level-Fusion
Multimodal Fusion

Cite this

Ayoub, M., Zhang, H., & Abel, A. (2024). Continuous Valence-Arousal Space Prediction and Recognition Based on Feature Fusion. In ICIT 2024 - 2024 25th International Conference on Industrial Technology (Proceedings of the IEEE International Conference on Industrial Technology). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/ICIT58233.2024.10540915

@inproceedings{4986bbead2b94571bc75050f4f63ea85,
title = "Continuous Valence-Arousal Space Prediction and Recognition Based on Feature Fusion",
abstract = "In emotion recognition, multimodal feature fusion is useful for facial expression recognition because of its versatility and adaptability. It improves model performance by capturing information from different modalities. In this study, we employ feature-level fusion, integrating CNN and HOG features. To predict continuous valence and arousal values, we use a feedforward neural network and gradient boosting. Performance is evaluated using Mean Squared Error (MSE) and Root Mean Squared Error (RMSE). The paper presents experiments on the ADFES dataset, considering low, medium, and high intensities, as well as an augmented video dataset. The results show that, instead of relying on complex models, accuracy can be achieved by combining various types of features with appropriate hyperparameter settings and tuning. This approach is both robust and computationally efficient.",
keywords = "Emotion recognition, Feature Fusion, CNN-HOG, Valence-Arousal Space, CNN, HOG, Feature-Level-Fusion, Multimodal Fusion",
author = "Misbah Ayoub and Haiyang Zhang and Andrew Abel",
note = "Publisher Copyright: {\textcopyright} 2024 IEEE.; 25th IEEE International Conference on Industrial Technology, ICIT 2024 ; Conference date: 25-03-2024 Through 27-03-2024",
year = "2024",
doi = "10.1109/ICIT58233.2024.10540915",
language = "English",
series = "Proceedings of the IEEE International Conference on Industrial Technology",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
booktitle = "ICIT 2024 - 2024 25th International Conference on Industrial Technology",
}

Ayoub, M, Zhang, H & Abel, A 2024, Continuous Valence-Arousal Space Prediction and Recognition Based on Feature Fusion. in ICIT 2024 - 2024 25th International Conference on Industrial Technology. Proceedings of the IEEE International Conference on Industrial Technology, Institute of Electrical and Electronics Engineers Inc., 25th IEEE International Conference on Industrial Technology, ICIT 2024, Bristol, United Kingdom, 25/03/24. https://doi.org/10.1109/ICIT58233.2024.10540915

Continuous Valence-Arousal Space Prediction and Recognition Based on Feature Fusion. / Ayoub, Misbah; Zhang, Haiyang; Abel, Andrew.
ICIT 2024 - 2024 25th International Conference on Industrial Technology. Institute of Electrical and Electronics Engineers Inc., 2024. (Proceedings of the IEEE International Conference on Industrial Technology).

TY - GEN
T1 - Continuous Valence-Arousal Space Prediction and Recognition Based on Feature Fusion
AU - Ayoub, Misbah
AU - Zhang, Haiyang
AU - Abel, Andrew
PY - 2024
Y1 - 2024
AB - In emotion recognition, multimodal feature fusion is useful for facial expression recognition because of its versatility and adaptability. It improves model performance by capturing information from different modalities. In this study, we employ feature-level fusion, integrating CNN and HOG features. To predict continuous valence and arousal values, we use a feedforward neural network and gradient boosting. Performance is evaluated using Mean Squared Error (MSE) and Root Mean Squared Error (RMSE). The paper presents experiments on the ADFES dataset, considering low, medium, and high intensities, as well as an augmented video dataset. The results show that, instead of relying on complex models, accuracy can be achieved by combining various types of features with appropriate hyperparameter settings and tuning. This approach is both robust and computationally efficient.
KW - Emotion recognition
KW - Feature Fusion
KW - CNN-HOG
KW - Valence-Arousal Space
KW - CNN
KW - HOG
KW - Feature-Level-Fusion
KW - Multimodal Fusion
UR - http://www.scopus.com/inward/record.url?scp=85195783577&partnerID=8YFLogxK
U2 - 10.1109/ICIT58233.2024.10540915
DO - 10.1109/ICIT58233.2024.10540915
M3 - Conference Proceeding
AN - SCOPUS:85195783577
T3 - Proceedings of the IEEE International Conference on Industrial Technology
BT - ICIT 2024 - 2024 25th International Conference on Industrial Technology
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 25th IEEE International Conference on Industrial Technology, ICIT 2024
Y2 - 25 March 2024 through 27 March 2024
ER -
