TY - JOUR
T1 - Energy Efficient Graph-Based Hybrid Learning for Speech Emotion Recognition on Humanoid Robot
AU - Wu, Haowen
AU - Xu, Hanyue
AU - Seng, Kah Phooi
AU - Chen, Jieli
AU - Ang, Li Minn
N1 - Publisher Copyright:
© 2024 by the authors.
PY - 2024/3
Y1 - 2024/3
N2 - This paper presents a novel deep graph-based learning technique for speech emotion recognition which has been specifically tailored for energy efficient deployment within humanoid robots. Our methodology represents a fusion of scalable graph representations, rooted in the foundational principles of graph signal processing theories. By delving into the utilization of cycle or line graphs as fundamental constituents shaping a robust Graph Convolution Network (GCN)-based architecture, we propose an approach which allows the capture of relationships between speech signals to decode intricate emotional patterns and responses. Our methodology is validated and benchmarked against established databases such as IEMOCAP and MSP-IMPROV. Our model outperforms standard GCNs and prevalent deep graph architectures, demonstrating performance levels that align with state-of-the-art methodologies. Notably, our model achieves this feat while significantly reducing the number of learnable parameters, thereby increasing computational efficiency and bolstering its suitability for resource-constrained environments. This proposed energy-efficient graph-based hybrid learning methodology is applied towards multimodal emotion recognition within humanoid robots. Its capacity to deliver competitive performance while streamlining computational complexity and energy efficiency represents a novel approach in evolving emotion recognition systems, catering to diverse real-world applications where precision in emotion recognition within humanoid robots stands as a pivotal requisite.
AB - This paper presents a novel deep graph-based learning technique for speech emotion recognition which has been specifically tailored for energy efficient deployment within humanoid robots. Our methodology represents a fusion of scalable graph representations, rooted in the foundational principles of graph signal processing theories. By delving into the utilization of cycle or line graphs as fundamental constituents shaping a robust Graph Convolution Network (GCN)-based architecture, we propose an approach which allows the capture of relationships between speech signals to decode intricate emotional patterns and responses. Our methodology is validated and benchmarked against established databases such as IEMOCAP and MSP-IMPROV. Our model outperforms standard GCNs and prevalent deep graph architectures, demonstrating performance levels that align with state-of-the-art methodologies. Notably, our model achieves this feat while significantly reducing the number of learnable parameters, thereby increasing computational efficiency and bolstering its suitability for resource-constrained environments. This proposed energy-efficient graph-based hybrid learning methodology is applied towards multimodal emotion recognition within humanoid robots. Its capacity to deliver competitive performance while streamlining computational complexity and energy efficiency represents a novel approach in evolving emotion recognition systems, catering to diverse real-world applications where precision in emotion recognition within humanoid robots stands as a pivotal requisite.
KW - energy efficient deep learning
KW - graph convolutional neural network
KW - humanoid robot
KW - speech emotion recognition
UR - http://www.scopus.com/inward/record.url?scp=85188730433&partnerID=8YFLogxK
U2 - 10.3390/electronics13061151
DO - 10.3390/electronics13061151
M3 - Article
AN - SCOPUS:85188730433
SN - 2079-9292
VL - 13
JO - Electronics (Switzerland)
JF - Electronics (Switzerland)
IS - 6
M1 - 1151
ER -