TY - JOUR
T1 - Micro-Expression Recognition Using Convolutional Variational Attention Transformer (ConVAT) with Multihead Attention Mechanism
AU - Khizer Bin Talib, Hafiz
AU - Xu, Kaiwei
AU - Cao, Yanlong
AU - Ping Xu, Yuan
AU - Xu, Zhijie
AU - Zaman, Muhammad
AU - Akhunzada, Adnan
N1 - Publisher Copyright:
© 2013 IEEE.
PY - 2025
Y1 - 2025
N2 - Micro-Expression Recognition is crucial in various fields such as behavioral analysis, security, and psychological studies, offering valuable insights into subtle and often concealed emotional states. Despite significant advancements in deep learning models, challenges persist in accurately handling the nuanced and fleeting nature of micro-expressions, particularly when applied across diverse datasets with varied expressions. Existing models often struggle with precision and adaptability, leading to inconsistent recognition performance. To address these limitations, we propose the Convolutional Variational Attention Transformer (ConVAT), a novel model that leverages a multi-head attention mechanism integrated with convolutional networks, optimized specifically for detailed micro-expression analysis. Our methodology employs the Leave-One-Subject-Out (LOSO) cross-validation technique across three widely used datasets: SAMM, CASME II, and SMIC. The results demonstrate the effectiveness of ConVAT, achieving impressive performance with 98.73% accuracy on the SAMM dataset, 97.95% on the SMIC dataset, and 97.65% on CASME II. These outcomes not only surpass current state-of-the-art benchmarks but also highlight ConVAT's robustness and reliability in capturing micro-expressions, marking a significant advancement toward developing sophisticated automated systems for real-world applications in micro-expression recognition.
KW - ConVAT
KW - convolutional neural networks
KW - LOSO cross-validation
KW - micro-expression recognition
KW - multi-head attention
UR - http://www.scopus.com/inward/record.url?scp=85215962767&partnerID=8YFLogxK
U2 - 10.1109/ACCESS.2025.3530114
DO - 10.1109/ACCESS.2025.3530114
M3 - Article
AN - SCOPUS:85215962767
SN - 2169-3536
VL - 13
SP - 20054
EP - 20070
JO - IEEE Access
JF - IEEE Access
ER -