TY - GEN
T1 - One-Shot Medical Action Recognition With A Cross-Attention Mechanism And Dynamic Time Warping
AU - Xie, Leiyu
AU - Yang, Yuxing
AU - Fu, Zeyu
AU - Naqvi, Syed Mohsen
N1 - Publisher Copyright: © 2023 IEEE.
PY - 2023
Y1 - 2023
N2 - In this paper, we address the classification of medical actions from a single sample by developing a novel one-shot learning framework that contains both cross-attention and dynamic time warping (DTW) modules. Specifically, we first transform the raw skeleton sequence into a signal-level image representation. We adopt a metric learning approach, the prototypical network, for the proposed one-shot learning framework and choose the widely used residual network (ResNet18) as the backbone. Cross-attention is applied to guide the network to focus on the more important joints for each specific action. The cross-attention mechanism, applied between the support and query sets, is adapted for mining and matching relationships within the human body. Furthermore, a DTW module is introduced to mitigate the temporal mismatch between actions from the support and query sets. Experimental results on the NTU RGB+D 120 dataset demonstrate the effectiveness of the proposed approach and its improved performance over the baseline approach.
KW - cross-attention
KW - healthcare
KW - medical action classification
KW - one-shot learning
KW - signal representation
UR - http://www.scopus.com/inward/record.url?scp=86000389417&partnerID=8YFLogxK
U2 - 10.1109/ICASSP49357.2023.10097186
DO - 10.1109/ICASSP49357.2023.10097186
M3 - Conference Proceeding
AN - SCOPUS:86000389417
T3 - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
BT - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 48th IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2023
Y2 - 4 June 2023 through 10 June 2023
ER -