Position and Orientation Aware One-Shot Learning for Medical Action Recognition from Signal Data

Leiyu Xie; Yuxing Yang; Zeyu Fu; Syed Mohsen Naqvi

doi:10.1109/TMM.2024.3521703

Leiyu Xie^*, Yuxing Yang, Zeyu Fu, Syed Mohsen Naqvi

^*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

In this work, we propose a position and orientation-aware one-shot learning framework for medical action recognition from signal data. The proposed framework comprises two stages, each of which includes signal-level image generation (SIG), cross-attention (CsA), and dynamic time warping (DTW) modules, together with information fusion between the proposed privacy-preserved position and orientation features. The proposed SIG method aims to transform the raw skeleton data into privacy-preserved features for training. The CsA module is developed to guide the network to reduce medical action recognition bias and focus more on the important human body parts for each specific action, with the aim of addressing issues related to similar medical actions. Moreover, the DTW module is employed to minimize temporal mismatching between instances and further improve model performance. Furthermore, the proposed privacy-preserved orientation-level features are utilized to assist the position-level features in both stages to enhance medical action recognition performance. Extensive experimental results on the widely used NTU RGB+D 60, NTU RGB+D 120, and PKU-MMD datasets demonstrate the effectiveness of the proposed method, which outperforms other state-of-the-art methods under general dataset partitioning by 2.7%, 6.2%, and 4.1%, respectively.
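
For readers unfamiliar with dynamic time warping, the following is a minimal sketch of the classic DTW distance between two one-dimensional sequences. It is the generic textbook recurrence, not the DTW module of the proposed framework, whose details are not given in the abstract. The function name `dtw_distance` and the absolute-difference local cost are assumptions made purely for illustration.

```python
import math
from typing import Sequence


def dtw_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Classic dynamic time warping distance between two 1-D sequences.

    Illustrative only: the local cost |a_i - b_j| is an assumption,
    not something stated in the abstract.
    """
    n, m = len(a), len(b)
    if n == 0 or m == 0:
        raise ValueError("both sequences must be non-empty")

    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible predecessor alignments.
            cost[i][j] = local + min(
                cost[i - 1][j],      # a[i-1] is matched to b[j-1] again
                cost[i][j - 1],      # b[j-1] is matched to a[i-1] again
                cost[i - 1][j - 1],  # both sequences advance
            )
    return cost[n][m]
```

Because the warping path may repeat elements, two sequences that differ only in timing, such as `[1, 2, 3]` and `[1, 2, 2, 3]`, align at zero cost. This is the sense in which DTW reduces temporal mismatch between instances.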

Original language	English
Journal	IEEE Transactions on Multimedia
DOIs	https://doi.org/10.1109/TMM.2024.3521703
Publication status	Accepted/In press - 2024
Externally published	Yes

Keywords

attention mechanism
feature fusion
healthcare
medical action recognition
One-shot learning

Cite this

@article{29987b647fe24f5bab021897444a5b22,
title = "Position and Orientation Aware One-Shot Learning for Medical Action Recognition from Signal Data",
abstract = "In this work, we propose a position and orientation-aware one-shot learning framework for medical action recognition from signal data. The proposed framework comprises two stages, each of which includes signal-level image generation (SIG), cross-attention (CsA), and dynamic time warping (DTW) modules, together with information fusion between the proposed privacy-preserved position and orientation features. The proposed SIG method aims to transform the raw skeleton data into privacy-preserved features for training. The CsA module is developed to guide the network to reduce medical action recognition bias and focus more on the important human body parts for each specific action, with the aim of addressing issues related to similar medical actions. Moreover, the DTW module is employed to minimize temporal mismatching between instances and further improve model performance. Furthermore, the proposed privacy-preserved orientation-level features are utilized to assist the position-level features in both stages to enhance medical action recognition performance. Extensive experimental results on the widely used NTU RGB+D 60, NTU RGB+D 120, and PKU-MMD datasets demonstrate the effectiveness of the proposed method, which outperforms other state-of-the-art methods under general dataset partitioning by 2.7%, 6.2%, and 4.1%, respectively.",
keywords = "attention mechanism, feature fusion, healthcare, medical action recognition, One-shot learning",
author = "Leiyu Xie and Yuxing Yang and Zeyu Fu and Naqvi, {Syed Mohsen}",
note = "Publisher Copyright: {\textcopyright} 1999-2012 IEEE.",
year = "2024",
doi = "10.1109/TMM.2024.3521703",
language = "English",
journal = "IEEE Transactions on Multimedia",
issn = "1520-9210",
}

TY - JOUR
T1 - Position and Orientation Aware One-Shot Learning for Medical Action Recognition from Signal Data
AU - Xie, Leiyu
AU - Yang, Yuxing
AU - Fu, Zeyu
AU - Naqvi, Syed Mohsen
PY - 2024
Y1 - 2024
N2 - In this work, we propose a position and orientation-aware one-shot learning framework for medical action recognition from signal data. The proposed framework comprises two stages, each of which includes signal-level image generation (SIG), cross-attention (CsA), and dynamic time warping (DTW) modules, together with information fusion between the proposed privacy-preserved position and orientation features. The proposed SIG method aims to transform the raw skeleton data into privacy-preserved features for training. The CsA module is developed to guide the network to reduce medical action recognition bias and focus more on the important human body parts for each specific action, with the aim of addressing issues related to similar medical actions. Moreover, the DTW module is employed to minimize temporal mismatching between instances and further improve model performance. Furthermore, the proposed privacy-preserved orientation-level features are utilized to assist the position-level features in both stages to enhance medical action recognition performance. Extensive experimental results on the widely used NTU RGB+D 60, NTU RGB+D 120, and PKU-MMD datasets demonstrate the effectiveness of the proposed method, which outperforms other state-of-the-art methods under general dataset partitioning by 2.7%, 6.2%, and 4.1%, respectively.
AB - In this work, we propose a position and orientation-aware one-shot learning framework for medical action recognition from signal data. The proposed framework comprises two stages, each of which includes signal-level image generation (SIG), cross-attention (CsA), and dynamic time warping (DTW) modules, together with information fusion between the proposed privacy-preserved position and orientation features. The proposed SIG method aims to transform the raw skeleton data into privacy-preserved features for training. The CsA module is developed to guide the network to reduce medical action recognition bias and focus more on the important human body parts for each specific action, with the aim of addressing issues related to similar medical actions. Moreover, the DTW module is employed to minimize temporal mismatching between instances and further improve model performance. Furthermore, the proposed privacy-preserved orientation-level features are utilized to assist the position-level features in both stages to enhance medical action recognition performance. Extensive experimental results on the widely used NTU RGB+D 60, NTU RGB+D 120, and PKU-MMD datasets demonstrate the effectiveness of the proposed method, which outperforms other state-of-the-art methods under general dataset partitioning by 2.7%, 6.2%, and 4.1%, respectively.
KW - attention mechanism
KW - feature fusion
KW - healthcare
KW - medical action recognition
KW - One-shot learning
UR - http://www.scopus.com/inward/record.url?scp=85213710861&partnerID=8YFLogxK
U2 - 10.1109/TMM.2024.3521703
DO - 10.1109/TMM.2024.3521703
M3 - Article
AN - SCOPUS:85213710861
SN - 1520-9210
JO - IEEE Transactions on Multimedia
JF - IEEE Transactions on Multimedia
ER -
