Position and Orientation Aware One-Shot Learning for Medical Action Recognition From Signal Data

Leiyu Xie; Yuxing Yang; Zeyu Fu; Syed Mohsen Naqvi

doi:10.1109/TMM.2024.3521703

Leiyu Xie^*, Yuxing Yang, Zeyu Fu, Syed Mohsen Naqvi

^*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

2 Citations (Scopus)

Abstract

In this article, we propose a position- and orientation-aware one-shot learning framework for medical action recognition from signal data. The framework comprises two stages; each stage includes signal-level image generation (SIG), cross-attention (CsA), and dynamic time warping (DTW) modules, together with information fusion between the proposed privacy-preserved position and orientation features. The SIG method transforms the raw skeleton data into privacy-preserved features for training. The CsA module guides the network to reduce medical action recognition bias and to focus more on the body parts that matter for each specific action, with the aim of addressing issues related to similar medical actions. The DTW module minimizes temporal mismatch between instances and further improves model performance. In addition, the privacy-preserved orientation-level features assist the position-level features in both stages to enhance medical action recognition performance. Extensive experiments on the widely used NTU RGB+D 60, NTU RGB+D 120, and PKU-MMD datasets demonstrate the effectiveness of the proposed method, which outperforms other state-of-the-art methods under general dataset partitioning by 2.7%, 6.2%, and 4.1%, respectively.
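The abstract says a DTW module reduces temporal mismatch between instances but gives no details. As a rough illustration of the general idea only, the following is a minimal sketch of textbook dynamic time warping on two 1-D sequences. It is not the authors' module: the function name `dtw_distance`, the absolute-difference local cost, and the use of scalar sequences rather than skeleton features are all assumptions made for this example.

```python
import math


def dtw_distance(a, b):
    """Textbook DTW distance between two 1-D numeric sequences.

    Illustrative assumption: the local cost is the absolute difference.
    The paper's DTW module may use a different cost and feature space.
    """
    n, m = len(a), len(b)
    # cost[i][j] holds the minimal accumulated cost of aligning a[:i] with b[:j].
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b repeats its sample
                cost[i][j - 1],      # b advances, a repeats its sample
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


if __name__ == "__main__":
    # The second sequence repeats one sample; DTW absorbs the repetition.
    print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))
```

Because the warping path may repeat a sample from either sequence, two instances of the same pattern performed at different speeds can align at low cost, which is the sense in which DTW reduces temporal mismatch.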

Original language	English
Pages (from-to)	1860-1873
Number of pages	14
Journal	IEEE Transactions on Multimedia
Volume	27
DOIs	https://doi.org/10.1109/TMM.2024.3521703
Publication status	Published - 2025
Externally published	Yes

Keywords

One-shot learning
attention mechanism
feature fusion
healthcare
medical action recognition

Access to Document

10.1109/TMM.2024.3521703

Cite this

@article{29987b647fe24f5bab021897444a5b22,

title = "Position and Orientation Aware One-Shot Learning for Medical Action Recognition From Signal Data",

abstract = "In this article, we propose a position and orientation-aware one-shot learning framework for medical action recognition from signal data. The proposed framework comprises two stages and each stage includes signal-level image generation (SIG), cross-attention (CsA), and dynamic time warping (DTW) modules and the information fusion between the proposed privacy-preserved position and orientation features. The proposed SIG method aims to transform the raw skeleton data into privacy-preserved features for training. The CsA module is developed to guide the network in reducing medical action recognition bias and more focusing on important human body parts for each specific action, aimed at addressing similar medical action related issues. Moreover, the DTW module is employed to minimize temporal mismatching between instances and further improve model performance. Furthermore, the proposed privacy-preserved orientation-level features are utilized to assist the position-level features in both of the two stages for enhancing medical action recognition performance. Extensive experimental results on the widely-used and well-known NTU RGB+D 60, NTU RGB+D 120, and PKU-MMD datasets all demonstrate the effectiveness of the proposed method, which outperforms the other state-of-the-art methods with general dataset partitioning by 2.7%, 6.2% and 4.1%, respectively.",

keywords = "One-shot learning, attention mechanism, feature fusion, healthcare, medical action recognition",

author = "Leiyu Xie and Yuxing Yang and Zeyu Fu and Naqvi, {Syed Mohsen}",

note = "Publisher Copyright: {\textcopyright} 1999-2012 IEEE.",

year = "2025",

doi = "10.1109/TMM.2024.3521703",

language = "English",

volume = "27",

pages = "1860--1873",

journal = "IEEE Transactions on Multimedia",

issn = "1520-9210",

}

TY - JOUR

T1 - Position and Orientation Aware One-Shot Learning for Medical Action Recognition From Signal Data

AU - Xie, Leiyu

AU - Yang, Yuxing

AU - Fu, Zeyu

AU - Naqvi, Syed Mohsen

PY - 2025

Y1 - 2025

N2 - In this article, we propose a position and orientation-aware one-shot learning framework for medical action recognition from signal data. The proposed framework comprises two stages and each stage includes signal-level image generation (SIG), cross-attention (CsA), and dynamic time warping (DTW) modules and the information fusion between the proposed privacy-preserved position and orientation features. The proposed SIG method aims to transform the raw skeleton data into privacy-preserved features for training. The CsA module is developed to guide the network in reducing medical action recognition bias and more focusing on important human body parts for each specific action, aimed at addressing similar medical action related issues. Moreover, the DTW module is employed to minimize temporal mismatching between instances and further improve model performance. Furthermore, the proposed privacy-preserved orientation-level features are utilized to assist the position-level features in both of the two stages for enhancing medical action recognition performance. Extensive experimental results on the widely-used and well-known NTU RGB+D 60, NTU RGB+D 120, and PKU-MMD datasets all demonstrate the effectiveness of the proposed method, which outperforms the other state-of-the-art methods with general dataset partitioning by 2.7%, 6.2% and 4.1%, respectively.

AB - In this article, we propose a position and orientation-aware one-shot learning framework for medical action recognition from signal data. The proposed framework comprises two stages and each stage includes signal-level image generation (SIG), cross-attention (CsA), and dynamic time warping (DTW) modules and the information fusion between the proposed privacy-preserved position and orientation features. The proposed SIG method aims to transform the raw skeleton data into privacy-preserved features for training. The CsA module is developed to guide the network in reducing medical action recognition bias and more focusing on important human body parts for each specific action, aimed at addressing similar medical action related issues. Moreover, the DTW module is employed to minimize temporal mismatching between instances and further improve model performance. Furthermore, the proposed privacy-preserved orientation-level features are utilized to assist the position-level features in both of the two stages for enhancing medical action recognition performance. Extensive experimental results on the widely-used and well-known NTU RGB+D 60, NTU RGB+D 120, and PKU-MMD datasets all demonstrate the effectiveness of the proposed method, which outperforms the other state-of-the-art methods with general dataset partitioning by 2.7%, 6.2% and 4.1%, respectively.

KW - One-shot learning

KW - attention mechanism

KW - feature fusion

KW - healthcare

KW - medical action recognition

UR - http://www.scopus.com/inward/record.url?scp=85213710861&partnerID=8YFLogxK

U2 - 10.1109/TMM.2024.3521703

DO - 10.1109/TMM.2024.3521703

M3 - Article

AN - SCOPUS:85213710861

SN - 1520-9210

VL - 27

SP - 1860

EP - 1873

JO - IEEE Transactions on Multimedia

JF - IEEE Transactions on Multimedia

ER -
