TY - JOUR
T1 - H2GCN: A hybrid hypergraph convolution network for skeleton-based action recognition
T2 - Journal of King Saud University - Computer and Information Sciences
AU - Shao, Yiming
AU - Mao, Lintao
AU - Ye, Leixiong
AU - Li, Jincheng
AU - Yang, Ping
AU - Ji, Chengtao
AU - Wu, Zizhao
N1 - Publisher Copyright:
© 2024 The Authors
PY - 2024/6
Y1 - 2024/6
N2 - Recent GCN-based works have achieved remarkable results for skeleton-based human action recognition. Nevertheless, while existing approaches extensively investigate pairwise joint relationships, only a limited number of models explore the intricate, high-order relationships among multiple joints. In this paper, we propose a novel hypergraph convolution method that represents the relationships among multiple joints with hyperedges, and dynamically refines the high-order relationships among hyperedges in the spatial, temporal, and channel dimensions. Specifically, our method begins with a temporal-channel refinement hypergraph convolutional network, dynamically learning temporal and channel topologies in a data-dependent manner, which facilitates the capture of non-physical structural information inherent in the human body. Furthermore, to model various inter-joint relationships across spatio-temporal dimensions, we propose a spatio-temporal hypergraph joint module, which aims to encapsulate the dynamic spatial–temporal characteristics of the human body. Through the integration of these modules, our proposed model achieves state-of-the-art performance on the NTU RGB+D 60 and NTU RGB+D 120 datasets.
AB - Recent GCN-based works have achieved remarkable results for skeleton-based human action recognition. Nevertheless, while existing approaches extensively investigate pairwise joint relationships, only a limited number of models explore the intricate, high-order relationships among multiple joints. In this paper, we propose a novel hypergraph convolution method that represents the relationships among multiple joints with hyperedges, and dynamically refines the high-order relationships among hyperedges in the spatial, temporal, and channel dimensions. Specifically, our method begins with a temporal-channel refinement hypergraph convolutional network, dynamically learning temporal and channel topologies in a data-dependent manner, which facilitates the capture of non-physical structural information inherent in the human body. Furthermore, to model various inter-joint relationships across spatio-temporal dimensions, we propose a spatio-temporal hypergraph joint module, which aims to encapsulate the dynamic spatial–temporal characteristics of the human body. Through the integration of these modules, our proposed model achieves state-of-the-art performance on the NTU RGB+D 60 and NTU RGB+D 120 datasets.
KW - Action recognition
KW - Hypergraph convolution network
KW - Spatio-temporal modeling
UR - http://www.scopus.com/inward/record.url?scp=85194735049&partnerID=8YFLogxK
U2 - 10.1016/j.jksuci.2024.102072
DO - 10.1016/j.jksuci.2024.102072
M3 - Article
AN - SCOPUS:85194735049
SN - 1319-1578
VL - 36
JO - Journal of King Saud University - Computer and Information Sciences
JF - Journal of King Saud University - Computer and Information Sciences
IS - 5
M1 - 102072
ER -