TY - JOUR
T1 - Convolutional neural network with spatial pyramid pooling for hand gesture recognition
AU - Tan, Yong Soon
AU - Lim, Kian Ming
AU - Tee, Connie
AU - Lee, Chin Poo
AU - Low, Cheng Yaw
N1 - Publisher Copyright: © 2020, Springer-Verlag London Ltd., part of Springer Nature.
PY - 2021/5
Y1 - 2021/5
N2 - Hand gestures provide a means for humans to interact. While hand gestures play a significant role in human–computer interaction, they also break down the communication barrier and simplify communication between the general public and the hearing-impaired community. This paper presents a convolutional neural network (CNN) integrated with spatial pyramid pooling (SPP), dubbed CNN–SPP, for vision-based hand gesture recognition. SPP mitigates the problem found in conventional pooling by stacking multi-level pooling to extend the features fed into a fully connected layer. Given inputs of varying sizes, SPP also yields a fixed-length feature representation. Extensive experiments have been conducted to evaluate CNN–SPP performance on two well-known American Sign Language (ASL) datasets and one NUS hand gesture dataset. Our empirical results show that CNN–SPP outperforms other deep learning-based approaches.
AB - Hand gestures provide a means for humans to interact. While hand gestures play a significant role in human–computer interaction, they also break down the communication barrier and simplify communication between the general public and the hearing-impaired community. This paper presents a convolutional neural network (CNN) integrated with spatial pyramid pooling (SPP), dubbed CNN–SPP, for vision-based hand gesture recognition. SPP mitigates the problem found in conventional pooling by stacking multi-level pooling to extend the features fed into a fully connected layer. Given inputs of varying sizes, SPP also yields a fixed-length feature representation. Extensive experiments have been conducted to evaluate CNN–SPP performance on two well-known American Sign Language (ASL) datasets and one NUS hand gesture dataset. Our empirical results show that CNN–SPP outperforms other deep learning-based approaches.
KW - Convolutional neural network (CNN)
KW - Hand gesture recognition
KW - Sign language recognition
KW - Spatial pyramid pooling (SPP)
UR - http://www.scopus.com/inward/record.url?scp=85091021465&partnerID=8YFLogxK
U2 - 10.1007/s00521-020-05337-0
DO - 10.1007/s00521-020-05337-0
M3 - Article
AN - SCOPUS:85091021465
SN - 0941-0643
VL - 33
SP - 5339
EP - 5351
JO - Neural Computing and Applications
JF - Neural Computing and Applications
IS - 10
ER -