Fingerspelling Recognition by 12-Layer CNN with Stochastic Pooling

Yu Dong Zhang, Xianwei Jiang, Shui Hua Wang*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

4 Citations (Scopus)


Fingerspelling is a method of spelling words via hand movements. This study proposes a novel fingerspelling recognition system. Our dataset contains 1320 fingerspelling images. The method uses a 12-layer convolutional neural network (CNN) as its backbone. Stochastic pooling (SP) is used to help solve the problems caused by max pooling and average pooling, and an improved 20-way data augmentation method is proposed to mitigate overfitting. We dub the method CNNSP. CNNSP achieved a micro-averaged F1 (MAF) score of 90.04 ± 0.82%. In contrast, the MAFs of l2-pooling, average pooling, and max pooling are 86.21 ± 1.12%, 87.54 ± 1.39%, and 89.07 ± 0.78%, respectively. SP therefore outperforms all three alternative pooling methods, and CNNSP attains better results than eight state-of-the-art fingerspelling recognition methods.
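
The abstract does not describe how SP is implemented. As a rough illustration only, the sketch below follows the commonly used formulation of stochastic pooling: within each pooling window, one activation is sampled with probability proportional to its value during training, and the probability-weighted average is used at inference. The function name `stochastic_pool2d`, its parameters, the 2 × 2 non-overlapping windows, and the uniform fallback for all-zero windows are assumptions made for this example and are not taken from the paper.

```python
import numpy as np

def stochastic_pool2d(x, size=2, rng=None, training=True):
    """Stochastic pooling over non-overlapping size x size windows (hypothetical helper).

    x is a 2-D array of non-negative activations (e.g. after ReLU) whose
    height and width are assumed to be multiples of `size`.
    """
    rng = np.random.default_rng() if rng is None else rng
    h, w = x.shape
    # Rearrange so that the last axis holds the values of one pooling window.
    windows = (x.reshape(h // size, size, w // size, size)
                .transpose(0, 2, 1, 3)
                .reshape(h // size, w // size, size * size))
    totals = windows.sum(axis=-1, keepdims=True)
    # Probabilities proportional to activation; all-zero windows fall back
    # to a uniform distribution (an assumption of this sketch).
    safe_totals = np.where(totals > 0, totals, 1.0)
    probs = np.where(totals > 0, windows / safe_totals, 1.0 / (size * size))
    if not training:
        # Inference: probability-weighted average of each window.
        return (probs * windows).sum(axis=-1)
    out = np.empty(windows.shape[:2], dtype=float)
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # Training: sample one position per window from its distribution.
            k = rng.choice(size * size, p=probs[i, j])
            out[i, j] = windows[i, j, k]
    return out
```

Unlike max pooling, which always keeps the largest activation, or average pooling, which weights all activations equally, this scheme keeps larger activations more often while still occasionally passing smaller ones, which is why it is usually described as a regularizer.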

Original language: English
Journal: Mobile Networks and Applications
Publication status: Accepted/In press - 2022
Externally published: Yes


  • Convolutional neural network
  • Data augmentation
  • Deep learning
  • Fingerspelling recognition
  • Stochastic pooling

