An improved two-stream 3D convolutional neural network for human action recognition

Jun Chen, Yuanping Xu, Chaolong Zhang, Zhijie Xu, Xiangxiang Meng, Jie Wang

Research output: Chapter in Book or Report/Conference proceeding › Conference Proceeding › peer-review

15 Citations (Scopus)

Abstract

To obtain global contextual information precisely from videos with heavy camera motion and scene changes, this study proposes an improved spatiotemporal two-stream neural network architecture with a novel convolutional fusion layer. The study makes three main improvements: 1) a ResNet-101 network is integrated into each of the two streams of the target network independently; 2) the two kinds of feature maps (i.e., optical flow motion and RGB-channel information) produced by the corresponding convolution layers of the two streams are superimposed on each other; 3) temporal information is combined with spatial information by an integrated three-dimensional (3D) convolutional neural network (CNN) to extract more latent information from the videos. The proposed approach was tested on the UCF-101 and HMDB51 benchmark datasets, and the experimental results show that the proposed two-stream 3D CNN model substantially improves the recognition rate in video-based analysis.
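The fusion idea in the abstract (superimposing a spatial-stream feature map and a temporal-stream feature map, then applying a 3D convolution across them) can be illustrated with a small sketch. This is not the authors' implementation. The abstract does not specify the stacking axis, kernel size, padding, or channel counts, so every one of those choices below is an assumption. The function names `stack_streams` and `conv3d_valid`, the 3x3 toy feature maps, and the averaging kernel are hypothetical. The example is written in plain Python so that it runs without a deep learning framework.

```python
# Minimal sketch of two-stream fusion followed by a 3D convolution.
# Assumptions (not from the paper): the two feature maps are stacked along a
# new depth axis, the kernel is 2x2x2, padding is "valid", stride is 1, and
# there is a single channel with no bias term.


def stack_streams(rgb_map, flow_map):
    """Superimpose two H x W feature maps into one 2 x H x W volume."""
    same_rows = len(rgb_map) == len(flow_map)
    same_cols = all(len(r) == len(f) for r, f in zip(rgb_map, flow_map))
    if not (same_rows and same_cols):
        raise ValueError("feature maps must have the same shape")
    return [rgb_map, flow_map]


def conv3d_valid(volume, kernel):
    """Single-channel 3D cross-correlation with valid padding and stride 1."""
    depth, height, width = len(volume), len(volume[0]), len(volume[0][0])
    kd, kh, kw = len(kernel), len(kernel[0]), len(kernel[0][0])
    out = []
    for d in range(depth - kd + 1):
        plane = []
        for i in range(height - kh + 1):
            row = []
            for j in range(width - kw + 1):
                acc = 0.0
                # Sum over the kernel window, which spans both streams.
                for a in range(kd):
                    for b in range(kh):
                        for c in range(kw):
                            acc += volume[d + a][i + b][j + c] * kernel[a][b][c]
                row.append(acc)
            plane.append(row)
        out.append(plane)
    return out


# Hypothetical 3x3 feature maps from the spatial (RGB) and temporal (flow) streams.
rgb_features = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
flow_features = [[0.5, 0.5, 0.5], [0.5, 0.5, 0.5], [0.5, 0.5, 0.5]]

# A 2x2x2 averaging kernel standing in for learned fusion weights.
kernel = [[[0.125, 0.125], [0.125, 0.125]],
          [[0.125, 0.125], [0.125, 0.125]]]

fused = conv3d_valid(stack_streams(rgb_features, flow_features), kernel)
print(fused)  # [[[1.75, 2.25], [3.25, 3.75]]]
```

Because the kernel spans both slices of the stacked volume, each output value mixes appearance and motion responses from the same spatial neighbourhood. In a real model the feature maps would have many channels and frames, and a framework layer such as PyTorch's `torch.nn.Conv3d` would typically replace the hand-written loop.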

Original language: English
Title of host publication: ICAC 2019 - 2019 25th IEEE International Conference on Automation and Computing
Editors: Hui Yu
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781861376664
Publication status: Published - Sept 2019
Externally published: Yes
Event: 25th IEEE International Conference on Automation and Computing, ICAC 2019 - Lancaster, United Kingdom
Duration: 5 Sept 2019 to 7 Sept 2019

Publication series

Name: ICAC 2019 - 2019 25th IEEE International Conference on Automation and Computing

Conference

Conference: 25th IEEE International Conference on Automation and Computing, ICAC 2019
Country/Territory: United Kingdom
City: Lancaster
Period: 5/09/19 to 7/09/19

Keywords

  • Human Action Recognition
  • Optical Flow
  • Three-dimensional CNN
  • Two-stream CNN
