HAREDNet: A deep learning based architecture for autonomous video surveillance by recognizing human actions

Inzamam Mashood Nasir*, Mudassar Raza, Jamal Hussain Shah, Shui Hua Wang, Usman Tariq, Muhammad Attique Khan

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

21 Citations (Scopus)


Human Action Recognition (HAR) remains a significant research area because of its emerging real-time applications, such as video surveillance, automated surveillance, real-time tracking, and rescue missions. The HAR domain still has gaps to cover, namely random variations among humans, clothes, illumination, and backgrounds. Different camera settings, viewpoints, and inter-class similarities have further increased the complexity of this domain. In uncontrolled environments, these challenges have reduced the performance of many well-designed models. Redundant features and excessive computational time for training and prediction have also been a noteworthy problem. The primary objective of this research is to propose and design an automated recognition system that overcomes these issues. In this article, a hybrid recognition technique called HAREDNet is proposed, which comprises a) an Encoder-Decoder Network (EDNet) to extract deep features; b) improved Scale-Invariant Feature Transform (iSIFT), improved Gabor (iGabor), and Local Maximal Occurrence (LOMO) techniques to extract local features; c) a Cross-view Quadratic Discriminant Analysis (CvQDA) algorithm to reduce feature redundancy; and d) a weighted fusion strategy to merge the properties of the different essential features. The proposed technique is evaluated on three publicly available datasets, NTU RGB+D, HMDB51, and UCF-101, and achieves average recognition accuracies of 97.45%, 80.58%, and 97.48%, respectively, outperforming previously proposed methods.
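The abstract does not specify how the weighted fusion in step d) is computed. As a rough illustration only, the sketch below shows one common form of weighted feature fusion: each descriptor is L2-normalized, scaled by a weight, and the results are concatenated. The function names, the toy vectors, the weight values, and the normalize-then-concatenate scheme are all assumptions made for this example and are not taken from the paper.

```python
import math
from typing import Dict, List


def l2_normalize(vec: List[float]) -> List[float]:
    """Scale a vector to unit L2 norm; an all-zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return list(vec)
    return [x / norm for x in vec]


def weighted_fusion(features: Dict[str, List[float]],
                    weights: Dict[str, float]) -> List[float]:
    """Concatenate L2-normalized feature vectors, each scaled by its weight.

    Assumption (not from the paper): weights are rescaled to sum to 1.
    """
    if set(features) != set(weights):
        raise ValueError("features and weights must have the same keys")
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    fused: List[float] = []
    for name in sorted(features):  # fixed order keeps the layout reproducible
        w = weights[name] / total
        fused.extend(w * x for x in l2_normalize(features[name]))
    return fused


# Hypothetical toy vectors standing in for deep and local descriptors.
features = {
    "ednet": [3.0, 4.0],
    "isift": [1.0, 0.0, 0.0],
    "lomo": [0.0, 2.0],
}
weights = {"ednet": 0.5, "isift": 0.25, "lomo": 0.25}  # illustrative values
fused = weighted_fusion(features, weights)
```

Normalizing before weighting keeps a descriptor with large raw magnitudes from dominating the fused vector, so the weights alone control each source's contribution. In a full pipeline the weights would presumably be tuned on validation data.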

Original language: English
Article number: 107805
Journal: Computers and Electrical Engineering
Publication status: Published - Apr 2022
Externally published: Yes


  • CvQDA
  • Deep Convolutional Neural Network
  • Encoder-Decoder CNN architecture
  • Human Action Recognition
  • Weighted fusion

