Hybrid channel based pedestrian detection

Fiseha B. Tesema; Hong Wu; Mingjian Chen; Junpeng Lin; William Zhu; Kaizhu Huang

doi:10.1016/j.neucom.2019.12.110

Fiseha B. Tesema, Hong Wu^*, Mingjian Chen, Junpeng Lin, William Zhu, Kaizhu Huang

^*Corresponding author for this work

School of Advanced Technology

Research output: Contribution to journal › Article › peer-review

26 Citations (Scopus)

Abstract

Pedestrian detection has improved greatly with the help of Convolutional Neural Networks (CNNs). CNNs can learn high-level features from input images, but the insufficient spatial resolution of CNN feature channels (feature maps) may cause a loss of information, which is especially harmful for small instances. In this paper, we propose a new pedestrian detection framework that extends the successful RPN+BF framework to combine handcrafted features and CNN features. RoI-pooling is used to extract features from both handcrafted channels (e.g. HOG+LUV, CheckerBoards or RotatedFilters) and CNN channels. Since handcrafted channels always have higher spatial resolution than CNN channels, we apply RoI-pooling with a larger output resolution to handcrafted channels to keep more detailed information. Our ablation experiments show that the developed handcrafted features can reach better detection accuracy than the CNN features extracted from the VGG-16 net, and that a performance gain can be achieved by combining them. Experimental results on the Caltech pedestrian dataset with both the original annotations and the improved annotations demonstrate the effectiveness of the proposed approach. When a more advanced RPN is used in our framework, our approach improves further and achieves competitive results on both benchmarks.
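The core idea in the abstract, pooling the same proposal from two channel stacks at different output resolutions and concatenating the results, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function `roi_max_pool`, the strides (4 for handcrafted channels, 8 for CNN channels), the channel counts, and the output sizes (14 and 7) are all assumptions chosen for illustration, and random arrays stand in for real HOG+LUV and VGG-16 feature maps.

```python
import numpy as np

def roi_max_pool(channels, box, out_size, stride):
    """Max-pool one RoI from a (C, H, W) channel stack into (C, out_size, out_size).

    box is (x1, y1, x2, y2) in image pixels; stride is the assumed
    downsampling factor of `channels` relative to the input image.
    """
    c, h, w = channels.shape
    # Project the box onto the channel grid and clamp it to valid indices.
    x1 = min(max(int(np.floor(box[0] / stride)), 0), w - 1)
    y1 = min(max(int(np.floor(box[1] / stride)), 0), h - 1)
    x2 = min(max(int(np.ceil(box[2] / stride)), x1 + 1), w)
    y2 = min(max(int(np.ceil(box[3] / stride)), y1 + 1), h)
    # Split the projected region into out_size x out_size bins.
    xs = np.linspace(x1, x2, out_size + 1)
    ys = np.linspace(y1, y2, out_size + 1)
    out = np.empty((c, out_size, out_size), dtype=channels.dtype)
    for i in range(out_size):
        r0 = int(np.floor(ys[i]))
        r1 = max(int(np.ceil(ys[i + 1])), r0 + 1)  # at least one row per bin
        for j in range(out_size):
            c0 = int(np.floor(xs[j]))
            c1 = max(int(np.ceil(xs[j + 1])), c0 + 1)  # at least one column
            out[:, i, j] = channels[:, r0:r1, c0:c1].max(axis=(1, 2))
    return out

rng = np.random.default_rng(0)
image_h, image_w = 480, 640

# Stand-ins for real feature maps (all sizes here are assumptions):
hand = rng.random((10, image_h // 4, image_w // 4)).astype(np.float32)   # handcrafted, stride 4
cnn = rng.random((256, image_h // 8, image_w // 8)).astype(np.float32)   # CNN, stride 8

box = (200, 120, 241, 220)  # a small pedestrian proposal, in image pixels

# Larger output grid for the higher-resolution handcrafted channels.
hand_feat = roi_max_pool(hand, box, out_size=14, stride=4)
cnn_feat = roi_max_pool(cnn, box, out_size=7, stride=8)

# Hybrid descriptor: concatenation of both pooled feature sets.
hybrid = np.concatenate([hand_feat.ravel(), cnn_feat.ravel()])
print(hand_feat.shape, cnn_feat.shape, hybrid.shape)
```

In an RPN+BF-style pipeline, a vector like `hybrid` would be passed to the boosted forest classifier for each proposal; that stage is omitted here. The point of the sketch is only that the finer 14x14 grid preserves more spatial detail from the handcrafted channels than the 7x7 grid does from the coarser CNN channels.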

Original language	English
Pages (from-to)	1-8
Number of pages	8
Journal	Neurocomputing
Volume	389
DOIs	https://doi.org/10.1016/j.neucom.2019.12.110
Publication status	Published - 14 May 2020

Keywords

CNN feature channels
Feature combination
Handcrafted features channels
Pedestrian detection
RoI-pooling

Cite this

@article{64eb0a44ac79421bbde0603774a87d0f,
title = "Hybrid channel based pedestrian detection",
abstract = "Pedestrian detection has achieved great improvements with the help of Convolutional Neural Networks (CNNs). CNN can learn high-level features from input images, but the insufficient spatial resolution of CNN feature channels (feature maps) may cause a loss of information, which is harmful especially to small instances. In this paper, we propose a new pedestrian detection framework, which extends the successful RPN+BF framework to combine handcrafted features and CNN features. RoI-pooling is used to extract features from both handcrafted channels (e.g. HOG+LUV, CheckerBoards or RotatedFilters) and CNN channels. Since handcrafted channels always have higher spatial resolution than CNN channels, we apply RoI-pooling with larger output resolution to handcrafted channels to keep more detailed information. Our ablation experiments show that the developed handcrafted features can reach better detection accuracy than the CNN features extracted from the VGG-16 net, and a performance gain can be achieved by combining them. Experimental results on Caltech pedestrian dataset with the original annotations and the improved annotations demonstrate the effectiveness of the proposed approach. When using a more advanced RPN in our framework, our approach can be further improved and get competitive results on both benchmarks.",
keywords = "CNN feature channels, Feature combination, Handcrafted features channels, Pedestrian detection, RoI-pooling",
author = "Tesema, {Fiseha B.} and Hong Wu and Mingjian Chen and Junpeng Lin and William Zhu and Kaizhu Huang",
note = "Funding Information: We greatly acknowledge support by the National Natural Science Foundation of China under Grant Nos. 61772120 and 61876155. Publisher Copyright: {\textcopyright} 2020 Elsevier B.V.",
year = "2020",
month = may,
day = "14",
doi = "10.1016/j.neucom.2019.12.110",
language = "English",
volume = "389",
pages = "1--8",
journal = "Neurocomputing",
issn = "0925-2312",
}

TY - JOUR
T1 - Hybrid channel based pedestrian detection
AU - Tesema, Fiseha B.
AU - Wu, Hong
AU - Chen, Mingjian
AU - Lin, Junpeng
AU - Zhu, William
AU - Huang, Kaizhu
PY - 2020/5/14
Y1 - 2020/5/14
N2 - Pedestrian detection has achieved great improvements with the help of Convolutional Neural Networks (CNNs). CNN can learn high-level features from input images, but the insufficient spatial resolution of CNN feature channels (feature maps) may cause a loss of information, which is harmful especially to small instances. In this paper, we propose a new pedestrian detection framework, which extends the successful RPN+BF framework to combine handcrafted features and CNN features. RoI-pooling is used to extract features from both handcrafted channels (e.g. HOG+LUV, CheckerBoards or RotatedFilters) and CNN channels. Since handcrafted channels always have higher spatial resolution than CNN channels, we apply RoI-pooling with larger output resolution to handcrafted channels to keep more detailed information. Our ablation experiments show that the developed handcrafted features can reach better detection accuracy than the CNN features extracted from the VGG-16 net, and a performance gain can be achieved by combining them. Experimental results on Caltech pedestrian dataset with the original annotations and the improved annotations demonstrate the effectiveness of the proposed approach. When using a more advanced RPN in our framework, our approach can be further improved and get competitive results on both benchmarks.
KW - CNN feature channels
KW - Feature combination
KW - Handcrafted features channels
KW - Pedestrian detection
KW - RoI-pooling
UR - http://www.scopus.com/inward/record.url?scp=85078719493&partnerID=8YFLogxK
U2 - 10.1016/j.neucom.2019.12.110
DO - 10.1016/j.neucom.2019.12.110
M3 - Article
AN - SCOPUS:85078719493
SN - 0925-2312
VL - 389
SP - 1
EP - 8
JO - Neurocomputing
JF - Neurocomputing
ER -
