TY - JOUR
T1 - Hybrid channel based pedestrian detection
AU - Tesema, Fiseha B.
AU - Wu, Hong
AU - Chen, Mingjian
AU - Lin, Junpeng
AU - Zhu, William
AU - Huang, Kaizhu
N1 - Funding Information:
We gratefully acknowledge support by the National Natural Science Foundation of China under Grant Nos. 61772120 and 61876155.
Publisher Copyright:
© 2020 Elsevier B.V.
PY - 2020/5/14
Y1 - 2020/5/14
AB - Pedestrian detection has improved greatly with the help of Convolutional Neural Networks (CNNs). CNNs can learn high-level features from input images, but the insufficient spatial resolution of CNN feature channels (feature maps) may cause a loss of information, which is especially harmful for small instances. In this paper, we propose a new pedestrian detection framework that extends the successful RPN+BF framework to combine handcrafted features and CNN features. RoI-pooling is used to extract features from both handcrafted channels (e.g. HOG+LUV, CheckerBoards or RotatedFilters) and CNN channels. Since handcrafted channels always have higher spatial resolution than CNN channels, we apply RoI-pooling with a larger output resolution to handcrafted channels to preserve more detailed information. Our ablation experiments show that the developed handcrafted features can reach better detection accuracy than the CNN features extracted from the VGG-16 net, and that a performance gain can be achieved by combining them. Experimental results on the Caltech pedestrian dataset with both the original and the improved annotations demonstrate the effectiveness of the proposed approach. With a more advanced RPN in our framework, our approach improves further and achieves competitive results on both benchmarks.
KW - CNN feature channels
KW - Feature combination
KW - Handcrafted features channels
KW - Pedestrian detection
KW - RoI-pooling
UR - http://www.scopus.com/inward/record.url?scp=85078719493&partnerID=8YFLogxK
U2 - 10.1016/j.neucom.2019.12.110
DO - 10.1016/j.neucom.2019.12.110
M3 - Article
AN - SCOPUS:85078719493
SN - 0925-2312
VL - 389
SP - 1
EP - 8
JO - Neurocomputing
JF - Neurocomputing
ER -