TY - GEN
T1 - MPSSD
T2 - 2019 International Joint Conference on Neural Networks, IJCNN 2019
AU - Qu, Shuyi
AU - Huang, Kaizhu
AU - Hussain, Amir
AU - Goulermas, Yannis
N1 - Publisher Copyright: © 2019 IEEE.
PY - 2019/7
Y1 - 2019/7
N2 - Recent prevalent one-stage detectors, such as the Single Shot Detector (SSD) and RetinaNet, detect objects faster than two-stage detectors while maintaining comparable accuracy. To further boost accuracy, many studies focus on enhancing the multi-scale feature pyramid. Most of these proposals strengthen the features of a single pyramid, ignoring the rich connections among features at different scales. In contrast, we propose a novel multi-path design to fully utilize both localization and semantic information. First, we use the original SSD multi-scale features as our base pyramid. Then, we fuse these features in different groups to generate multi-path feature pyramids. Finally, we combine these pyramids through a novel and effective aggregation module to obtain the final informative pyramid for detection. Comparative experiments on the PASCAL VOC and MS COCO benchmark datasets show that our proposed method outperforms many state-of-the-art detectors. As an illustrative example, for an input image of size 512×512, we achieve a mean Average Precision (mAP) of 81.8% on VOC2007 test and 33.1% mAP on COCO test-dev2015.
AB - Recent prevalent one-stage detectors, such as the Single Shot Detector (SSD) and RetinaNet, detect objects faster than two-stage detectors while maintaining comparable accuracy. To further boost accuracy, many studies focus on enhancing the multi-scale feature pyramid. Most of these proposals strengthen the features of a single pyramid, ignoring the rich connections among features at different scales. In contrast, we propose a novel multi-path design to fully utilize both localization and semantic information. First, we use the original SSD multi-scale features as our base pyramid. Then, we fuse these features in different groups to generate multi-path feature pyramids. Finally, we combine these pyramids through a novel and effective aggregation module to obtain the final informative pyramid for detection. Comparative experiments on the PASCAL VOC and MS COCO benchmark datasets show that our proposed method outperforms many state-of-the-art detectors. As an illustrative example, for an input image of size 512×512, we achieve a mean Average Precision (mAP) of 81.8% on VOC2007 test and 33.1% mAP on COCO test-dev2015.
KW - Fusion
KW - Multiple Path
KW - Object Detection
KW - SSD
UR - http://www.scopus.com/inward/record.url?scp=85073255773&partnerID=8YFLogxK
U2 - 10.1109/IJCNN.2019.8852053
DO - 10.1109/IJCNN.2019.8852053
M3 - Conference Proceeding
AN - SCOPUS:85073255773
T3 - Proceedings of the International Joint Conference on Neural Networks
BT - 2019 International Joint Conference on Neural Networks, IJCNN 2019
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 14 July 2019 through 19 July 2019
ER -