Segmentation mask guided end-to-end person search

Dingyuan Zheng; Jimin Xiao; Kaizhu Huang; Yao Zhao

doi:10.1016/j.image.2020.115876

Dingyuan Zheng, Jimin Xiao*, Kaizhu Huang, Yao Zhao

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

13 Citations (Scopus)

Abstract

Person search aims to find a target person among images recorded by multiple surveillance cameras, and it faces challenges from both pedestrian detection and person re-identification. Besides the large intra-class variations caused by differing illumination conditions, occlusions and poses, background clutter in the detected pedestrian bounding boxes further degrades the features extracted for each person, making them less discriminative. To tackle these problems, we develop a novel approach that guides the network with segmentation masks so that it learns discriminative features that are invariant to background clutter. We show that joint optimization of pedestrian detection, person re-identification and pedestrian segmentation produces more discriminative pedestrian features and consequently leads to better person search performance. Extensive experiments on two widely used benchmark datasets demonstrate the superiority of our approach. In particular, our proposed model achieves state-of-the-art performance (86.3% mAP and 86.5% top-1 accuracy) on the CUHK-SYSU dataset.

Original language	English
Article number	115876
Journal	Signal Processing: Image Communication
Volume	86
DOIs	https://doi.org/10.1016/j.image.2020.115876
Publication status	Published - Aug 2020

Keywords

Background clutters
Pedestrian detection
Person search
Re-identification
Segmentation masks

Cite this

@article{e9d4083a20704eecadb73dc82ce48f57,
  title = "Segmentation mask guided end-to-end person search",
  abstract = "Person search aims to find a target person among images recorded by multiple surveillance cameras, and it faces challenges from both pedestrian detection and person re-identification. Besides the large intra-class variations caused by differing illumination conditions, occlusions and poses, background clutter in the detected pedestrian bounding boxes further degrades the features extracted for each person, making them less discriminative. To tackle these problems, we develop a novel approach that guides the network with segmentation masks so that it learns discriminative features that are invariant to background clutter. We show that joint optimization of pedestrian detection, person re-identification and pedestrian segmentation produces more discriminative pedestrian features and consequently leads to better person search performance. Extensive experiments on two widely used benchmark datasets demonstrate the superiority of our approach. In particular, our proposed model achieves state-of-the-art performance (86.3% mAP and 86.5% top-1 accuracy) on the CUHK-SYSU dataset.",
  keywords = "Background clutters, Pedestrian detection, Person search, Re-identification, Segmentation masks",
  author = "Dingyuan Zheng and Jimin Xiao and Kaizhu Huang and Yao Zhao",
  note = "Publisher Copyright: {\textcopyright} 2020 Elsevier B.V.",
  year = "2020",
  month = aug,
  doi = "10.1016/j.image.2020.115876",
  language = "English",
  volume = "86",
  journal = "Signal Processing: Image Communication",
  issn = "0923-5965",
}

TY  - JOUR
T1  - Segmentation mask guided end-to-end person search
AU  - Zheng, Dingyuan
AU  - Xiao, Jimin
AU  - Huang, Kaizhu
AU  - Zhao, Yao
PY  - 2020/8
Y1  - 2020/8
N2  - Person search aims to find a target person among images recorded by multiple surveillance cameras, and it faces challenges from both pedestrian detection and person re-identification. Besides the large intra-class variations caused by differing illumination conditions, occlusions and poses, background clutter in the detected pedestrian bounding boxes further degrades the features extracted for each person, making them less discriminative. To tackle these problems, we develop a novel approach that guides the network with segmentation masks so that it learns discriminative features that are invariant to background clutter. We show that joint optimization of pedestrian detection, person re-identification and pedestrian segmentation produces more discriminative pedestrian features and consequently leads to better person search performance. Extensive experiments on two widely used benchmark datasets demonstrate the superiority of our approach. In particular, our proposed model achieves state-of-the-art performance (86.3% mAP and 86.5% top-1 accuracy) on the CUHK-SYSU dataset.
KW  - Background clutters
KW  - Pedestrian detection
KW  - Person search
KW  - Re-identification
KW  - Segmentation masks
UR  - http://www.scopus.com/inward/record.url?scp=85084656377&partnerID=8YFLogxK
U2  - 10.1016/j.image.2020.115876
DO  - 10.1016/j.image.2020.115876
M3  - Article
AN  - SCOPUS:85084656377
SN  - 0923-5965
VL  - 86
JO  - Signal Processing: Image Communication
JF  - Signal Processing: Image Communication
M1  - 115876
ER  - 
