TY - JOUR
T1 - Triple loss for hard face detection
AU - Fang, Zhenyu
AU - Ren, Jinchang
AU - Marshall, Stephen
AU - Zhao, Huimin
AU - Wang, Zheng
AU - Huang, Kaizhu
AU - Xiao, Bing
N1 - Publisher Copyright: © 2020 Elsevier B.V.
PY - 2020/7/20
Y1 - 2020/7/20
N2 - Although face detection has been well addressed in recent decades, effective detection of small, blurred and partially occluded faces in the wild remains a challenging task. Meanwhile, the trade-off between computational cost and accuracy is also an open research problem in this context. To tackle these challenges, this paper proposes a novel context-enhanced approach with structural optimization and loss function optimization. For loss function optimization, we introduce a hierarchical loss, referred to as "triple loss" in this paper, to optimize the feature pyramid network (FPN) (Lin et al., 2017) based face detector. Additional layers are applied only during the training process. As a result, the computational cost during inference is the same as that of FPN. For structural optimization, we propose a context-sensitive structure that increases the capacity of the prediction network to improve the accuracy of the output. In detail, a feature fusion module based on a three-branch inception subnet (Szegedy et al., 2015) is employed to refine the original FPN without significantly increasing the computational cost, further improving the low-level semantic information, which is originally extracted from a single convolutional layer in the backward pathway of FPN. The proposed approach is evaluated on two publicly available face detection benchmarks, FDDB and WIDER FACE. Using a VGG-16 based detector, experimental results indicate that the proposed method achieves a good balance between the accuracy and computational cost of face detection.
AB - Although face detection has been well addressed in recent decades, effective detection of small, blurred and partially occluded faces in the wild remains a challenging task. Meanwhile, the trade-off between computational cost and accuracy is also an open research problem in this context. To tackle these challenges, this paper proposes a novel context-enhanced approach with structural optimization and loss function optimization. For loss function optimization, we introduce a hierarchical loss, referred to as "triple loss" in this paper, to optimize the feature pyramid network (FPN) (Lin et al., 2017) based face detector. Additional layers are applied only during the training process. As a result, the computational cost during inference is the same as that of FPN. For structural optimization, we propose a context-sensitive structure that increases the capacity of the prediction network to improve the accuracy of the output. In detail, a feature fusion module based on a three-branch inception subnet (Szegedy et al., 2015) is employed to refine the original FPN without significantly increasing the computational cost, further improving the low-level semantic information, which is originally extracted from a single convolutional layer in the backward pathway of FPN. The proposed approach is evaluated on two publicly available face detection benchmarks, FDDB and WIDER FACE. Using a VGG-16 based detector, experimental results indicate that the proposed method achieves a good balance between the accuracy and computational cost of face detection.
KW - Efficiency-accuracy balance
KW - Face detection
KW - Face feature fusion
KW - Single shot detection
KW - Small face
UR - http://www.scopus.com/inward/record.url?scp=85080134640&partnerID=8YFLogxK
U2 - 10.1016/j.neucom.2020.02.060
DO - 10.1016/j.neucom.2020.02.060
M3 - Article
AN - SCOPUS:85080134640
SN - 0925-2312
VL - 398
SP - 20
EP - 30
JO - Neurocomputing
JF - Neurocomputing
ER -