TY - JOUR
T1 - ERDBF
T2 - Embedding-Regularized Double Branches Fusion for Multi-Modal Age Estimation
AU - Wu, Bo
AU - Lu, Hengjie
AU - Chen, Zhiyong
AU - Zhu, Congcong
AU - Xu, Shugong
N1 - Publisher Copyright:
© 2013 IEEE.
PY - 2023
Y1 - 2023
N2 - Human age information is present in both face images and speech, but most age estimation methods focus on single-modal data only. Although multi-modal approaches have achieved promising performance in other fields by leveraging the complementarity between modalities, age estimation has benefited little from them because of the missing-modality problem in age data. In this paper, we propose an Embedding-Regularized Double Branches Fusion (ERDBF) framework that can handle single-modal and multi-modal age estimation simultaneously. To address missing modalities, we design a double branches fusion network that consists of an embedding regularization module and an information interaction module. The former enhances the representational capacity of single-modal features; the latter learns inter-modal complementary information. Through their collaboration in the fusion process, the network can extract discriminative age representations, and even if a modality is missing, robust age estimation can still be achieved using the enhanced single-modal information. To the best of our knowledge, the proposed framework is the first deep learning-based multi-modal age estimation framework that can readily leverage advances made in single-modal age estimation, giving it broader applicability. Experimental results show that our best method achieves state-of-the-art results on AgeVoxCeleb. The code is available at https://github.com/Daretowin/ERDBF.
AB - Human age information is present in both face images and speech, but most age estimation methods focus on single-modal data only. Although multi-modal approaches have achieved promising performance in other fields by leveraging the complementarity between modalities, age estimation has benefited little from them because of the missing-modality problem in age data. In this paper, we propose an Embedding-Regularized Double Branches Fusion (ERDBF) framework that can handle single-modal and multi-modal age estimation simultaneously. To address missing modalities, we design a double branches fusion network that consists of an embedding regularization module and an information interaction module. The former enhances the representational capacity of single-modal features; the latter learns inter-modal complementary information. Through their collaboration in the fusion process, the network can extract discriminative age representations, and even if a modality is missing, robust age estimation can still be achieved using the enhanced single-modal information. To the best of our knowledge, the proposed framework is the first deep learning-based multi-modal age estimation framework that can readily leverage advances made in single-modal age estimation, giving it broader applicability. Experimental results show that our best method achieves state-of-the-art results on AgeVoxCeleb. The code is available at https://github.com/Daretowin/ERDBF.
KW - age estimation
KW - double branches fusion network
KW - embedding regularization
KW - modality missing
KW - Multi-modal fusion
UR - http://www.scopus.com/inward/record.url?scp=85160684547&partnerID=8YFLogxK
U2 - 10.1109/ACCESS.2023.3275765
DO - 10.1109/ACCESS.2023.3275765
M3 - Article
AN - SCOPUS:85160684547
SN - 2169-3536
VL - 11
SP - 47608
EP - 47618
JO - IEEE Access
JF - IEEE Access
ER -