TY - JOUR
T1 - ERDBF
T2 - Embedding-Regularized Double Branches Fusion for Multi-Modal Age Estimation
AU - Wu, Bo
AU - Lu, Hengjie
AU - Chen, Zhiyong
AU - Zhu, Congcong
AU - Xu, Shugong
N1 - Publisher Copyright:
© 2013 IEEE.
PY - 2023
Y1 - 2023
N2 - Human age information is present in both face images and speech, but most age estimation methods focus on single-modal data only. Although multi-modal approaches have achieved promising performance in other fields by leveraging the complementarity between modalities, age estimation has benefited little from them because of the missing-modality problem in age data. In this paper, we propose an Embedding-Regularized Double Branches Fusion (ERDBF) framework that can handle single-modal and multi-modal age estimation simultaneously. To address missing modalities, we design a double branches fusion network that consists of an embedding regularization module and an information interaction module. The former enhances the representational capacity of single-modal features; the latter learns inter-modal complementary information. Through their collaboration in the fusion process, the network can extract discriminative age representations, and even if a modality is missing, robust age estimation can still be achieved using the enhanced single-modal information. To the best of our knowledge, the proposed framework is the first deep learning-based multi-modal age estimation framework that can readily leverage advances made in single-modal age estimation, giving it broader applicability. Experimental results show that our best method achieves state-of-the-art results on AgeVoxCeleb. The code is available at https://github.com/Daretowin/ERDBF.
AB - Human age information is present in both face images and speech, but most age estimation methods focus on single-modal data only. Although multi-modal approaches have achieved promising performance in other fields by leveraging the complementarity between modalities, age estimation has benefited little from them because of the missing-modality problem in age data. In this paper, we propose an Embedding-Regularized Double Branches Fusion (ERDBF) framework that can handle single-modal and multi-modal age estimation simultaneously. To address missing modalities, we design a double branches fusion network that consists of an embedding regularization module and an information interaction module. The former enhances the representational capacity of single-modal features; the latter learns inter-modal complementary information. Through their collaboration in the fusion process, the network can extract discriminative age representations, and even if a modality is missing, robust age estimation can still be achieved using the enhanced single-modal information. To the best of our knowledge, the proposed framework is the first deep learning-based multi-modal age estimation framework that can readily leverage advances made in single-modal age estimation, giving it broader applicability. Experimental results show that our best method achieves state-of-the-art results on AgeVoxCeleb. The code is available at https://github.com/Daretowin/ERDBF.
KW - age estimation
KW - double branches fusion network
KW - embedding regularization
KW - modality missing
KW - Multi-modal fusion
UR - http://www.scopus.com/inward/record.url?scp=85160684547&partnerID=8YFLogxK
U2 - 10.1109/ACCESS.2023.3275765
DO - 10.1109/ACCESS.2023.3275765
M3 - Article
AN - SCOPUS:85160684547
SN - 2169-3536
VL - 11
SP - 47608
EP - 47618
JO - IEEE Access
JF - IEEE Access
ER -