Learning domain-heterogeneous speaker recognition systems with personalized continual federated learning

Zhiyong Chen; Shugong Xu

doi:10.1186/s13636-023-00299-2

Zhiyong Chen, Shugong Xu*

*Corresponding author for this work

Shanghai University

Research output: Contribution to journal › Article › peer-review

2 Citations (Scopus)

Abstract

Speaker recognition, the process of automatically identifying a speaker based on individual characteristics in speech signals, presents significant challenges when addressing heterogeneous-domain conditions. Federated learning, a recent development in machine learning methods, has gained traction in privacy-sensitive tasks, such as personal voice assistants in home environments. However, its application in heterogeneous multi-domain scenarios for enhancing system customization remains underexplored. In this paper, we propose the utilization of federated learning in heterogeneous situations to enable adaptation across multiple domains. We also introduce a personalized federated learning algorithm designed to effectively leverage limited domain data, resulting in improved learning outcomes. Furthermore, we present a strategy for implementing the federated learning algorithm in practical, real-world continual learning scenarios, demonstrating promising results. The proposed federated learning method exhibits superior performance across a range of synthesized complex conditions and continual learning settings, compared to conventional training methods.

Original language	English
Article number	33
Journal	EURASIP Journal on Audio, Speech, and Music Processing
Volume	2023
Issue number	1
DOIs	https://doi.org/10.1186/s13636-023-00299-2
Publication status	Published - Dec 2023
Externally published	Yes

Keywords

Continual learning
Domain adaptation
Federated learning
Speaker recognition

Cite this

@article{98a9d5e461ae4b9e882309ac8c44e5f3,

title = "Learning domain-heterogeneous speaker recognition systems with personalized continual federated learning",

abstract = "Speaker recognition, the process of automatically identifying a speaker based on individual characteristics in speech signals, presents significant challenges when addressing heterogeneous-domain conditions. Federated learning, a recent development in machine learning methods, has gained traction in privacy-sensitive tasks, such as personal voice assistants in home environments. However, its application in heterogeneous multi-domain scenarios for enhancing system customization remains underexplored. In this paper, we propose the utilization of federated learning in heterogeneous situations to enable adaptation across multiple domains. We also introduce a personalized federated learning algorithm designed to effectively leverage limited domain data, resulting in improved learning outcomes. Furthermore, we present a strategy for implementing the federated learning algorithm in practical, real-world continual learning scenarios, demonstrating promising results. The proposed federated learning method exhibits superior performance across a range of synthesized complex conditions and continual learning settings, compared to conventional training methods.",

keywords = "Continual learning, Domain adaptation, Federated learning, Speaker recognition",

author = "Zhiyong Chen and Shugong Xu",

note = "Publisher Copyright: {\textcopyright} 2023, Springer Nature Switzerland AG.",

year = "2023",

month = dec,

doi = "10.1186/s13636-023-00299-2",

language = "English",

volume = "2023",

journal = "EURASIP Journal on Audio, Speech, and Music Processing",

issn = "1687-4714",

number = "1",

}

TY - JOUR

T1 - Learning domain-heterogeneous speaker recognition systems with personalized continual federated learning

AU - Chen, Zhiyong

AU - Xu, Shugong

PY - 2023/12

Y1 - 2023/12

AB - Speaker recognition, the process of automatically identifying a speaker based on individual characteristics in speech signals, presents significant challenges when addressing heterogeneous-domain conditions. Federated learning, a recent development in machine learning methods, has gained traction in privacy-sensitive tasks, such as personal voice assistants in home environments. However, its application in heterogeneous multi-domain scenarios for enhancing system customization remains underexplored. In this paper, we propose the utilization of federated learning in heterogeneous situations to enable adaptation across multiple domains. We also introduce a personalized federated learning algorithm designed to effectively leverage limited domain data, resulting in improved learning outcomes. Furthermore, we present a strategy for implementing the federated learning algorithm in practical, real-world continual learning scenarios, demonstrating promising results. The proposed federated learning method exhibits superior performance across a range of synthesized complex conditions and continual learning settings, compared to conventional training methods.

KW - Continual learning

KW - Domain adaptation

KW - Federated learning

KW - Speaker recognition

UR - http://www.scopus.com/inward/record.url?scp=85169836357&partnerID=8YFLogxK

U2 - 10.1186/s13636-023-00299-2

DO - 10.1186/s13636-023-00299-2

M3 - Article

AN - SCOPUS:85169836357

SN - 1687-4714

VL - 2023

JO - EURASIP Journal on Audio, Speech, and Music Processing

JF - EURASIP Journal on Audio, Speech, and Music Processing

IS - 1

M1 - 33

ER -
