GWPF: Communication-Efficient Federated Learning with Gradient-Wise Parameter Freezing

Duo Yang, Yunqi Gao, Bing Hu*, A-Long Jin, Wei Wang, Yang You

*Corresponding author for this work

Research output: Contribution to journal, Article, peer-review

1 Citation (Scopus)

Abstract

The communication bottleneck is a critical challenge in federated learning. Parameter freezing, which uses fine-grained parameters as the aggregation objects, has emerged as a popular approach, but existing methods suffer from the lack of a thawing strategy, lag and inflexibility in the thawing process, and underutilization of frozen parameters’ updates. To address these challenges, we propose Gradient-Wise Parameter Freezing (GWPF), a mechanism that controls the frozen period of each parameter through parameter freezing and thawing strategies. GWPF globally freezes parameters with insignificant gradients and excludes them from global updates during the frozen period, reducing communication overhead and accelerating training. The thawing strategy, based on global decisions by the server in collaboration with clients, uses real-time feedback on the locally accumulated gradients of frozen parameters in each round to balance communication reduction against model accuracy. We provide a theoretical analysis and a convergence guarantee for non-convex objectives. Extensive experiments confirm that our mechanism achieves a speedup of up to 4.52 times in time-to-accuracy performance and reduces communication overhead by up to 48.73%. It also improves final model accuracy by up to 2.01% compared with APF, the fastest existing method. The code for GWPF is available at https://github.com/Dora233/GWPF.
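
The freeze/thaw cycle described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (see the linked repository for that). The function names, the two thresholds `freeze_eps` and `thaw_eps`, and the simple magnitude tests are all assumptions chosen for illustration, since the abstract does not state the actual freezing or thawing criteria.

```python
from typing import List


def select_frozen(global_grad: List[float], freeze_eps: float) -> List[bool]:
    """Assumed rule: freeze a parameter whose aggregated gradient magnitude is below freeze_eps."""
    return [abs(g) < freeze_eps for g in global_grad]


def aggregate_unfrozen(client_grads: List[List[float]], frozen: List[bool]) -> List[float]:
    """Average client gradients per parameter; frozen parameters are skipped and get a 0.0 update."""
    n = len(client_grads)
    return [
        0.0 if frozen[i] else sum(c[i] for c in client_grads) / n
        for i in range(len(frozen))
    ]


def thaw(frozen: List[bool], accumulated: List[List[float]], thaw_eps: float) -> List[bool]:
    """Assumed rule: a frozen parameter stays frozen only while the magnitude of the
    client-averaged, locally accumulated gradient remains below thaw_eps."""
    n = len(accumulated)
    return [
        f and abs(sum(a[i] for a in accumulated) / n) < thaw_eps
        for i, f in enumerate(frozen)
    ]


def values_sent_per_client(frozen: List[bool]) -> int:
    """Number of parameter values each client uploads in a round (unfrozen ones only)."""
    return sum(1 for f in frozen if not f)


# Toy round with two clients and three parameters (hypothetical numbers).
client_grads = [[0.5, 0.01, -0.3], [0.7, -0.01, -0.1]]
global_grad = [sum(c[i] for c in client_grads) / 2 for i in range(3)]

frozen = select_frozen(global_grad, freeze_eps=0.05)
update = aggregate_unfrozen(client_grads, frozen)

# While frozen, clients keep accumulating local gradients and report them as feedback.
small_feedback = [[0.0, 0.01, 0.0], [0.0, 0.02, 0.0]]
large_feedback = [[0.0, 0.2, 0.0], [0.0, 0.4, 0.0]]
still_frozen = thaw(frozen, small_feedback, thaw_eps=0.1)
after_thaw = thaw(frozen, large_feedback, thaw_eps=0.1)
```

In this toy round the middle parameter has a near-zero aggregated gradient, so it is frozen and each client uploads two values instead of three. It stays frozen under the small feedback and is thawed once the accumulated local gradients grow past the assumed threshold.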

Original language: English
Article number: 110886
Journal: Computer Networks
Volume: 255
Publication status: Published - Dec 2024

Keywords

  • Communication mitigation
  • Federated learning
  • Frozen period
  • Parameter freezing
  • Thawing strategy
