StableTTS: Towards Efficient Denoising Acoustic Decoder for Text to Speech Synthesis with Consistency Flow Matching

  • Zhiyong Chen*
  • Xinnuo Li*
  • Shuhang Wu
  • Zhi Yang
  • Zhiqi Ai
  • Shugong Xu

*Corresponding author for this work

Research output: Chapter in Book or Report/Conference proceeding; Conference Proceeding; peer-review

Abstract

Current state-of-the-art text-to-speech (TTS) systems predominantly use denoising-based acoustic decoders, paired either with large language models (LLMs) or with non-autoregressive front-ends, because these decoders are known for their superior performance in generating high-fidelity spectrograms. In this study, we introduce an efficient TTS system that incorporates Consistency Flow Matching denoising training. This training approach significantly improves the training efficiency and operational performance of denoising-based acoustic decoders in existing TTS or voice conversion systems at no additional training cost: a free lunch. To compare efficiently with other denoising strategies, we follow the latest advances in non-autoregressive TTS implementations and build an efficient DiT-based TTS architecture. Our comprehensive evaluations against various denoising-based methods affirm the efficiency of the proposed system.

Original language: English
Title of host publication: 2025 IEEE International Conference on Acoustics, Speech, and Signal Processing Workshops, ICASSPW 2025 - Workshop Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9798331519315
Publication status: Published - 2025
Externally published: Yes
Event: 2025 IEEE International Conference on Acoustics, Speech, and Signal Processing Workshops, ICASSPW 2025 - Hyderabad, India
Duration: 6 Apr 2025 – 11 Apr 2025

Publication series

Name: 2025 IEEE International Conference on Acoustics, Speech, and Signal Processing Workshops, ICASSPW 2025 - Workshop Proceedings

Conference

Conference: 2025 IEEE International Conference on Acoustics, Speech, and Signal Processing Workshops, ICASSPW 2025
Country/Territory: India
City: Hyderabad
Period: 6/04/25 – 11/04/25

