TRCA: Text Restoration for Chinese ASR with BERT

Xing Wu*, Yuan Zhang, Jianjia Wang, Yike Guo

*Corresponding author for this work

Research output: Chapter in Book or Report/Conference proceeding > Conference Proceeding > peer-review

Abstract

Text restoration, which comprises punctuation prediction and error correction, plays a vital role in Chinese automatic speech recognition (ASR). However, the task faces two challenges. First, there is no public dataset or model for Chinese punctuation prediction. Second, current text restoration methods for ASR focus only on Chinese error correction and do not combine it with Chinese punctuation prediction. To address these problems, a BERT-based text restoration method for Chinese ASR called TRCA is proposed, consisting of a Chinese punctuation prediction model and a Chinese error correction model. Experiments demonstrate that TRCA outperforms state-of-the-art methods on both punctuation prediction and error correction, and that it raises the average accuracy of Chinese punctuation prediction to 98%.

Original language: English
Title of host publication: New Trends in Intelligent Software Methodologies, Tools and Techniques - Proceedings of the 21st International Conference on New Trends in Intelligent Software Methodologies, Tools and Techniques, SoMeT 2022
Editors: Hamido Fujita, Yutaka Watanobe, Takuya Azumi
Publisher: IOS Press BV
Pages: 661-668
Number of pages: 8
ISBN (Electronic): 9781643683164
Publication status: Published - 14 Sept 2022
Externally published: Yes
Event: 21st International Conference on New Trends in Intelligent Software Methodologies, Tools and Techniques, SoMeT 2022 - Kitakyushu, Japan
Duration: 20 Sept 2022 to 22 Sept 2022

Publication series

Name: Frontiers in Artificial Intelligence and Applications
Volume: 355
ISSN (Print): 0922-6389
ISSN (Electronic): 1879-8314

Conference

Conference: 21st International Conference on New Trends in Intelligent Software Methodologies, Tools and Techniques, SoMeT 2022
Country/Territory: Japan
City: Kitakyushu
Period: 20/09/22 to 22/09/22

Keywords

  • BERT
  • Chinese ASR text restoration
  • Chinese error correction
  • Chinese punctuation prediction
