Focal Loss and Double-edge-triggered Detector for Robust Small-footprint Keyword Spotting

Bin Liu, Shuai Nie, Yaping Zhang, Shan Liang, Zhanlei Yang, Wenju Liu

Research output: Chapter in Book or Report/Conference proceeding; Conference Proceeding; peer-review

13 Citations (Scopus)

Abstract

A keyword spotting (KWS) system is a critical component of human-computer interfaces: it detects a specific keyword in a continuous stream of audio. The goal of KWS is to provide high detection accuracy at a low false alarm rate with small memory and computation requirements. A DNN-based KWS system faces a large class imbalance during training because far less data is usually available for the keyword than for background speech; the background samples overwhelm training and lead to a degenerate model. In this paper, we explore focal loss for training a small-footprint KWS system. Focal loss automatically down-weights the contribution of easy samples during training and focuses the model on hard samples, which naturally addresses the class imbalance and allows us to use all available data efficiently. Furthermore, many keywords of Chinese conversational assistants are repeated words due to idiomatic usage, such as 'XIAO DU XIAO DU'. We propose a double-edge-triggered detecting method for repeated keywords, which significantly reduces the false alarm rate relative to the single-threshold method. Systematic experiments demonstrate significant improvements over the baseline system.
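
The down-weighting of easy samples described in the abstract can be illustrated with the commonly used binary form of focal loss, FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t). The sketch below is not taken from the paper: the function names, the per-frame binary (keyword vs. background) framing, and the defaults gamma = 2.0 and alpha = 0.25 are assumptions chosen for illustration, and the authors' exact formulation and hyperparameters may differ.

```python
import math


def cross_entropy(p_keyword, label):
    """Plain binary cross-entropy for one frame (illustrative helper).

    p_keyword: predicted probability that the frame belongs to the keyword.
    label: 1 for a keyword frame, 0 for background speech.
    """
    eps = 1e-12
    p_t = p_keyword if label == 1 else 1.0 - p_keyword
    p_t = min(max(p_t, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -math.log(p_t)


def focal_loss(p_keyword, label, gamma=2.0, alpha=0.25):
    """Binary focal loss for one frame (hypothetical sketch).

    gamma and alpha are assumed defaults, not values reported in the paper.
    The factor (1 - p_t) ** gamma shrinks the loss of well-classified
    (easy) frames, so abundant background frames contribute less.
    """
    eps = 1e-12
    p_t = p_keyword if label == 1 else 1.0 - p_keyword
    alpha_t = alpha if label == 1 else 1.0 - alpha
    p_t = min(max(p_t, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)
```

For example, comparing `focal_loss(0.01, 0)` (an easy background frame) with `focal_loss(0.3, 1)` (a hard keyword frame) shows that the easy frame's loss is scaled down far more strongly relative to `cross_entropy` than the hard frame's. Setting `gamma=0.0` and `alpha=1.0` recovers plain cross-entropy for keyword frames.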

Original language: English
Title of host publication: 2019 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2019 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 6361-6365
Number of pages: 5
ISBN (Electronic): 9781479981311
Publication status: Published - 2019
Externally published: Yes
Event: 44th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2019 - Brighton, United Kingdom
Duration: 12 May 2019 to 17 May 2019

Publication series

Name: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Volume: 2019-May
ISSN (Print): 1520-6149

Conference

Conference: 44th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2019
Country/Territory: United Kingdom
City: Brighton
Period: 12/05/19 to 17/05/19

Keywords

  • double-edge-triggered detecting method
  • focal loss
  • keyword spotting
  • speech recognition
