Masking-based Neural Beamformer for Multichannel Speech Enhancement

Shuai Nie, Shan Liang, Zhanlei Yang, Longshuai Xiao, Wenju Liu, Jianhua Tao

Research output: Chapter in Book or Report/Conference proceeding › Conference Proceeding › peer-review

1 Citation (Scopus)

Abstract

Masking and beamforming techniques have shown considerable promise for multichannel speech enhancement. Masking can significantly reduce noise, but it inevitably introduces speech distortion, especially in far-field reverberant environments. Beamforming, in contrast, can effectively avoid speech distortion and performs very well in reverberant conditions. A masking-based beamforming scheme is therefore a natural way to combine the two. However, most methods use fixed or adaptive beamformers as spatial filters: fixed beamformers are designed in advance under a particular sound-field assumption and have limited noise reduction ability, while adaptive beamformers require a complex matrix inversion at each frequency, which brings high computational complexity and instability. In this paper, we propose a fully learnable masking neural beamformer that jointly models masking and beamforming in a data-driven manner. Mask prediction and the neural beamformer are jointly optimized with spectrum and waveform approximation objectives. To improve directional discrimination in reverberant and diffuse-noise environments, we further propose using a pair of complementary fixed beamformers to extract a directional coherence feature (DCF) for mask prediction. Systematic experiments demonstrate that the proposed approach is competitive with available methods in terms of speech enhancement and speech recognition.
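For orientation, the conventional mask-based adaptive beamformer that the abstract contrasts against can be sketched as follows. This is not the proposed neural beamformer. It is a minimal NumPy illustration of a common mask-driven MVDR formulation, included only to show where the per-frequency matrix inversion arises. The function names, array layouts, reference-channel choice, and diagonal-loading constant are all assumptions made for this sketch, and the masks are assumed to come from some separate mask estimator.

```python
import numpy as np


def masked_covariance(Y, mask, eps=1e-8):
    """Mask-weighted spatial covariance per frequency (illustrative helper).

    Y:    complex STFT, shape (C, T, F) = (channels, frames, frequencies)
    mask: real time-frequency mask in [0, 1], shape (T, F)
    Returns an array of shape (F, C, C).
    """
    # Phi[f, c, d] = sum_t mask[t, f] * Y[c, t, f] * conj(Y[d, t, f])
    num = np.einsum("tf,ctf,dtf->fcd", mask, Y, Y.conj())
    return num / (mask.sum(axis=0)[:, None, None] + eps)


def mvdr_weights(phi_s, phi_n, ref=0, diag_load=1e-6):
    """MVDR weights w(f) = (Phi_n^-1 Phi_s) u_ref / trace(Phi_n^-1 Phi_s).

    phi_s, phi_n: speech and noise covariances, shape (F, C, C)
    Returns weights of shape (F, C).
    """
    C = phi_n.shape[-1]
    # Diagonal loading (an assumed stabilizer): an ill-conditioned Phi_n
    # is one source of the instability noted in the abstract.
    scale = np.trace(phi_n, axis1=-2, axis2=-1).real[:, None, None] / C
    phi_n = phi_n + diag_load * scale * np.eye(C)
    # One C x C linear solve per frequency bin: this is the
    # per-frequency matrix inversion step.
    numerator = np.linalg.solve(phi_n, phi_s)            # (F, C, C)
    trace = np.trace(numerator, axis1=-2, axis2=-1)      # (F,)
    return numerator[..., ref] / (trace[:, None] + 1e-8)


def apply_beamformer(w, Y):
    """Apply X[t, f] = w(f)^H Y[:, t, f]; returns shape (T, F)."""
    return np.einsum("fc,ctf->tf", w.conj(), Y)
```

In this sketch the speech and noise covariances would be obtained by calling `masked_covariance` with a speech mask and a noise mask respectively, and the cost of `mvdr_weights` grows with both the number of frequency bins and the cube of the channel count.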

Original language: English
Title of host publication: 2022 13th International Symposium on Chinese Spoken Language Processing, ISCSLP 2022
Editors: Kong Aik Lee, Hung-yi Lee, Yanfeng Lu, Minghui Dong
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 125-129
Number of pages: 5
ISBN (Electronic): 9798350397963
DOIs
Publication status: Published - 2022
Externally published: Yes
Event: 13th International Symposium on Chinese Spoken Language Processing, ISCSLP 2022 - Singapore, Singapore
Duration: 11 Dec 2022 to 14 Dec 2022

Publication series

Name: 2022 13th International Symposium on Chinese Spoken Language Processing, ISCSLP 2022

Conference

Conference: 13th International Symposium on Chinese Spoken Language Processing, ISCSLP 2022
Country/Territory: Singapore
City: Singapore
Period: 11/12/22 to 14/12/22

Keywords

  • directional coherence features
  • masking neural beamformer
  • multichannel speech enhancement
