Interference robust DOA estimation of human speech by exploiting historical information and temporal correlation

Wei Xue, Shan Liang, Wenju Liu

Research output: Contribution to journal › Conference article › peer-review

5 Citations (Scopus)

Abstract

Although various DOA estimation methods for human speech have been proposed, most of them assume that the noise received by different microphones is non-directional. In practical scenarios, however, strong directional interference is often present, and the performance of existing methods degrades seriously in such cases. In this paper, we present a new interference-robust DOA estimation method for human speech that exploits historical information and temporal correlation. First, using the historical DOA estimates, we perform "post-beamforming" in the last frame to suppress the directional interference. Then, exploiting the temporal correlation of speech spectra, we calculate frequency weights that emphasize speech-dominated frequency bins, based on the estimated a priori SNR of the enhanced signal. Finally, we propose a new DOA cost function that uses a frequency-weighted spatial correlation matrix to estimate the DOA of the speech source. Experimental results show that the proposed method outperforms existing algorithms in reverberant environments with additive white Gaussian noise and different kinds of interference.
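The abstract does not give the exact formulas, so the following is only a minimal sketch of the last two steps under stated assumptions. It assumes a far-field source and a uniform linear array, uses the standard decision-directed rule as one possible a priori SNR estimator, and uses a Wiener-style gain as a hypothetical frequency weight. The "post-beamforming" step is omitted, and all function names (`steering`, `decision_directed_snr`, `wiener_weights`, `weighted_doa`) are illustrative and do not come from the paper.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value


def steering(freqs_hz, mic_pos_m, angle_rad):
    """Far-field steering vectors for a linear array, shape (n_bins, n_mics)."""
    delays = mic_pos_m * np.cos(angle_rad) / SPEED_OF_SOUND
    return np.exp(-2j * np.pi * freqs_hz[:, None] * delays[None, :])


def decision_directed_snr(noisy_power, noise_power, prev_clean_power, alpha=0.98):
    """Decision-directed a priori SNR estimate per frequency bin (assumed estimator)."""
    post_snr = noisy_power / noise_power
    return (alpha * prev_clean_power / noise_power
            + (1.0 - alpha) * np.maximum(post_snr - 1.0, 0.0))


def wiener_weights(prior_snr):
    """Hypothetical weight rule: Wiener-style gain, near 1 for speech-dominated bins."""
    return prior_snr / (1.0 + prior_snr)


def weighted_doa(X, freqs_hz, mic_pos_m, weights, grid_rad):
    """Return the grid angle that maximizes the frequency-weighted steered power.

    X has shape (n_frames, n_bins, n_mics) and holds STFT coefficients.
    """
    n_frames = X.shape[0]
    # Spatial correlation matrix per bin: R[k] = mean over frames of x x^H.
    R = np.einsum("tkm,tkn->kmn", X, X.conj()) / n_frames
    costs = []
    for angle in grid_rad:
        A = steering(freqs_hz, mic_pos_m, angle)
        # a^H R a for every bin, then a weighted sum over bins.
        power = np.einsum("km,kmn,kn->k", A.conj(), R, A).real
        costs.append(np.sum(weights * power))
    return grid_rad[int(np.argmax(costs))]
```

In this sketch, bins with a low estimated a priori SNR contribute little to the cost, so interference-dominated or noise-dominated bins have less influence on the selected angle. The grid search is chosen for clarity, not efficiency.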

Original language: English
Pages (from-to): 2895-2899
Number of pages: 5
Journal: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
Publication status: Published - 2013
Externally published: Yes
Event: 14th Annual Conference of the International Speech Communication Association, INTERSPEECH 2013 - Lyon, France
Duration: 25 Aug 2013 to 29 Aug 2013

Keywords

  • Direction of arrival estimation
  • Directional noise
  • Microphone array signal processing
