Abstract
Although a ratio mask can achieve better speech separation results than a binary mask, current speech separation systems usually adopt the Ideal Binary Mask (IBM) as the computational goal because the Ideal Ratio Mask (IRM) is very difficult to estimate directly. This paper proposes an algorithm that generalizes the binary mask to a ratio mask. Since the key issue in IRM estimation is noise tracking, we first model the noise power with an exponential distribution conditioned on the binary mask and the mixture power. We then use a Gaussian Markov Random Field (GMRF) to model the correlation between the noise estimates of adjacent units. Finally, we apply a Markov Chain Monte Carlo method to compute the minimum mean square error estimates of the noise power and the ratio mask. Systematic experiments show that the proposed algorithm outperforms a common binary-masking-based method in terms of SNR gain and PESQ scores.
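To make the distinction between the two masks concrete, the following is a minimal sketch and not the paper's implementation. It assumes the definitions commonly used in the literature: the IBM is 1 where the local SNR exceeds a local criterion (here the hypothetical parameter `lc_db`), and the IRM is the speech share of the total power in each unit. The helper `ratio_mask_from_noise_estimate` illustrates why noise tracking matters: under the additional assumption that mixture power is approximately speech power plus noise power, a noise-power estimate yields a ratio mask directly. All function names and the toy values are illustrative; the exponential model, the GMRF, and the MCMC step described in the abstract are not reproduced here.

```python
def ideal_binary_mask(speech_power, noise_power, lc_db=0.0):
    """Return 1.0 for units whose local SNR exceeds lc_db (in dB), else 0.0.

    Assumed definition, not taken from the paper.
    """
    threshold = 10.0 ** (lc_db / 10.0)  # dB criterion as a linear power ratio
    return [1.0 if s > threshold * n else 0.0
            for s, n in zip(speech_power, noise_power)]


def ideal_ratio_mask(speech_power, noise_power):
    """Return the speech fraction of total power per unit (assumed definition)."""
    return [s / (s + n) if (s + n) > 0 else 0.0
            for s, n in zip(speech_power, noise_power)]


def ratio_mask_from_noise_estimate(mixture_power, noise_estimate):
    """Derive a ratio mask from a noise-power estimate.

    Assumes mixture power is roughly speech power plus noise power,
    and clips the result to the range [0, 1].
    """
    return [min(1.0, max(0.0, 1.0 - n / y)) if y > 0 else 0.0
            for y, n in zip(mixture_power, noise_estimate)]


# Toy per-unit powers (hypothetical values, for illustration only).
speech = [4.0, 1.0, 0.5]
noise = [1.0, 1.0, 2.0]
mixture = [s + n for s, n in zip(speech, noise)]

ibm = ideal_binary_mask(speech, noise)
irm = ideal_ratio_mask(speech, noise)
est = ratio_mask_from_noise_estimate(mixture, noise)
```

With a perfect noise estimate, the mask from `ratio_mask_from_noise_estimate` coincides with the IRM under the additive-power assumption, whereas the binary mask discards the graded information in each unit.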
Original language | English |
---|---|
Pages (from-to) | 632-637 |
Number of pages | 6 |
Journal | Shengxue Xuebao/Acta Acustica |
Volume | 38 |
Issue number | 5 |
Publication status | Published - May 2013 |
Externally published | Yes |