TY - GEN
T1 - A new noise-tracking algorithm for generalizing binary time-frequency (T-F) masking to ratio masking
AU - Liang, Shan
AU - Jiang, Wei
AU - Liu, Wenju
PY - 2012
Y1 - 2012
N2 - In this paper, we attempt to generalize ideal binary mask (IBM) estimation to ideal ratio mask (IRM) estimation. Under binary masking, errors in IBM estimation may greatly distort the original speech spectrum. The main purpose of this paper is to use a ratio mask to smooth this negative impact. Since the key issue is noise tracking, we first use exponential distributions to model the distribution of noise power conditioned on the binary mask and the mixture power. Then, we use a Gaussian distribution to model the correlation of noise estimates between adjacent T-F units. Because the IBM can be estimated correctly for the majority of units, the correlation model can reduce the impact of errors in IBM estimation. Systematic experiments show that our algorithm outperforms a common binary-masking-based method in terms of SNR gain and PESQ scores.
AB - In this paper, we attempt to generalize ideal binary mask (IBM) estimation to ideal ratio mask (IRM) estimation. Under binary masking, errors in IBM estimation may greatly distort the original speech spectrum. The main purpose of this paper is to use a ratio mask to smooth this negative impact. Since the key issue is noise tracking, we first use exponential distributions to model the distribution of noise power conditioned on the binary mask and the mixture power. Then, we use a Gaussian distribution to model the correlation of noise estimates between adjacent T-F units. Because the IBM can be estimated correctly for the majority of units, the correlation model can reduce the impact of errors in IBM estimation. Systematic experiments show that our algorithm outperforms a common binary-masking-based method in terms of SNR gain and PESQ scores.
KW - Bayesian rule
KW - Ideal binary mask
KW - Ideal ratio mask
KW - Markov chain Monte Carlo
UR - http://www.scopus.com/inward/record.url?scp=84878388262&partnerID=8YFLogxK
M3 - Conference Proceeding
AN - SCOPUS:84878388262
SN - 9781622767595
T3 - 13th Annual Conference of the International Speech Communication Association 2012, INTERSPEECH 2012
SP - 950
EP - 953
BT - 13th Annual Conference of the International Speech Communication Association 2012, INTERSPEECH 2012
T2 - 13th Annual Conference of the International Speech Communication Association 2012, INTERSPEECH 2012
Y2 - 9 September 2012 through 13 September 2012
ER -