Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 61471014 and 61231015).
Abstract: Many speech enhancement algorithms for noise reduction rely on a binary masking decision (termed the hard decision), which may cause some regions of the synthesized speech to be discarded. To address this problem, a soft decision is often used as an optimal technique for speech restoration. In this paper, using a new formulation of the speech and noise models, we present two model-based soft decision techniques. The first technique estimates a ratio mask generated by the exact Bayesian estimators of speech and noise. The second addresses the issue that a local criterion (LC) that is optimal at one SNR may not be appropriate at other SNRs, so we estimate a probabilistic mask with a variable LC. Experimental results show that the proposed method achieves better speech quality than the reference methods.
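The difference between the hard and soft decisions described above can be illustrated with a minimal sketch. It is not the paper's method: it assumes the speech and noise powers of each time-frequency unit are already known, whereas the paper estimates them with Bayesian estimators, and the names `binary_mask`, `ratio_mask`, and `lc_db` are hypothetical, chosen only for this illustration.

```python
import numpy as np

EPS = 1e-12  # guards against division by zero and log of zero


def binary_mask(speech_pow, noise_pow, lc_db=0.0):
    """Hard decision: keep a unit only if its local SNR (dB) exceeds lc_db."""
    snr_db = 10.0 * np.log10((speech_pow + EPS) / (noise_pow + EPS))
    return (snr_db > lc_db).astype(float)


def ratio_mask(speech_pow, noise_pow):
    """Soft decision: fraction of each unit's power attributed to speech."""
    return speech_pow / (speech_pow + noise_pow + EPS)


# Toy powers for three time-frequency units (assumed values, not paper data).
speech_pow = np.array([4.0, 0.5, 0.25])
noise_pow = np.array([1.0, 1.0, 1.0])

hard = binary_mask(speech_pow, noise_pow)  # units below the LC are zeroed
soft = ratio_mask(speech_pow, noise_pow)   # every unit keeps a gain in (0, 1)
```

In this toy case the hard mask discards the two low-SNR units entirely, while the ratio mask only attenuates them. Raising or lowering `lc_db` changes which units the hard mask keeps, which is why a single fixed LC may not suit every SNR.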
Funding: Supported by the National Natural Science Foundation of China (Grant No. 61831019).
Abstract: Generalized cross-correlation (GCC) is regarded as the most straightforward time delay estimation algorithm. Different weighting functions yield different methods, among which the phase transform (PHAT) has been widely used. PHAT is well known for its robustness to reverberation and its sensitivity to noise, the latter partly because PHAT assigns the same weight to every frequency, whether it is dominated by signal or by noise. To alleviate this problem, two weighting functions are proposed in this paper. The a posteriori signal-to-noise ratio (SNR) is used to classify frequencies as reliable or unreliable, so that different weights can be assigned. The first proposed weighting function borrows the idea of the binary mask and assigns the same weight to all frequencies in the same set, whereas the second assigns weights based on the coherence function. Experiments evaluated with various criteria showed that the proposed methods are robust to reverberation and noise and improve time delay estimation performance.
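For orientation, the baseline that the abstract builds on can be sketched as follows. This is a minimal GCC-PHAT example using NumPy, not the paper's proposed weighting functions: the function name `gcc_phat`, the parameter `fs`, and the test signal are assumptions made for this illustration. The PHAT step divides the cross-spectrum by its magnitude, so every frequency bin gets the same weight regardless of whether signal or noise dominates it.

```python
import numpy as np


def gcc_phat(x, y, fs=1.0):
    """Estimate the delay of y relative to x (in seconds) with GCC-PHAT."""
    n = len(x) + len(y)  # zero-pad so the correlation is linear, not circular
    X = np.fft.rfft(x, n=n)
    Y = np.fft.rfft(y, n=n)
    cross = Y * np.conj(X)
    # PHAT weighting: keep only the phase, giving every bin the same weight.
    cross /= np.abs(cross) + 1e-12
    cc = np.fft.irfft(cross, n=n)
    max_shift = n // 2
    # Reorder so index 0 holds lag -max_shift and index max_shift holds lag 0.
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    lag = int(np.argmax(cc)) - max_shift
    return lag / fs


# Synthetic check: y is x delayed by 7 samples (assumed test signal).
rng = np.random.default_rng(0)
x = rng.standard_normal(1024)
true_delay = 7
y = np.concatenate((np.zeros(true_delay), x[:-true_delay]))
estimated = gcc_phat(x, y)
```

A weighting scheme of the kind the abstract describes would replace the uniform normalization line with per-frequency weights, for example larger weights for bins classified as reliable by their a posteriori SNR. The exact form of those weights is not given in the abstract, so it is left out here.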