This paper addresses the problem of single-channel speech enhancement in stationary and non-stationary environments, using an approach based on the masking properties of the human auditory system. The proposed algorithm overcomes the deficiencies of conventional speech enhancement algorithms, which are effective only in stationary environments and leave a high level of musical residual noise. During estimation of the speech power spectrum, the parameters of the estimator are adjusted according to the minimum mean-square error (MMSE) and the masking threshold of the speech; in this way, we can find the best trade-off among the amount of noise reduction, the speech distortion, and the level of musical residual noise. To better track variations in the environment, the minimum statistics method is used for noise power spectrum estimation. Objective and subjective evaluations of the proposed algorithm are performed with several noise types from the Noisex-92 database that have different time-frequency distributions. The evaluations confirm that speech enhanced by the proposed algorithm is more pleasant to a human listener under all tested noise conditions.
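As an illustration of the noise-tracking idea only, the following is a minimal sketch of a simplified minimum-statistics estimator for a single frequency bin. It is not the implementation used in this paper: the function name `track_noise_floor`, the smoothing constant `alpha`, and the window length `window` are assumptions chosen for the example, and the bias compensation and adaptive smoothing used in full minimum-statistics methods are omitted.

```python
from collections import deque


def track_noise_floor(power_frames, alpha=0.85, window=8):
    """Simplified minimum-statistics noise tracker for one frequency bin.

    power_frames: non-negative periodogram values, one per frame.
    alpha: recursive smoothing constant (assumed value, not from the paper).
    window: number of recent frames searched for the minimum (assumed value).
    Returns a list with one noise power estimate per frame.
    """
    recent = deque(maxlen=window)  # smoothed powers of the last `window` frames
    smoothed = None
    estimates = []
    for p in power_frames:
        # First-order recursive smoothing of the noisy power spectrum.
        if smoothed is None:
            smoothed = p
        else:
            smoothed = alpha * smoothed + (1.0 - alpha) * p
        recent.append(smoothed)
        # The noise floor is taken as the minimum over the sliding window,
        # so short bursts of speech energy do not raise the estimate.
        estimates.append(min(recent))
    return estimates


# A short high-energy burst (standing in for speech) on a constant noise floor.
frames = [1.0] * 5 + [10.0] * 3 + [1.0] * 4
noise = track_noise_floor(frames)
```

In this sketch the estimate stays near the noise floor throughout the burst, because the window is longer than the burst. A longer window gives more robustness to speech activity, while a shorter one reacts faster to changes in the noise level.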
Journal of Signal Processing