Whispered speech enhancement using auditory masking model in modified Mel- domain and Speech Absence Probability (SAP) was proposed. In light of the phonation char- acteristic of whisper, we modify the Mel-frequency...Whispered speech enhancement using auditory masking model in modified Mel- domain and Speech Absence Probability (SAP) was proposed. In light of the phonation char- acteristic of whisper, we modify the Mel-frequency Scaling model. Whispered speech is filtered by the proposed model. Meanwhile, the value of masking threshold for each frequency band is dynamically determined by speech absence probability. Then whispered speech enhancement is conducted by adaptively rectifying the spectrum subtraction coefficients using different masking threshold values. Results of objective and subjective tests on the enhanced whispered signal show that compared with other methods; the proposed method can enhance whispered signal with better subjective auditory quality and less distortion by reducing the music noise and background noise under the masking threshold value.展开更多
The performance of classic Mel-frequency cepstral coefficients (MFCC) is unsatisfactory in noisy environment with different sound sources from nature. In this paper, a classification approach of the ecological environ...The performance of classic Mel-frequency cepstral coefficients (MFCC) is unsatisfactory in noisy environment with different sound sources from nature. In this paper, a classification approach of the ecological environmental sounds using the double-level energy detection (DED) was presented. The DED was used to detect the existence of the sound signals under noise conditions. In addition, MFCC features from the frames which were detected the presence of the sound signals by DED were extracted. Experimental results show that the proposed technology has better noise immunity than classic MFCC, and also outperforms time-domain energy detection (TED) and frequency-domain energy detection (FED) respectively.展开更多
基金supported by the National Natural Science Foundation of China(61071215)the University Natural Science Research Project of Jiangsu Province(05KJB510113)
文摘Whispered speech enhancement using auditory masking model in modified Mel- domain and Speech Absence Probability (SAP) was proposed. In light of the phonation char- acteristic of whisper, we modify the Mel-frequency Scaling model. Whispered speech is filtered by the proposed model. Meanwhile, the value of masking threshold for each frequency band is dynamically determined by speech absence probability. Then whispered speech enhancement is conducted by adaptively rectifying the spectrum subtraction coefficients using different masking threshold values. Results of objective and subjective tests on the enhanced whispered signal show that compared with other methods; the proposed method can enhance whispered signal with better subjective auditory quality and less distortion by reducing the music noise and background noise under the masking threshold value.
文摘The performance of classic Mel-frequency cepstral coefficients (MFCC) is unsatisfactory in noisy environment with different sound sources from nature. In this paper, a classification approach of the ecological environmental sounds using the double-level energy detection (DED) was presented. The DED was used to detect the existence of the sound signals under noise conditions. In addition, MFCC features from the frames which were detected the presence of the sound signals by DED were extracted. Experimental results show that the proposed technology has better noise immunity than classic MFCC, and also outperforms time-domain energy detection (TED) and frequency-domain energy detection (FED) respectively.