Speech enhancement based on modified Mel masking model and speech absence probability in whispers

Cited by: 2
Abstract: A whispered-speech enhancement method based on an auditory masking model in a modified Mel domain and the speech absence probability (SAP) is proposed. The Mel frequency scale is modified according to the phonation characteristics of whispered speech, and each frame of the whispered signal is filtered into bands in the modified Mel domain. The auditory masking threshold of each band is determined dynamically from the SAP, and the spectral subtraction coefficients are adapted to the resulting thresholds to enhance the whisper. Objective and subjective tests on the enhanced whispered speech show that, compared with other spectral subtraction methods, the proposed method keeps the residual (musical) noise and background noise below the masking threshold of the human ear, yields less speech distortion, and substantially improves subjective listening quality.
Source: Acta Acustica (《声学学报》), 2009, No. 4, pp. 370-377 (8 pages). Indexed in EI, CSCD, and the Peking University Core Journals list.
Funding: National Natural Science Foundation of China (60572076); Natural Science Foundation of Jiangsu Higher Education Institutions (05KJB510113).
Keywords: speech enhancement; auditory masking; Mel; probability; model; masking threshold; noise control; whispered speech; Distillation; Frequency bands; Probability; Speech recognition
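
The abstract gives the processing chain (Mel-domain band filtering, a per-band masking threshold, and a threshold-dependent subtraction coefficient) but none of the formulas. The following is a minimal sketch of that general idea, not the authors' implementation. It rests on several assumptions: the standard Mel mapping stands in for the paper's whisper-specific modification, the masking thresholds and SAP values are taken as given inputs, and the SAP weighting, the linear coefficient map, and the parameters `alpha_min`, `alpha_max`, and `beta` are all hypothetical.

```python
import math


def hz_to_mel(f_hz):
    """Standard Mel mapping (assumption: the paper's modified scale is not given)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def adaptive_alpha(threshold, t_min, t_max, alpha_min=1.0, alpha_max=6.0):
    """Map a masking threshold to an over-subtraction factor.

    A high threshold means noise is already well masked, so less is
    subtracted; a low threshold means more is subtracted. The linear map
    and the default limits are illustrative assumptions.
    """
    if t_max <= t_min:
        return alpha_max
    w = (threshold - t_min) / (t_max - t_min)
    w = min(1.0, max(0.0, w))
    return alpha_max - w * (alpha_max - alpha_min)


def enhance_frame(noisy_power, noise_power, thresholds, sap, beta=0.01):
    """Power spectral subtraction on one frame, band by band.

    noisy_power, noise_power: per-band power of the noisy frame and the
    noise estimate. thresholds: per-band masking thresholds. sap: per-band
    speech absence probability in [0, 1].
    """
    t_min, t_max = min(thresholds), max(thresholds)
    enhanced = []
    for y, n, t, q in zip(noisy_power, noise_power, thresholds, sap):
        # Hypothetical SAP weighting: when speech is probably absent,
        # pull the threshold toward its minimum so more noise is removed.
        t_eff = (1.0 - q) * t + q * t_min
        alpha = adaptive_alpha(t_eff, t_min, t_max)
        # Spectral floor beta * y avoids negative power estimates.
        enhanced.append(max(y - alpha * n, beta * y))
    return enhanced
```

For example, `enhance_frame([10, 10], [1, 1], [0.0, 1.0], [0.0, 0.0])` subtracts six times the noise estimate in the first band, where the threshold is low, and only once in the second band, where the threshold is high.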