An asymmetric attenuated speech enhancement approach for improving intelligibility of noisy whisper (cited by: 3)
Abstract: Two whispered-speech enhancement algorithms based on asymmetric cost functions are proposed; they treat the amplification distortion and the attenuation distortion introduced during enhancement differently. The Modified Itakura-Saito (MIS) algorithm penalizes amplification distortion more heavily, whereas the Kullback-Leibler (KL) algorithm penalizes attenuation distortion more heavily. Experiments show that at low signal-to-noise ratios (SNRs) below -6 dB, whispered speech enhanced by the MIS algorithm is significantly more intelligible than speech enhanced by conventional algorithms, while the KL algorithm yields an intelligibility improvement similar to that of the minimum mean square error (MMSE) speech enhancement algorithm. The results confirm that amplification distortion and attenuation distortion affect the intelligibility of enhanced whispered speech differently: larger attenuation distortion helps intelligibility at low SNR, whereas at high SNR attenuation distortion has little influence on intelligibility.
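The idea of an asymmetric cost can be illustrated with a small numerical sketch. The abstract does not give the exact MIS or KL cost definitions used in the paper, so both functions below are assumptions chosen only for illustration: `exp_linear_cost` is a generic exponential-linear cost that grows faster for over-estimates, and `generalized_kl` is the textbook generalized Kullback-Leibler divergence for positive scalars. The names `x`, `x_hat`, and `delta` are hypothetical.

```python
import math


def exp_linear_cost(x, x_hat):
    """Illustrative asymmetric cost exp(v) - v - 1 with v = x_hat - x.

    Assumption: this is NOT the paper's MIS definition, only a generic
    cost that penalizes over-estimation (amplification) more heavily.
    """
    v = x_hat - x
    return math.exp(v) - v - 1.0


def generalized_kl(x, x_hat):
    """Generalized KL divergence D(x || x_hat) for positive scalars.

    For equal additive errors it penalizes under-estimation
    (attenuation) more heavily than over-estimation.
    """
    return x * math.log(x / x_hat) - x + x_hat


# A clean spectral magnitude and two estimates with the same absolute error.
x = 1.0
delta = 0.5
amplified = x + delta   # estimate larger than the true value
attenuated = x - delta  # estimate smaller than the true value

for name, cost in (("exp-linear", exp_linear_cost), ("generalized KL", generalized_kl)):
    print(f"{name}: amplification cost = {cost(x, amplified):.4f}, "
          f"attenuation cost = {cost(x, attenuated):.4f}")
```

Under these assumptions, the first cost is larger for the amplified estimate and the second is larger for the attenuated one, which mirrors the qualitative contrast the abstract draws between the two algorithms.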
Source: Acta Acustica (《声学学报》), indexed in EI, CSCD, and the Peking University Core Journals list; 2014, No. 4, pp. 501-508 (8 pages)
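Funding: National Natural Science Foundation of China (61301295, 61231002, 61273266, 61003131); Natural Science Foundation of Anhui Province (1308085QF100, 1408085MF113); Anhui University Doctoral Research Start-up Fund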
