
Whisper intelligibility enhancement based on noise robust feature and SVM (article in English; cited by: 2)
Abstract: A machine-learning-based speech enhancement method is proposed to improve the intelligibility of whispered speech. A trained two-class support vector machine (SVM) classifier estimates a binary time-frequency mask, which is then used to synthesize the enhanced whisper. The feature vector fed to the SVM, a novel noise-robust feature called Gammatone feature cosine coefficients (GFCCs), is extracted with an auditory periphery model. The intelligibility performance of the proposed method is evaluated against traditional speech enhancement methods. Objective and subjective evaluation results indicate that the proposed method effectively improves the intelligibility of noise-contaminated whispered speech. The power spectral subtraction algorithm and the log-MMSE algorithm do not improve intelligibility in low signal-to-noise ratio (SNR) environments, whereas the proposed method still does. In addition, whispered speech enhanced by the proposed method is markedly more intelligible than the corresponding unprocessed noisy whispered speech.
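The binary time-frequency masking step named in the abstract can be illustrated with a minimal sketch. The abstract does not give the SVM configuration, the GFCC extraction details, or how training labels are defined, so this example swaps the classifier for an oracle rule (an ideal binary mask that thresholds the local SNR of each time-frequency unit). The function names and the `lc_db` threshold are hypothetical and are not taken from the paper.

```python
import math


def ideal_binary_mask(speech_energy, noise_energy, lc_db=0.0):
    """Label each time-frequency unit 1 if its local SNR exceeds lc_db, else 0.

    speech_energy and noise_energy are lists of rows (one row per frequency
    channel) holding per-frame energies. lc_db is an assumed local criterion;
    the abstract does not state a value.
    """
    mask = []
    for s_row, n_row in zip(speech_energy, noise_energy):
        row = []
        for s, n in zip(s_row, n_row):
            if s <= 0.0:
                row.append(0)          # no speech energy: discard the unit
            elif n <= 0.0:
                row.append(1)          # speech with no noise: keep the unit
            else:
                snr_db = 10.0 * math.log10(s / n)
                row.append(1 if snr_db > lc_db else 0)
        mask.append(row)
    return mask


def apply_mask(noisy, mask):
    """Keep units labeled 1 and zero out units labeled 0."""
    return [[x * m for x, m in zip(x_row, m_row)]
            for x_row, m_row in zip(noisy, mask)]


# Toy 2-channel, 2-frame example with made-up energies.
speech = [[4.0, 1.0], [0.5, 9.0]]
noise = [[1.0, 2.0], [0.5, 1.0]]
noisy = [[5.0, 3.0], [1.0, 10.0]]

mask = ideal_binary_mask(speech, noise)
enhanced = apply_mask(noisy, mask)
```

In the method described by the abstract, the mask would instead come from the SVM's two-class decision on GFCC features of the noisy whisper, since clean speech and noise are not available separately at enhancement time.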
Source: Journal of Southeast University (English Edition), indexed in EI and CAS, 2012, No. 3, pp. 261-265 (5 pages)
Funding: The National Natural Science Foundation of China (No. 61231002, 61273266, 51075068, 60872073, 60975017, 61003131); the Ph.D. Programs Foundation of the Ministry of Education of China (No. 20110092130004); the Science Foundation for Young Talents in the Educational Committee of Anhui Province (No. 2010SQRL018); the 211 Project of Anhui University (No. 2009QN027B)
Keywords: whispered speech; intelligibility enhancement; noise robust feature; machine learning

