Speech endpoint detection based on the formant-consonance energy

Cited by: 11
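Abstract: Formants and harmonic components (rendered as "consonance" in the published English title and abstract) are typical features of speech. Because speech and acoustic environments vary widely, these features are difficult to extract with conventional methods. This paper proposes an image-enhancement method that operates on the narrow-band spectrogram. The Sobel operator is used to compute the orientation field of the narrow-band spectrogram, Gabor filtering enhances the harmonic regions, and thresholding produces a binary image. Points whose orientation exceeds 45° or whose reliability is low are then removed, leaving continuous, horizontally oriented bands, which constitute the harmonic region. The energy inside the harmonic region is computed and used as the feature for a threshold decision. Experimental results show good speech detection at different signal-to-noise ratios and under several kinds of non-stationary noise. The feature suppresses high-energy burst noise and also performs well for speech detection against non-stationary noise backgrounds. Its advantages are that it needs no prior knowledge of the noise, makes full use of the correlation of speech in both the frequency and time domains, and adapts to a wide range of complex non-stationary noise.

English abstract (as published): Formant and consonance are two discriminable features of speech, but these features are difficult to extract due to the wide variety of speech and many complex backgrounds. This paper presents an image enhancement method to calculate the formant consonance energy parameter by identifying the consonance region in a narrow-band spectrogram. The consonance region is identified through orientation estimation, consonance enhancement, binarisation, and post-processing using the Sobel operator, the Gabor filter, a thresho...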
Source: Journal of Tsinghua University (Science and Technology) (《清华大学学报(自然科学版)》), 2008, No. S1, pp. 754-759 (6 pages). Indexed in EI, CAS, CSCD, and the Peking University Core Journals list.
Keywords: speech endpoint detection; formant-consonance energy; image enhancement; narrow-band spectrogram
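
The processing chain described in the abstract (orientation field, Gabor enhancement, binarisation, orientation and reliability pruning, energy in the remaining region) can be sketched roughly as below. This is a minimal NumPy sketch under stated assumptions, not the authors' implementation. All function names and parameter values (`n_fft`, `hop`, `period`, `coh_min`, `n_std`, `rel`) are hypothetical. The reliability measure is assumed to be a structure-tensor coherence, the Gabor filter is simplified to a single horizontally oriented kernel with an assumed harmonic spacing, and the final decision uses an assumed relative threshold.

```python
import numpy as np

def conv2_same(img, k):
    """Same-size 2-D correlation with edge padding (NumPy only)."""
    kh, kw = k.shape
    p = np.pad(img, ((kh // 2,) * 2, (kw // 2,) * 2), mode="edge")
    out = np.zeros(img.shape)
    for i in range(kh):
        for j in range(kw):
            out += k[i, j] * p[i:i + img.shape[0], j:j + img.shape[1]]
    return out

def power_spectrogram(x, n_fft=512, hop=128):
    """Narrow-band (long-window) power spectrogram, shape (freq, time)."""
    win = np.hanning(n_fft)
    starts = range(0, len(x) - n_fft + 1, hop)
    frames = np.stack([x[s:s + n_fft] * win for s in starts])
    return (np.abs(np.fft.rfft(frames, axis=1)) ** 2).T

def harmonic_energy(x, period=12.0, coh_min=0.5, n_std=1.0):
    """Per-frame energy inside the estimated harmonic region (assumed parameters)."""
    power = power_spectrogram(x)
    img = np.log(power + 1e-10)
    img = (img - img.mean()) / img.std()

    # Step 1: orientation field from Sobel gradients, smoothed over 7x7 blocks.
    sob = np.array([[-1.0, 0.0, 1.0], [-2.0, 0.0, 2.0], [-1.0, 0.0, 1.0]])
    gt, gf = conv2_same(img, sob), conv2_same(img, sob.T)  # d/dtime, d/dfreq
    box = np.full((7, 7), 1.0 / 49.0)
    jtt, jff, jtf = (conv2_same(a, box) for a in (gt * gt, gf * gf, gt * gf))
    phi = 0.5 * np.arctan2(2.0 * jtf, jtt - jff)   # dominant gradient direction
    tilt = np.pi / 2 - np.abs(phi)                 # band angle from the time axis
    coh = np.hypot(jtt - jff, 2.0 * jtf) / (jtt + jff + 1e-10)  # assumed reliability

    # Step 2: enhance horizontal bands with one even Gabor kernel (simplification).
    f = np.arange(-12, 13)[:, None]
    t = np.arange(-4, 5)[None, :]
    gab = np.exp(-0.5 * ((f / 4.0) ** 2 + (t / 1.5) ** 2)) * np.cos(2 * np.pi * f / period)
    enhanced = conv2_same(img, gab - gab.mean())

    # Step 3: binarise with a global threshold (assumed: mean + n_std * std).
    binary = enhanced > enhanced.mean() + n_std * enhanced.std()

    # Step 4: drop points tilted more than 45 degrees or with low reliability.
    mask = binary & (tilt <= np.pi / 4) & (coh >= coh_min)

    # Step 5: sum the spectrogram power that falls inside the mask, per frame.
    return (power * mask).sum(axis=0)

def detect(energy, rel=0.05):
    """Frame-level decision with an assumed threshold relative to the peak energy."""
    return energy > rel * energy.max()
```

Calling `detect(harmonic_energy(x))` on a mono signal `x` returns one boolean per frame. A filter tuned to the local orientation and harmonic spacing, or an adaptive decision threshold, would be natural refinements, but the abstract does not specify either, so they are left out here.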