
Voice Activity Detection Method Based on MFPH (Spectral Entropy Mel Product)
Cited by: 15
Abstract: To address the low accuracy of traditional voice activity detection algorithms in low signal-to-noise ratio (SNR) environments, a voice activity detection algorithm based on the product of spectral entropy and Mel (MFPH) is proposed. First, the first-dimension parameter MFCC0 of the Mel-frequency cepstral coefficients is extracted from the noisy speech signal, and its product with the spectral entropy is taken as the fused feature parameter that distinguishes speech segments from background noise segments. Then, the threshold of the MFPH feature parameter is estimated adaptively by combining the fuzzy C-means (FCM) clustering algorithm with the Bayesian information criterion (BIC). Finally, the double-threshold method is used to detect the speech endpoints. Experiments show that, compared with traditional methods, the proposed method substantially improves endpoint detection accuracy in low-SNR environments from -5 to 15 dB.
Authors: WU Xin-zhong; XIA Ling-xiang; ZHANG Xu; ZHOU Cheng (School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, Jiangsu 221116, China)
Source: Journal of Beijing University of Posts and Telecommunications, 2019, Issue 2, pp. 83-89 (7 pages). Indexed in: EI, CAS, CSCD, Peking University Core Journals
Funding: National Key R&D Program of China, 13th Five-Year Plan (2016YFC0801800); Key R&D Program of Jiangsu Province (BE2016046)
Keywords: voice activity detection; Mel-frequency cepstral coefficients; spectral entropy; spectral entropy Mel product; double-threshold method; low signal-to-noise ratio
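
The pipeline outlined in the abstract (a per-frame product of MFCC0 and spectral entropy, followed by double-threshold detection) can be illustrated with a minimal sketch. This is not the authors' implementation. The function names (`frame_signal`, `mel_filterbank`, `mfph_feature`, `double_threshold`), the frame length, hop size, and number of Mel filters are all hypothetical choices. MFCC0 is assumed here to be the zeroth orthonormal DCT-II coefficient of the log Mel filterbank energies. The abstract does not say whether the product is normalized or sign-adjusted before thresholding, and the FCM plus BIC threshold estimation is not reproduced: the two thresholds are simply passed in by hand.

```python
import numpy as np

def frame_signal(x, frame_len, hop):
    """Split a 1-D signal into overlapping frames (rows)."""
    x = np.asarray(x, dtype=float)
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]

def mel_filterbank(n_filters, n_fft, sr):
    """Triangular filters spaced evenly on the Mel scale (assumed layout)."""
    hz_to_mel = lambda f: 2595.0 * np.log10(1.0 + f / 700.0)
    mel_to_hz = lambda m: 700.0 * (10.0 ** (m / 2595.0) - 1.0)
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_filters + 2)
    hz_pts = mel_to_hz(mel_pts)
    freqs = np.linspace(0.0, sr / 2.0, n_fft // 2 + 1)
    fb = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(n_filters):
        lo, mid, hi = hz_pts[i:i + 3]
        rising = (freqs - lo) / (mid - lo)
        falling = (hi - freqs) / (hi - mid)
        fb[i] = np.maximum(0.0, np.minimum(rising, falling))
    return fb

def mfph_feature(x, sr, frame_len=256, hop=128, n_filters=24):
    """Per-frame product of MFCC0 and spectral entropy (assumed definitions)."""
    eps = 1e-12
    frames = frame_signal(x, frame_len, hop) * np.hamming(frame_len)
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    # Spectral entropy: Shannon entropy of the normalized power spectrum.
    p = power / (power.sum(axis=1, keepdims=True) + eps)
    entropy = -(p * np.log(p + eps)).sum(axis=1)
    # MFCC0: zeroth orthonormal DCT-II coefficient of the log Mel energies.
    log_mel = np.log(power @ mel_filterbank(n_filters, frame_len, sr).T + eps)
    mfcc0 = log_mel.sum(axis=1) / np.sqrt(n_filters)
    return mfcc0 * entropy

def double_threshold(feature, high, low):
    """Return (start, end) frame indices of runs that stay above `low`
    and contain at least one frame above `high`."""
    feature = np.asarray(feature, dtype=float)
    above_low = feature > low
    segments, i, n = [], 0, len(feature)
    while i < n:
        if above_low[i]:
            j = i
            while j < n and above_low[j]:
                j += 1
            if np.any(feature[i:j] > high):
                segments.append((i, j - 1))
            i = j
        else:
            i += 1
    return segments
```

The `double_threshold` helper treats larger feature values as more speech-like. Whether the raw product behaves that way depends on how the paper scales MFCC0 and the entropy, so the feature may need to be negated or normalized before it is passed in.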
