Voice activity detection based on a support vector machine and multiple observation compound feature vectors (cited by: 3)

Support vector machine based VAD using the multiple observation compound feature
Abstract: This paper proposes a new multiple observation compound feature (MO-CF) for support vector machine (SVM) based statistical voice activity detection (VAD). The MO-CF is composed of two sub-features weighted by a balancing factor. The feature is optimized by searching for the balancing factor that maximizes the area under the ROC curve (AUC) of the VAD, so that the strengths of the individual sub-features are combined. The selected sub-features must not only perform well on their own but also complement each other. To meet this requirement, two compound features, MO-CF1 and MO-CF2, are proposed. MO-CF2, which combines a multiple observation signal-to-noise ratio (MO-SNR) sub-feature with a multiple observation maximum probability (MO-MP) sub-feature, is more robust than MO-CF1. Experiments show that, across a variety of noisy scenarios with low SNRs, the algorithm performs better and is more robust than nine existing VAD algorithms.
Source: Journal of Tsinghua University (Science and Technology), 2011, No. 9, pp. 1209-1214 (6 pages). Indexed in EI, CAS, CSCD, and the Peking University Core Journals list.
Keywords: multiple observation compound feature (MO-CF); support vector machine (SVM); voice activity detection (VAD)
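
The balancing-factor selection described in the abstract can be illustrated with a small sketch. This is not the paper's implementation: the abstract does not give the exact form of the compound feature, so the sketch assumes a simple weighted sum `alpha * f1 + (1 - alpha) * f2` of two per-frame sub-feature scores, uses that sum directly as the detection score instead of training an SVM, and picks `alpha` by grid search. The names `auc`, `compound_score`, `best_balancing_factor`, and `make_toy_frames`, and the synthetic data, are all hypothetical.

```python
import random


def auc(scores, labels):
    """Area under the ROC curve, computed as the fraction of
    (speech, non-speech) frame pairs ranked correctly; ties count half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one speech and one non-speech frame")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def compound_score(f1, f2, alpha):
    """Assumed compound feature: a weighted sum of two sub-feature scores."""
    return [alpha * a + (1.0 - alpha) * b for a, b in zip(f1, f2)]


def best_balancing_factor(f1, f2, labels, steps=20):
    """Grid-search alpha in [0, 1] and keep the value with the largest AUC."""
    best_alpha, best_auc = 0.0, -1.0
    for i in range(steps + 1):
        alpha = i / steps
        current = auc(compound_score(f1, f2, alpha), labels)
        if current > best_auc:
            best_alpha, best_auc = alpha, current
    return best_alpha, best_auc


def make_toy_frames(n=200, seed=0):
    """Synthetic frames: label 1 = speech, 0 = non-speech. Each sub-feature
    is the label plus independent Gaussian noise (illustrative data only)."""
    rng = random.Random(seed)
    labels = [rng.randint(0, 1) for _ in range(n)]
    f1 = [y + rng.gauss(0.0, 1.0) for y in labels]
    f2 = [y + rng.gauss(0.0, 1.0) for y in labels]
    return f1, f2, labels


if __name__ == "__main__":
    f1, f2, labels = make_toy_frames()
    alpha, score = best_balancing_factor(f1, f2, labels)
    print(f"alpha={alpha:.2f}  AUC={score:.3f}")
    print(f"AUC of f1 alone={auc(f1, labels):.3f}, f2 alone={auc(f2, labels):.3f}")
```

Because the grid includes `alpha = 0` and `alpha = 1`, the selected compound score is never worse (in AUC on the tuning data) than either sub-feature alone, which mirrors the stated goal of combining complementary sub-features. In a fuller pipeline the AUC would be measured on the output of the trained SVM rather than on the raw weighted sum.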