Voice Activity Detection Algorithm Based on Ensemble Empirical Mode Decomposition Domain Statistical Model
Abstract: A voice activity detection (VAD) algorithm based on a statistical model in the ensemble empirical mode decomposition (EEMD) domain is proposed. The noisy speech is first decomposed by EEMD into intrinsic mode function (IMF) components, and the two IMFs most highly correlated with the original signal are summed to form a principal component. The principal component is then decomposed in the frequency domain, and a statistical model is introduced to compute an EEMD-domain characteristic parameter. Finally, the speech/noise decision is made from the difference between the EEMD-domain characteristic parameters of noise and speech, by comparing the parameter with a threshold. Experiments on speech signals under various noise conditions and several SNRs show that the proposed algorithm outperforms commonly used VAD algorithms, with the clearest advantage at low SNR.
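The component-selection step described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation. It assumes the IMFs have already been produced by some EEMD routine (not shown here), it uses the signed Pearson coefficient as the correlation measure (the abstract does not say which measure is used), and the names `pearson` and `principal_component` are hypothetical. Three toy sinusoids stand in for real IMFs.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # a constant sequence has no defined correlation
    return cov / math.sqrt(vx * vy)


def principal_component(signal, imfs, k=2):
    """Sum the k IMFs most correlated with the signal.

    Returns the chosen IMF indices (ascending) and their sample-wise sum.
    k=2 follows the abstract; the correlation measure is an assumption.
    """
    ranked = sorted(range(len(imfs)),
                    key=lambda i: pearson(signal, imfs[i]),
                    reverse=True)
    chosen = sorted(ranked[:k])
    summed = [sum(imfs[i][t] for i in chosen) for t in range(len(signal))]
    return chosen, summed


# Toy stand-ins for IMFs: one weak high-frequency tone and two stronger tones.
n = 200
c0 = [0.1 * math.sin(2 * math.pi * 40 * t / n) for t in range(n)]
c1 = [1.0 * math.sin(2 * math.pi * 5 * t / n) for t in range(n)]
c2 = [0.5 * math.sin(2 * math.pi * 2 * t / n) for t in range(n)]
signal = [a + b + c for a, b, c in zip(c0, c1, c2)]

chosen, summed = principal_component(signal, [c0, c1, c2])
print(chosen)  # [1, 2]
```

In this toy case the weak high-frequency component is discarded and the two dominant tones form the principal component. The later stages (frequency-domain decomposition, the statistical model, and the threshold comparison) are not sketched because the abstract does not specify them.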
Source: Journal of Data Acquisition and Processing (《数据采集与处理》), CSCD, Peking University Core Journal, 2012, No. 1, pp. 51-56 (6 pages)
Funding: Supported by the Natural Science Foundation of Jiangsu Province (BK2009059)
Keywords: voice activity detection (VAD); empirical mode decomposition (EMD); ensemble empirical mode decomposition (EEMD); EEMD domain statistical model