
A method of voice activity detection based on spectrum of singular value (cited by: 1)
Abstract: To improve voice activity detection (VAD) performance at low signal-to-noise ratio (SNR), a VAD method based on the singular spectrum is proposed. First, the correlation matrix of each frame of the speech signal is computed with a multi-window (multitaper) method. A singular value decomposition is then applied to the correlation matrix. Because the singular values reflect how the useful signal and the noise are distributed, the weighted maximum singular value of each frame is compared with an adaptive threshold to decide whether the frame contains speech. The method is simple in principle and easy to implement in hardware. Simulations indicate that, in low-SNR environments, it distinguishes speech segments from non-speech segments more reliably than the log-energy method and achieves good detection performance.
Source: Journal of Applied Acoustics (《应用声学》), indexed in CSCD and the Peking University Core Journals list, 2013, No. 2, pp. 137-143 (7 pages)
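Funding: Supported by the National Natural Science Foundation of China (61071196, 61102131); the Program for New Century Excellent Talents in University, Ministry of Education (NCET-10-0927); the construction project of the Chongqing Municipal Key Laboratory of Signal and Information Processing (CSTC2009CA2003); the Chongqing Distinguished Youth Fund (CSTC2011jjjq40002); and the Natural Science Foundation of Chongqing (CSTC2009BB2287, CSTC2010BB2398, CSTC2010BB2409, CSTC2010BB2411).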
Keywords: voice activity detection; Slepian data window; discrete prolate spheroidal sequences; correlation matrix; singular value decomposition; adaptive threshold
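The abstract gives the detection pipeline only in outline, so the following Python sketch (NumPy only) is an illustration under stated assumptions, not the authors' implementation. It assumes that the multi-window correlation matrix is a Toeplitz autocorrelation matrix derived from a Slepian (DPSS) multitaper spectrum, that the "weighting" is exponential smoothing across frames, and that the adaptive threshold is a multiple of a tracked noise floor. All function names and parameter values (`order`, `beta`, `alpha`, `gamma`, `init_frames`) are hypothetical.

```python
import numpy as np


def dpss_tapers(n, nw=2.5, k=4):
    """First k Slepian (DPSS) tapers of length n, each with unit energy."""
    idx = np.arange(n, dtype=float)
    w = nw / n  # half-bandwidth in cycles per sample
    # Symmetric tridiagonal matrix whose eigenvectors are the DPSS.
    diag = ((n - 1 - 2 * idx) / 2.0) ** 2 * np.cos(2 * np.pi * w)
    off = idx[1:] * (n - idx[1:]) / 2.0
    tri = np.diag(diag) + np.diag(off, 1) + np.diag(off, -1)
    _, vecs = np.linalg.eigh(tri)  # eigenvalues in ascending order
    return vecs[:, ::-1][:, :k].T  # eigenvectors of the k largest eigenvalues


def max_singular_value(frame, tapers, order=16):
    """Largest singular value of an (assumed Toeplitz) multitaper correlation matrix."""
    nfft = 2 * len(frame)  # zero-pad so the autocorrelation is linear, not circular
    spectra = np.fft.rfft(tapers * frame, n=nfft, axis=1)
    psd = np.mean(np.abs(spectra) ** 2, axis=0)  # average over tapers
    r = np.fft.irfft(psd, n=nfft)[:order]  # autocorrelation at lags 0..order-1
    lags = np.abs(np.subtract.outer(np.arange(order), np.arange(order)))
    return np.linalg.svd(r[lags], compute_uv=False)[0]


def singular_spectrum_vad(x, frame_len=256, hop=128, beta=0.3,
                          alpha=4.0, gamma=0.95, init_frames=10):
    """Frame-wise speech/non-speech decisions; returns (decisions, smoothed feature)."""
    tapers = dpss_tapers(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    feats = np.array([
        max_singular_value(x[i * hop:i * hop + frame_len], tapers)
        for i in range(n_frames)
    ])

    # Assumed weighting: exponential smoothing of the feature across frames.
    smooth = np.empty_like(feats)
    smooth[0] = feats[0]
    for i in range(1, n_frames):
        smooth[i] = beta * smooth[i - 1] + (1 - beta) * feats[i]

    # Assumed adaptive threshold: alpha times a noise floor that is initialised
    # from the first frames (taken to be noise-only) and updated only on
    # frames classified as non-speech.
    noise = smooth[:init_frames].mean()
    decisions = np.zeros(n_frames, dtype=bool)
    for i in range(n_frames):
        if smooth[i] > alpha * noise:
            decisions[i] = True
        else:
            noise = gamma * noise + (1 - gamma) * smooth[i]
    return decisions, smooth
```

Calling `singular_spectrum_vad(x)` on a one-dimensional NumPy array of samples returns one Boolean decision per frame. The sketch relies on white noise giving a correlation matrix close to a scaled identity, while a correlated signal such as voiced speech concentrates energy in a few large singular values. The smoothing and threshold constants would need tuning on real recordings.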

References (12)

  • 1. LEE H, YOOK D. Space-time voice activity detection[J]. IEEE Transaction Signal Process, 2009, 55(3): 1471-1476.
  • 2. MARZINZIK M, KOLLMEIER B. Speech pause detection for noise spectrum estimation by tracking power envelope dynamics[J]. IEEE Transactions on Speech and Audio Processing, 2002, 10(2): 109-118.
  • 3. LI Qi, ZHANG Jinsong, TSAI A, et al. Robust endpoint detection and energy normalization for real-time speech and speaker recognition[J]. IEEE Transactions on Speech and Audio Processing, 2002, 10(3): 146-157.
  • 4. ZHU Xiaojing, HOU Xuchu, CUI Huijuan, TANG Kun. Endpoint detection based on LPCC and energy entropy[J]. Telecommunication Engineering (电讯技术), 2010, 50(6): 41-45. (Cited by: 6)
  • 5. GAZOR S, ZHANG W. A soft voice activity detector based on a Laplacian-Gaussian model[J]. IEEE Transactions on Speech and Audio Processing, 2003, 11(5): 498-505.
  • 6. CHANG J H, KIM N S, MITRA S K. Voice activity detection based on multiple statistical models[J]. IEEE Transactions on Signal Processing, 2006, 54(6): 1965-1976.
  • 7. RIGOZO N R, ECHER E, NORDEMANN D J R, et al. Comparative study between four classical spectral analysis methods[J]. Applied Mathematics and Computation, 2005, 168(1): 411-430.
  • 8. ALLEN B, OTTEWILL A. Multi-taper spectral analysis in gravitational wave data analysis[J]. General Relativity and Gravitation, 2000, 32(3): 385-398.
  • 9. BANSAL A R, DIMRI V P, SAGAR G V. Depth estimation from gravity data using the maximum entropy method and the multi taper method[J]. Pure and Applied Geophysics, 2006, 163: 1417-1434.
  • 10. WU Bingfei, WANG Kunching. Robust endpoint detection algorithm based on the adaptive band-partitioning spectral entropy in adverse environments[J]. IEEE Transactions on Speech and Audio Processing, 2005, 13(5): 762-775.

