
Analysis and Comparison of Voice Activity Detection Methods

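Abstract: Voice activity detection (VAD) is an important task in speech signal processing that aims to identify the speech and non-speech segments of a speech signal. This paper experimentally analyzes and compares several current mainstream VAD algorithms: methods based on multiple feature streams (MFS), long short-term memory (LSTM) networks, ensemble deep neural networks (DNN), and the adaptive context attention model (ACAM). Overall, the MFS model is simple and easy to deploy. When the signal to be detected contains noise, a deep model should be used wherever possible: the DNN model when computing resources are ample, and otherwise the ACAM model, which greatly reduces the number of parameters at the cost of only a small loss in accuracy.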
Source: Information Recording Materials (《信息记录材料》), 2023, No. 4, pp. 240-242, 248 (4 pages in total)
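
To make the task in the abstract concrete, the sketch below labels each frame of a signal as speech or non-speech and merges the speech frames into segments. It uses a plain short-time energy threshold, which is a classical baseline and not one of the four methods (MFS, LSTM, DNN, ACAM) the paper compares. The function names, the 160-sample frame length, the 0.01 threshold, and the toy signal are all assumptions chosen for illustration.

```python
import math

def frame_energies(samples, frame_len):
    """Mean squared amplitude of each non-overlapping frame."""
    n_frames = len(samples) // frame_len
    return [
        sum(x * x for x in samples[i * frame_len:(i + 1) * frame_len]) / frame_len
        for i in range(n_frames)
    ]

def energy_vad(samples, frame_len=160, threshold=0.01):
    """Label a frame True (speech) when its energy exceeds the threshold.

    frame_len and threshold are illustrative values, not taken from the paper.
    """
    return [e > threshold for e in frame_energies(samples, frame_len)]

def to_segments(labels):
    """Merge consecutive speech frames into (start, end) frame pairs, end exclusive."""
    segments, start = [], None
    for i, is_speech in enumerate(labels):
        if is_speech and start is None:
            start = i
        elif not is_speech and start is not None:
            segments.append((start, i))
            start = None
    if start is not None:
        segments.append((start, len(labels)))
    return segments

# Toy signal at 8 kHz: 2 silent frames, 3 frames of a 440 Hz tone, 2 silent frames.
silence = [0.0] * 320
tone = [0.5 * math.sin(2 * math.pi * 440 * n / 8000) for n in range(480)]
labels = energy_vad(silence + tone + silence)
segments = to_segments(labels)
```

A fixed energy threshold like this one breaks down once background noise energy approaches speech energy, which is the situation where the abstract recommends a deep model instead.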

