
Unsupervised Noise Power Estimation Using Hidden Markov Model (cited by: 4)
Abstract: Noise power spectrum estimation is a fundamental component of speech enhancement algorithms. Most traditional algorithms rely on heuristic estimators and therefore cannot guarantee that the noise estimate is statistically optimal. This paper proposes an unsupervised noise power spectrum estimation method based on maximum likelihood. A hidden Markov model (HMM) is used in each subband to build a statistical model of the speech and non-speech log power spectra. The model contains two Gaussian components, one for speech and one for non-speech, and the mean of the non-speech Gaussian component represents the noise power spectrum estimate. The HMM parameter set, including the noise mean, is obtained with the expectation-maximization (EM) algorithm. Because speech may be absent for long periods, constraints are imposed on the HMM to keep the model stable. Experiments show that the maximum likelihood noise estimate obtained by this method outperforms the estimates obtained by classical heuristic methods.
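The per-subband model described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it fits a plain two-state Gaussian HMM to one subband's log-power track with standard Baum-Welch (EM) and reads the noise estimate off the lower-mean state. The function name `fit_two_state_hmm`, the percentile initialisation, the variance floor, and the rule "lower mean = non-speech" are all assumptions made for illustration. The stability constraints mentioned in the abstract are not specified there and are not reproduced.

```python
import numpy as np

def fit_two_state_hmm(x, n_iter=30, var_floor=1e-3):
    """Baum-Welch for a 2-state Gaussian HMM on one subband's log-power track.

    Returns (noise_mean, speech_prob): the mean of the lower-mean state
    (taken here as non-speech, an assumption) and the per-frame posterior
    probability of the other (speech) state.
    """
    x = np.asarray(x, dtype=float)
    T = len(x)
    # Initialisation (assumption): low/high percentiles as the two means.
    mu = np.array([np.percentile(x, 20), np.percentile(x, 80)])
    var = np.full(2, max(np.var(x), var_floor))
    A = np.array([[0.9, 0.1], [0.1, 0.9]])   # sticky transition matrix
    pi = np.array([0.5, 0.5])

    for _ in range(n_iter):
        # Emission likelihoods b[t, k] = N(x_t; mu_k, var_k), floored for safety.
        b = np.exp(-0.5 * (x[:, None] - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)
        b = np.maximum(b, 1e-300)

        # E-step: scaled forward pass.
        alpha = np.zeros((T, 2))
        c = np.zeros(T)
        alpha[0] = pi * b[0]
        c[0] = alpha[0].sum()
        alpha[0] /= c[0]
        for t in range(1, T):
            alpha[t] = (alpha[t - 1] @ A) * b[t]
            c[t] = alpha[t].sum()
            alpha[t] /= c[t]

        # E-step: scaled backward pass.
        beta = np.ones((T, 2))
        for t in range(T - 2, -1, -1):
            beta[t] = (A @ (b[t + 1] * beta[t + 1])) / c[t + 1]

        # State posteriors and pairwise transition posteriors.
        gamma = alpha * beta
        gamma /= gamma.sum(axis=1, keepdims=True)
        xi = (alpha[:-1, :, None] * A[None]
              * (b[1:] * beta[1:])[:, None, :] / c[1:, None, None])

        # M-step: re-estimate all HMM parameters, including both means.
        pi = gamma[0]
        A = xi.sum(axis=0)
        A /= A.sum(axis=1, keepdims=True)
        w = np.maximum(gamma.sum(axis=0), 1e-10)
        mu = (gamma * x[:, None]).sum(axis=0) / w
        var = np.maximum((gamma * (x[:, None] - mu) ** 2).sum(axis=0) / w, var_floor)

    k = int(np.argmin(mu))            # assumption: lower mean is non-speech
    return mu[k], gamma[:, 1 - k]

# Synthetic usage: a log-power track that switches between noise and speech.
rng = np.random.default_rng(0)
states = np.zeros(2000, dtype=int)
for t in range(1, 2000):
    states[t] = states[t - 1] if rng.random() < 0.95 else 1 - states[t - 1]
track = np.where(states == 1, 5.0, 0.0) + 0.5 * rng.standard_normal(2000)
noise_mean, speech_prob = fit_two_state_hmm(track)
print("estimated noise log-power:", noise_mean)
```

In a full estimator this fit would run independently in every subband, and the log-domain mean would be mapped back to a power value. A long stretch with no speech leaves the speech state with almost no data, which is the situation the abstract's constraints are meant to handle.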
Source: Journal of Data Acquisition and Processing (《数据采集与处理》), indexed in CSCD and the Peking University Core Journals list, 2015, No. 2, pp. 359-364 (6 pages)
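Funding: National Basic Research Program of China (973 Program) (2013CB329302); National Natural Science Foundation of China (61271426, 10925419, 90920302, 61072124, 11074275, 11161140319); Strategic Priority Research Program of the Chinese Academy of Sciences (XDA06030100, XDA06030500); Key Deployment Project of the Chinese Academy of Sciences (KGZD-EW-103-2); Scientific Research Fund of Jiangxi University of Science and Technology (NSFJ2015-G21)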
Keywords: speech enhancement; noise power estimation; hidden Markov model; maximum likelihood criterion; model constraints
