Noise power estimation based on constrained sequential Gaussian mixture model for speech enhancement

Cited by: 6
Abstract: A maximum-likelihood method for estimating the noise log-power spectrum is proposed. In each frequency band, the log-power envelope of the noisy speech is modeled by a two-component Gaussian mixture model (GMM): the temporal envelope is divided into a speech class ("speech + noise") and a non-speech class, each described by one Gaussian component, and the mean of the non-speech component is taken as the optimal estimate of the noise power. The model parameters are updated frame by frame by sequential learning under the maximum-likelihood criterion, so that a noise estimate is available at every frame. Because the sequential update can become unstable when speech is absent for a long time, an online minimum description length (MDL) criterion is introduced to detect long-term speech absence and keep the model stable. The method is evaluated through speech enhancement experiments, which show that its overall performance is better than that of the classic MS and IMCRA algorithms.
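The per-band idea in the abstract can be illustrated with a minimal sketch. This is not the authors' update rule: the recursion below is a generic exponentially forgetting EM step, and the class name `SequentialGMM2`, the forgetting factor `forget`, the initial values, and the synthetic data in `demo` are all assumptions made for illustration. The paper's constraint and its online MDL test for long-term speech absence are not reproduced; a simple mean-ordering swap stands in for keeping track of which component is the noise one.

```python
import math
import random


class SequentialGMM2:
    """Two-component 1-D GMM on log-power, updated one frame at a time.

    Component 0 is treated as non-speech (noise), component 1 as speech.
    The update is a generic forgetting-factor EM step (an assumption,
    not the recursion from the paper).
    """

    def __init__(self, mu_noise, mu_speech, var=4.0, forget=0.98):
        self.w = [0.5, 0.5]              # mixture weights
        self.mu = [mu_noise, mu_speech]  # component means (log-power)
        self.var = [var, var]            # component variances
        self.forget = forget             # hypothetical forgetting factor

    def _pdf(self, x, k):
        v = self.var[k]
        return math.exp(-0.5 * (x - self.mu[k]) ** 2 / v) / math.sqrt(2 * math.pi * v)

    def update(self, x):
        """Consume one log-power sample and return the current noise estimate."""
        # E-step: posterior probability of each class for this frame.
        lik = [self.w[k] * self._pdf(x, k) + 1e-300 for k in range(2)]
        total = lik[0] + lik[1]
        post = [v / total for v in lik]

        # M-step: recursive update of weight, mean, and variance.
        a = 1.0 - self.forget
        for k in range(2):
            new_w = self.forget * self.w[k] + a * post[k]
            step = a * post[k] / new_w
            delta = x - self.mu[k]
            self.mu[k] += step * delta
            self.var[k] = max((1.0 - step) * (self.var[k] + step * delta ** 2), 1e-3)
            self.w[k] = new_w

        # Keep the lower-mean component in slot 0 (simple heuristic).
        if self.mu[0] > self.mu[1]:
            self.w.reverse()
            self.mu.reverse()
            self.var.reverse()
        return self.mu[0]


def demo(n_frames=2000, seed=0):
    """Run the tracker on synthetic log-power data for one frequency band."""
    rng = random.Random(seed)
    gmm = SequentialGMM2(mu_noise=-15.0, mu_speech=-5.0)
    estimate = gmm.mu[0]
    for _ in range(n_frames):
        if rng.random() < 0.6:
            x = rng.gauss(-20.0, 1.0)   # noise-only frame
        else:
            x = rng.gauss(0.0, 3.0)     # speech-plus-noise frame
        estimate = gmm.update(x)
    return estimate
```

Calling `demo()` tracks a synthetic band whose noise-only frames are centered at -20 (arbitrary log-power units); the returned value is the mean of the non-speech component after the last frame. If no speech frames arrive for a long stretch, a tracker like this has nothing to anchor the second component, which is the failure mode the abstract's MDL test is meant to address.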
Source: Acta Acustica (《声学学报》), indexed in EI, CSCD, and the Peking University Core Journals list; 2017, No. 5, pp. 633-640 (8 pages)
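Funding: Supported by the Science and Technology Project of the Jiangxi Provincial Department of Education (GJJ150681), the Natural Science Foundation of Jiangxi University of Science and Technology (NSFJ2015-G21), the National Basic Research Program of China (2013CB329302), the National Natural Science Foundation of China (61271426, 10925419, 90920302, 61072124, 11074275, 11161140319, 91120001), the Strategic Priority Research Program of the Chinese Academy of Sciences (XDA06030100, XDA06030500), and the Key Deployment Project of the Chinese Academy of Sciences (KGZD-EW-103-2).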
