Text-independent speaker identification approach using single training sample
Abstract: Existing speaker identification methods are sensitive to environmental noise. To alleviate this limitation, a robust text-independent speaker identification approach using a single training sample is proposed. In this method, the main frequency components of an acoustic signal are determined in the time-frequency domain, and their local distributions and variations in that domain are extracted as acoustic local features. These local features are robust to white noise, Gaussian noise, and pink noise, are invariant to the intensity of the acoustic signal, and reflect a person's inherent phonation characteristics. A Bayesian decision classifier suited to these local features is introduced. Experimental results on English and Chinese speech databases demonstrate that the proposed approach can perform speaker identification from a single training sample and achieves a clearly higher correct classification rate than conventional acoustic features such as linear predictive coding cepstral (LPCC) coefficients and Mel-frequency cepstral coefficients (MFCC). The approach is also shown to be highly tolerant of white noise, pink noise, and environmental sounds.
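The abstract does not give the exact feature definition, so the following is only an illustrative sketch of the general idea of locating the main frequency components in the time-frequency domain and tracking how they vary. The function names (`stft_magnitude`, `dominant_bins`, `local_features`), the frame length, the hop size, and the choice of the top-`k` spectral bins are all assumptions made for this example, not the authors' method. It does show why features built from the positions of dominant components, rather than their magnitudes, are unaffected by signal intensity.

```python
import numpy as np

def stft_magnitude(signal, frame_len=256, hop=128):
    """Magnitude spectrogram from a Hann-windowed short-time Fourier transform."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop : i * hop + frame_len] * window
                       for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=1))  # shape: (n_frames, frame_len // 2 + 1)

def dominant_bins(mag, k=3):
    """Indices of the k largest-magnitude frequency bins per frame, sorted ascending."""
    top = np.argsort(mag, axis=1)[:, -k:]
    return np.sort(top, axis=1)

def local_features(mag, k=3):
    """Hypothetical local feature: dominant bins plus their frame-to-frame change."""
    bins = dominant_bins(mag, k)
    delta = np.diff(bins, axis=0)        # variation of the dominant bins over time
    return np.hstack([bins[1:], delta])  # shape: (n_frames - 1, 2 * k)

# Usage: a 1 kHz tone sampled at 8 kHz, at two different loudness levels.
fs = 8000
t = np.arange(fs) / fs
tone = np.sin(2 * np.pi * 1000 * t)
quiet = local_features(stft_magnitude(tone))
loud = local_features(stft_magnitude(5.0 * tone))
same = np.array_equal(quiet, loud)  # bin positions do not depend on amplitude
```

Because only the ranking of the spectral bins is used, scaling the waveform leaves the feature matrix unchanged. Robustness to additive noise is a separate property that this sketch does not demonstrate.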
Authors: GUO Jianmin (郭建敏), WANG Xuan (王晅) (School of Physics and Information Technology, Shaanxi Normal University, Xi'an 710119, Shaanxi, China)
Source: Journal of Shaanxi Normal University (Natural Science Edition) (《陕西师范大学学报(自然科学版)》), 2016, No. 5, pp. 33-38 (6 pages). Indexed in CAS, CSCD, and the Peking University Core Journals list.
Funding: National Natural Science Foundation of China (61373083); Natural Science Foundation of Shaanxi Province (2009JM8003)
Keywords: speaker recognition; time-frequency local features; linear predictive coding cepstral (LPCC) coefficients; Mel-frequency cepstral coefficients (MFCC); Bayesian decision