
Large Scale Speaker Recognition Method that Uses 2D-Haar Acoustic Feature
Abstract: In text-independent speaker recognition, accuracy degrades markedly as the number of target speakers increases. To address this problem, a high-accuracy method for large-scale speaker recognition is proposed. The method combines the frame-level acoustic features of several consecutive audio frames into an acoustic feature map and derives high-dimensional 2D-Haar acoustic features from that map, which makes it possible to train a better-performing classifier. The AdaBoost.MH algorithm is then used to select a combination of 2D-Haar acoustic features with good discriminative power for classifier training. Experimental results show a correct recognition rate of 89.5% with 600 target speakers and an average accuracy of 91.3% as the number of target speakers ranges from 100 to 600. The method is suitable for large-scale speaker recognition, the 2D-Haar acoustic features it introduces are effective, and its recognition accuracy is high. In addition, the method has low algorithmic complexity and high time efficiency.
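The feature construction described in the abstract can be illustrated with a minimal sketch. It stacks consecutive per-frame feature vectors into a 2D map, builds an integral image (the device used by Viola and Jones for Haar-like features), and evaluates one two-rectangle Haar-like feature. The abstract does not specify the per-frame features, the map size, or the Haar templates, so the function names, the toy data, and the single left-minus-right template below are all assumptions made for illustration, not the authors' implementation.

```python
from typing import List

Matrix = List[List[float]]


def build_feature_map(frames: List[List[float]], start: int, n_frames: int) -> Matrix:
    """Stack n_frames consecutive frame-level feature vectors into a 2D map.

    Rows are frames, columns are feature dimensions (an assumed layout).
    """
    if start < 0 or n_frames <= 0 or start + n_frames > len(frames):
        raise ValueError("window does not fit inside the frame sequence")
    return [list(frames[i]) for i in range(start, start + n_frames)]


def integral_image(fmap: Matrix) -> Matrix:
    """Summed-area table with an extra zero row and column at the top/left."""
    rows, cols = len(fmap), len(fmap[0])
    ii = [[0.0] * (cols + 1) for _ in range(rows + 1)]
    for r in range(rows):
        row_sum = 0.0
        for c in range(cols):
            row_sum += fmap[r][c]
            ii[r + 1][c + 1] = ii[r][c + 1] + row_sum
    return ii


def rect_sum(ii: Matrix, top: int, left: int, height: int, width: int) -> float:
    """Sum of the map over a rectangle, using four lookups in the table."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


def haar_two_rect_horizontal(ii: Matrix, top: int, left: int,
                             height: int, width: int) -> float:
    """Hypothetical two-rectangle template: left half minus right half."""
    if width % 2 != 0:
        raise ValueError("width must be even to split into two halves")
    half = width // 2
    left_sum = rect_sum(ii, top, left, height, half)
    right_sum = rect_sum(ii, top, left + half, height, half)
    return left_sum - right_sum


# Toy data: 3 frames with 4 feature dimensions each (made-up values).
frames = [[1.0, 2.0, 3.0, 4.0],
          [5.0, 6.0, 7.0, 8.0],
          [9.0, 10.0, 11.0, 12.0]]
fmap = build_feature_map(frames, start=0, n_frames=2)
ii = integral_image(fmap)
feature_value = haar_two_rect_horizontal(ii, top=0, left=0, height=2, width=4)
print(feature_value)  # (1+2+5+6) - (3+4+7+8) = -8.0
```

Enumerating such templates over every position and scale of the map is what yields a high-dimensional feature set; a boosting procedure such as AdaBoost.MH would then pick the most discriminative ones, a step this sketch leaves out.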
Source: Transactions of Beijing Institute of Technology (《北京理工大学学报》), 2014, No. 11, pp. 1196-1201 (6 pages). Indexed in EI, CAS, CSCD, and the Peking University Core Journals list.
Funding: National 242 Program fund project (2005C48); Beijing Institute of Technology Science and Technology Innovation Program (2011CX01015).
Keywords: speaker recognition; 2D-Haar acoustic feature; AdaBoost.MH

