Large-Scale Speaker Recognition Method Using 2D-Haar Acoustic Features


Abstract: The accuracy of text-independent speaker recognition drops markedly as the number of target speakers grows. To address this problem, a high-accuracy, large-scale speaker recognition method is proposed. The method combines the frame-level acoustic features of several consecutive audio frames into an acoustic feature map, from which high-dimensional 2D-Haar acoustic features are derived, making it possible to train a better-performing classifier. The AdaBoost.MH algorithm is then used to select a combination of 2D-Haar acoustic features with good discriminative power for classifier training. Experimental results show a correct recognition rate of 89.5% with 600 target speakers and an average accuracy of 91.3% as the number of target speakers increases from 100 to 600. The method is suited to large-scale speaker recognition: the 2D-Haar acoustic features it introduces are effective, and recognition accuracy is high. The method also has low algorithmic complexity and high time efficiency.
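The abstract does not give the frame-level features, the number of stacked frames, or the Haar templates used, so the following is only a minimal sketch of the general idea under assumptions: consecutive frame-feature vectors are stacked into a 2D map, and one Haar-like value (left half minus right half of a rectangle, in the style of Viola and Jones) is computed from an integral image. All function names here are hypothetical, and the AdaBoost.MH selection step is not shown.

```python
from typing import List

Matrix = List[List[float]]


def feature_map(frames: Matrix, start: int, n_frames: int) -> Matrix:
    """Stack n_frames consecutive frame-feature vectors into a 2D map.

    Rows are frames, columns are feature dimensions (an assumed layout).
    """
    if start < 0 or start + n_frames > len(frames):
        raise ValueError("window out of range")
    return [list(f) for f in frames[start:start + n_frames]]


def integral_image(m: Matrix) -> Matrix:
    """Summed-area table with a zero border: ii[r][c] is the sum of m[:r][:c]."""
    rows, cols = len(m), len(m[0])
    ii = [[0.0] * (cols + 1) for _ in range(rows + 1)]
    for r in range(rows):
        row_sum = 0.0
        for c in range(cols):
            row_sum += m[r][c]
            ii[r + 1][c + 1] = ii[r][c + 1] + row_sum
    return ii


def rect_sum(ii: Matrix, top: int, left: int, height: int, width: int) -> float:
    """Sum of the rectangle using four lookups in the integral image."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


def haar_two_rect(ii: Matrix, top: int, left: int, height: int, width: int) -> float:
    """Assumed two-rectangle template: left half minus right half (even width)."""
    half = width // 2
    return (rect_sum(ii, top, left, height, half)
            - rect_sum(ii, top, left + half, height, half))


# Toy input: 3 frames of 4-dimensional features (made-up numbers).
frames = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]
fmap = feature_map(frames, start=0, n_frames=2)
ii = integral_image(fmap)
value = haar_two_rect(ii, top=0, left=0, height=2, width=4)
```

Sweeping the rectangle position and size over the map yields many such values per map, which is where the high dimensionality mentioned in the abstract would come from; the integral image keeps each value at a constant number of lookups.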

Authors: Xie Erman (谢尔曼), Luo Senlin (罗森林), Pan Limin (潘丽敏)

Affiliation: Information System and Security & Countermeasures Experimental Center, Beijing Institute of Technology

Source: Transactions of Beijing Institute of Technology (北京理工大学学报), 2014, No. 11, pp. 1196-1201 (6 pages). Indexed in: EI, CAS, CSCD, Peking University Core Journals.

Funding: National 242 Program funded project (2005C48); Beijing Institute of Technology Science and Technology Innovation Program (2011CX01015)

Keywords: speaker recognition; 2D-Haar acoustic feature; AdaBoost.MH

Classification code: TP391 (Automation and Computer Technology: Computer Application Technology)

