
2D-Haar Acoustic Super Feature Vector Fast Generation Method
Abstract: To process large volumes of audio at high speed, this paper proposes a fast method for generating acoustic feature super vectors that improves both the speed and the accuracy of audio recognition systems. The method makes three contributions. First, common acoustic features extracted from several consecutive audio frames are combined into an acoustic feature image, from which Haar-like (2D-Haar) acoustic features with a dimensionality of several hundred thousand are extracted quickly using low-complexity operations. Second, the AdaBoost.MH algorithm is used to select highly representative combinations of Haar-like acoustic feature patterns, which together form the acoustic feature super vector. Third, a Random AdaBoost feature selection method is proposed to further speed up feature selection. Experimental results show that in three tasks (audio event recognition, speaker recognition, and speaker gender recognition), Haar-like acoustic features allow recognition algorithms such as SVM, C5.0, and AdaBoost to achieve higher recognition accuracy than common acoustic features such as MFCC, PLP, and LPCC, while making training 7 to 20 times faster and recognition 5 to 10 times faster.
Source: Transactions of Beijing Institute of Technology (《北京理工大学学报》; indexed in EI, CAS, CSCD, and the Peking University Core list), 2016, No. 3, pp. 295-301 (7 pages)
Funding: National "242" Program project (2005C48); Beijing Institute of Technology Science and Technology Innovation Program project (2011CX01015)
Keywords: audio processing; audio recognition; 2D-Haar feature super vector; 2D-Haar acoustic feature; AdaBoost.MH
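
The abstract does not say which low-complexity operation is used to extract Haar-like features from the acoustic feature image. One common choice in the image-processing literature is the integral image (summed-area table), which lets any rectangle sum be read with four lookups. The sketch below assumes that technique and is not taken from the paper; the toy `feature_image` and all function names are hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def integral_image(img: Matrix) -> Matrix:
    """Summed-area table with a zero border.

    ii[r][c] holds the sum of all img entries in rows < r and columns < c.
    """
    rows, cols = len(img), len(img[0])
    ii = [[0.0] * (cols + 1) for _ in range(rows + 1)]
    for r in range(rows):
        row_sum = 0.0
        for c in range(cols):
            row_sum += img[r][c]
            ii[r + 1][c + 1] = ii[r][c + 1] + row_sum
    return ii


def rect_sum(ii: Matrix, top: int, left: int, height: int, width: int) -> float:
    """Sum of a rectangle of the original image, using four table lookups."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


def haar_two_rect_horizontal(
    ii: Matrix, top: int, left: int, height: int, width: int
) -> float:
    """Two-rectangle Haar-like feature: left half minus right half.

    Assumes `width` is even so both halves have the same size.
    """
    half = width // 2
    left_part = rect_sum(ii, top, left, height, half)
    right_part = rect_sum(ii, top, left + half, height, half)
    return left_part - right_part


# Hypothetical acoustic feature image: each row is one audio frame,
# each column one per-frame coefficient (for example an MFCC value).
feature_image = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 3.0, 4.0, 5.0],
    [3.0, 4.0, 5.0, 6.0],
]

ii = integral_image(feature_image)
total = rect_sum(ii, 0, 0, 3, 4)
feature = haar_two_rect_horizontal(ii, 0, 0, 3, 4)
print(total, feature)
```

Once the table is built, each feature costs a fixed number of additions regardless of rectangle size, which is what makes enumerating a very large pool of candidate patterns affordable before a boosting step selects among them.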
