
Audio classification based on GMM and feature filtration for hierarchical discriminators
Abstract: Mel-frequency cepstral coefficients (MFCC) and their dynamic parameters are used as a single feature set to distinguish pure speech, speech with background sound, instrument sounds, singing, and environmental sounds. Based on the characteristics of these features and the differences between the audio classes, a hierarchical binary classification method is proposed that combines discriminative model training with feature filtration: a GMM (Gaussian mixture model) is trained at each level while the feature subset that maximizes the separability of the current models is selected. Tests on an audio data set of about 2 hours show that, relative to the original 90-dimensional system without feature filtration, the method reduces the average error rate by about 23.5%, and the feature dimensionality of each binary sub-classifier is also substantially reduced.
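Source: Journal of University of Science and Technology of China (JUSTC), 2007, No. 12, pp. 1466-1471 (6 pages). Indexed in CAS, CSCD, and the Peking University Core Journals list.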
Keywords: audio classification; MFCC and its dynamics; discriminative model training; feature filtration; hierarchical discrimination
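
The abstract's idea of a hierarchy of binary discriminators, each with its own selected feature subset, can be illustrated with a minimal sketch. It is not the authors' implementation, and several parts are assumptions made for illustration: one diagonal Gaussian per side stands in for a full GMM trained with EM, features are ranked by a per-dimension symmetric KL divergence (the paper's actual selection criterion is not given in the abstract), the subset size `k` is fixed by hand, and the three toy classes, their 3-dimensional synthetic data, and the two-level tree order are hypothetical.

```python
import math
from itertools import product

VAR_FLOOR = 1e-6  # assumed floor that keeps variances strictly positive


def mean_var(rows, d):
    """Mean and floored variance of dimension d over a list of feature vectors."""
    vals = [r[d] for r in rows]
    m = sum(vals) / len(vals)
    v = sum((x - m) ** 2 for x in vals) / len(vals)
    return m, max(v, VAR_FLOOR)


def divergence(a_rows, b_rows, d):
    """Symmetric KL divergence between 1-D Gaussians fitted to dimension d."""
    ma, va = mean_var(a_rows, d)
    mb, vb = mean_var(b_rows, d)
    return 0.5 * (va / vb + vb / va - 2.0) + 0.5 * (ma - mb) ** 2 * (1.0 / va + 1.0 / vb)


def select_dims(a_rows, b_rows, k):
    """Keep the k dimensions that best separate the two groups."""
    n_dims = len(a_rows[0])
    ranked = sorted(range(n_dims), key=lambda d: divergence(a_rows, b_rows, d), reverse=True)
    return sorted(ranked[:k])


def log_likelihood(x, model):
    """Log density of x under a diagonal Gaussian restricted to the model's dims."""
    total = 0.0
    for d, (m, v) in model.items():
        total += -0.5 * (math.log(2.0 * math.pi * v) + (x[d] - m) ** 2 / v)
    return total


class BinaryNode:
    """One level of the hierarchy: selects features, then fits one model per side."""

    def __init__(self, a_rows, b_rows, k):
        self.dims = select_dims(a_rows, b_rows, k)
        self.model_a = {d: mean_var(a_rows, d) for d in self.dims}
        self.model_b = {d: mean_var(b_rows, d) for d in self.dims}

    def is_a(self, x):
        return log_likelihood(x, self.model_a) >= log_likelihood(x, self.model_b)


def cloud(center):
    """Deterministic synthetic samples on a small grid around a class center."""
    offsets = (-1.0, -0.5, 0.0, 0.5, 1.0)
    return [[c + o for c, o in zip(center, offs)]
            for offs in product(offsets, repeat=len(center))]


# Hypothetical toy classes: dim 0 separates speech, dim 1 separates music
# from environment sound, dim 2 carries no class information.
speech = cloud((4.0, 0.0, 0.0))
music = cloud((0.0, 3.0, 0.0))
environment = cloud((0.0, -3.0, 0.0))

level1 = BinaryNode(speech, music + environment, k=1)  # speech vs. the rest
level2 = BinaryNode(music, environment, k=1)           # music vs. environment


def classify(x):
    """Walk the two-level hierarchy from the top discriminator down."""
    if level1.is_a(x):
        return "speech"
    return "music" if level2.is_a(x) else "environment"
```

In this toy setup each level keeps a different single dimension, which mirrors the abstract's observation that every binary sub-classifier can work with a reduced feature set of its own.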
