
A Keyword Spotting and Game Classification System for Sports Programs

Sport Type Determination for Sport Games by Keyword Spotting
Abstract: This paper proposes a keyword spotting (KWS) framework for audio analysis of sports programs and uses the KWS results to determine the sport type of a game automatically. In the front end, an audio segmentation module that fuses distance-based and model-selection-based methods efficiently extracts the announcer's speech from the complex sports audio stream. The keyword spotter is built on an LVCSR system. The acoustic model is adapted with a small amount of sports-program speech using supervised MAP adaptation. A sports-domain language model is built, and a language model adaptation based on the word-frequency distribution of the target keywords is proposed. Both adaptations substantially improve KWS performance, and the best performance is achieved when the two are combined. Characteristic keywords are selected for each sport, and the KWS results are used to determine the sport type of each game. On a test set of 15 games covering seven sports, the determination accuracy reaches 100%.
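The final step described in the abstract, determining the sport type from keyword spotting results, can be illustrated with a minimal sketch. The abstract does not give the keyword lists, the scoring rule, or any threshold, so everything below is an assumption made for illustration: the `SPORT_KEYWORDS` table, the `min_confidence` cutoff, and the rule of summing confidences per sport are hypothetical and are not taken from the paper.

```python
from collections import defaultdict

# Hypothetical characteristic keywords per sport (illustrative only;
# the paper's actual keyword lists are not given in the abstract).
SPORT_KEYWORDS = {
    "basketball": {"rebound", "dunk", "three-pointer"},
    "football": {"offside", "corner kick", "penalty kick"},
    "tennis": {"ace", "deuce", "break point"},
}


def determine_sport_type(hits, sport_keywords=SPORT_KEYWORDS, min_confidence=0.5):
    """Pick the sport whose characteristic keywords gather the most evidence.

    hits: iterable of (keyword, confidence) pairs produced by a keyword spotter.
    Returns the best-scoring sport name, or None if no keyword passes the cutoff.
    The confidence-sum rule is an assumed scoring scheme, not the paper's.
    """
    scores = defaultdict(float)
    for keyword, confidence in hits:
        if confidence < min_confidence:
            continue  # discard low-confidence detections
        for sport, keywords in sport_keywords.items():
            if keyword in keywords:
                scores[sport] += confidence
    if not scores:
        return None
    return max(scores, key=scores.get)


if __name__ == "__main__":
    example_hits = [("rebound", 0.9), ("dunk", 0.8), ("ace", 0.6), ("offside", 0.3)]
    print(determine_sport_type(example_hits))
```

In this sketch a game is assigned to the sport whose keywords accumulate the highest total confidence. A real system might instead normalize by game length or weight keywords by how specific they are to one sport.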
Source: Microcomputer Applications (《微计算机应用》), 2009, No. 11, pp. 38-43 (6 pages)
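Funding: Supported by the National High Technology Research and Development Program of China (863 Program, 2006AA010102), the National Key Technology R&D Program of China (2008BAI50B00), the National Basic Research Program of China (973 Program, 2004CB318106), and the National Natural Science Foundation of China (Nos. 10874203, 60875014, 60535030).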
Keywords: keyword spotting system; audio segmentation; acoustic model adaptation; language model adaptation; sport game classification

