Abstract
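This paper proposes a keyword spotting system framework for audio analysis of sports programs and uses the keyword spotting results to determine the sport type of a game automatically. A front-end audio processing module that fuses distance-based and model-selection-based methods efficiently extracts speech from complex audio streams. Within an LVCSR-based keyword spotting framework, the acoustic model is adapted using a small amount of sports program speech data, a sports-domain language model is built, and a language model adaptation targeting the word-frequency distribution of specific keywords is proposed; together these substantially improve the detection performance of the keyword spotting system. Characteristic keywords are selected for each sport, and the keyword spotting results are used to determine the game type automatically. On a test set of 15 games covering seven sports, the determination accuracy reaches 100%.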
This paper proposes a method to automatically recognize the sport type of sports games based on keyword spotting (KWS). In the front end, we develop an audio segmentation module that efficiently extracts the announcer's speech from the complex sports audio stream. Adopting an LVCSR-based keyword spotting framework, we apply acoustic model and language model adaptation for robust keyword spotting. For the acoustic model, supervised adaptation is carried out using MAP. For the language model, a word-frequency-based adaptation is proposed. Both adaptations markedly improve KWS performance, and the best performance is achieved when they are combined. Based on the KWS results and sport-specific keywords, we achieve 100% accuracy in a sport type determination experiment using 15 games from seven sports as the test set.
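The abstract does not specify how keyword spotting results are mapped to a sport type, so the following is only a minimal sketch of one plausible decision rule: count the detected keywords that belong to each sport's characteristic keyword list and pick the sport with the most hits. The keyword lists, the function name `classify_sport`, and the voting rule itself are assumptions made for illustration and are not taken from the paper.

```python
from collections import Counter

# Hypothetical keyword lists; the paper does not publish its actual keywords.
SPORT_KEYWORDS = {
    "football": {"goal", "offside", "corner"},
    "basketball": {"rebound", "dunk", "three-pointer"},
    "tennis": {"ace", "deuce", "break point"},
}


def classify_sport(detected_keywords, sport_keywords=SPORT_KEYWORDS):
    """Return the sport whose keyword list matches the most detections.

    `detected_keywords` is the list of keywords reported by a keyword
    spotting system (assumed input format). Returns None if nothing matches.
    """
    counts = Counter(detected_keywords)
    # Score each sport by the total number of detections of its keywords.
    scores = {
        sport: sum(counts[k] for k in keywords)
        for sport, keywords in sport_keywords.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None
```

Under this simple voting rule, a stray false alarm from another sport's list is outvoted as long as the correct sport's keywords are detected more often.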
Source
《微计算机应用》
2009, No. 11, pp. 38-43 (6 pages)
Microcomputer Applications
Funding
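National High Technology Research and Development Program of China (863 Program, 2006AA010102)
National Key Technology R&D Program of China (2008BAI50B00)
National Key Basic Research Program of China (973 Program, 2004CB318106)
National Natural Science Foundation of China (Nos. 10874203, 60875014, 60535030)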
Keywords
system
audio segmentation
language model adaptation
sports game classification
keyword spotting, audio segmentation, acoustic model adaptation, sport game