Abstract
This paper proposes a systematic algorithm and framework that fuses multimodal information to detect highlight events in soccer video. The audio stream is first extracted from the video and classified using a CHMM. Then, based on temporal correspondence, shot-on-goal events are detected by combining goalmouth detection with slow-motion replay detection in the shots adjacent to those containing excited commentary or audience cheering; replays are located by detecting logos. Among the shot-on-goal events, goals are identified from the duration of the excited commentary and cheering, the duration of the replay, and the appearance of a score caption. Fouls are detected in the shots adjacent to a whistle, based on whether a slow-motion replay appears and how long it lasts. Experiments show that the proposed algorithm and framework are efficient.
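The fusion rules described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Shot` fields, the one-shot neighbourhood radius, and all three duration thresholds are hypothetical values chosen for illustration, and the per-shot audio labels, goalmouth flags, replay durations, and caption flags are assumed to come from upstream detectors (CHMM audio classification, logo-based replay detection) that are not shown.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Shot:
    start: float                 # shot start time in seconds
    end: float                   # shot end time in seconds
    audio_label: str             # "excited", "cheer", "whistle", or "other"
    has_goalmouth: bool = False  # goalmouth visible in this shot
    replay_len: float = 0.0      # seconds of slow-motion replay (0 if none)
    score_caption: bool = False  # score caption shown in this shot


EXCITED = ("excited", "cheer")


def neighbours(shots: List[Shot], i: int, radius: int = 1) -> List[Shot]:
    """Return shot i together with up to `radius` shots on each side."""
    return shots[max(0, i - radius): i + radius + 1]


def detect_events(
    shots: List[Shot],
    min_excited: float = 8.0,      # hypothetical threshold, seconds
    min_goal_replay: float = 6.0,  # hypothetical threshold, seconds
    min_foul_replay: float = 2.0,  # hypothetical threshold, seconds
) -> List[Tuple[int, str]]:
    events = []
    for i, shot in enumerate(shots):
        near = neighbours(shots, i)
        replay_len = max(s.replay_len for s in near)
        if shot.audio_label in EXCITED:
            # Shot on goal: goalmouth and a replay near the excited audio.
            if replay_len > 0 and any(s.has_goalmouth for s in near):
                excited_len = sum(
                    s.end - s.start for s in near if s.audio_label in EXCITED
                )
                caption = any(s.score_caption for s in near)
                is_goal = (
                    excited_len >= min_excited
                    and replay_len >= min_goal_replay
                    and caption
                )
                events.append((i, "goal" if is_goal else "shoot"))
        elif shot.audio_label == "whistle":
            # Foul: a sufficiently long replay near the whistle.
            if replay_len >= min_foul_replay:
                events.append((i, "foul"))
    return events


shots = [
    Shot(0, 5, "other"),
    Shot(5, 17, "excited", has_goalmouth=True),
    Shot(17, 25, "other", replay_len=8, score_caption=True),
    Shot(25, 30, "other"),
    Shot(30, 33, "whistle"),
    Shot(33, 38, "other", replay_len=4),
    Shot(38, 42, "other"),
    Shot(42, 46, "cheer", has_goalmouth=True),
    Shot(46, 50, "other", replay_len=3),
]
events = detect_events(shots)
```

With this synthetic shot list, shot 1 is labelled a goal, shot 4 a foul, and shot 7 a shot on goal without a score. A practical system would also need to merge detections raised by consecutive excited shots, which this sketch does not attempt.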
Source
Computer Science (《计算机科学》)
CSCD
PKU Core Journal (北大核心)
2010, Issue 7, pp. 273-276 (4 pages)
Funding
Supported by the Nanjing University of Science and Technology Science and Technology Development Fund (XKF09023)
Keywords
Multimodal fusion
Audio classification
Logo
Slow-motion replay
Goalmouth