Segmentation method of news video stories based on announcer identification

Cited by: 4
Abstract: Semantic unit segmentation of news video is an important step in content-based news video retrieval and information mining, and it has attracted the attention of many researchers. This paper proposes a new method for segmenting news video into story units based on announcer identification. First, the acoustic perceptual features of each announcer are extracted from news programs as that announcer's voiceprint, and a corresponding Gaussian mixture model (GMM) is trained for each announcer. The KL divergence method is then used to detect announcer and non-announcer audio shots among the video shots. Finally, caption-frame events in the video and knowledge of the special structure of news programs are combined to segment the program into story units. In experiments on more than 2 hours of CCTV and CNN news video, the method achieved a precision of 96.02% and a recall of 92.58%.
Source: Computer Engineering and Applications (《计算机工程与应用》), CSCD, Peking University Core Journal, 2008, No. 19, pp. 4-7 (4 pages)
Funding: National Natural Science Foundation of China (Grant No. 60243006); Doctoral Program Foundation of the Ministry of Education of China (No. 20069998022)
Keywords: announcer voiceprint; story unit segmentation; Gaussian mixture model; news video
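
The announcer-detection step described in the abstract (model each announcer's voiceprint, then use KL divergence to decide whether an audio shot belongs to an announcer) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a single diagonal-covariance Gaussian per speaker in place of a full GMM, because the KL divergence between two Gaussians has a closed form while the divergence between two GMMs does not. The function names, the `threshold` parameter, and the toy feature vectors are all hypothetical.

```python
import math

def fit_diag_gaussian(frames):
    """Estimate per-dimension mean and variance from a list of feature vectors."""
    n = len(frames)
    dim = len(frames[0])
    mean = [sum(f[d] for f in frames) / n for d in range(dim)]
    # A small floor keeps every variance strictly positive.
    var = [max(sum((f[d] - mean[d]) ** 2 for f in frames) / n, 1e-6)
           for d in range(dim)]
    return mean, var

def kl_diag_gaussian(p, q):
    """Closed-form KL(p || q) for two diagonal-covariance Gaussians."""
    (mean_p, var_p), (mean_q, var_q) = p, q
    return 0.5 * sum(
        math.log(vq / vp) + (vp + (mp - mq) ** 2) / vq - 1.0
        for mp, vp, mq, vq in zip(mean_p, var_p, mean_q, var_q)
    )

def symmetric_kl(p, q):
    """Symmetrised divergence, so the order of the two models does not matter."""
    return kl_diag_gaussian(p, q) + kl_diag_gaussian(q, p)

def label_shot(shot_frames, announcer_models, threshold):
    """Return the closest announcer's name, or None for a non-announcer shot.

    `threshold` is a hypothetical tuning parameter, not a value from the paper.
    """
    shot_model = fit_diag_gaussian(shot_frames)
    name, dist = min(
        ((n, symmetric_kl(shot_model, m)) for n, m in announcer_models.items()),
        key=lambda item: item[1],
    )
    return name if dist <= threshold else None

# Toy 2-D "acoustic features"; real features would come from the audio track.
announcer_frames = [[0.0, 0.0], [0.2, -0.2], [-0.2, 0.2]]
other_frames = [[5.0, 5.0], [5.2, 4.8], [4.8, 5.2]]
models = {"announcer_a": fit_diag_gaussian(announcer_frames)}

print(label_shot(announcer_frames, models, threshold=1.0))
print(label_shot(other_frames, models, threshold=1.0))
```

In this sketch, a shot whose features match a trained announcer model has a divergence near zero and takes that announcer's label. A shot that is far from every model exceeds the threshold and is treated as a non-announcer shot. According to the abstract, the paper then combines such labels with caption-frame events and news-program structure to place story boundaries.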