
HIDDEN MARKOV MODEL BASED AUDIO SEMANTIC RETRIEVAL (cited by: 10)
Abstract: As one component of multimedia, audio contains rich audiovisual semantic information. However, current multimedia retrieval relies mainly on visual information and largely ignores audio. To address this gap, this paper presents a prototype audio semantic retrieval system in which the audio stream is processed hierarchically. First, based on audio features such as short-time energy, zero-crossing rate and fundamental frequency energy ratio, the audio stream is coarsely segmented into four basic classes: silence, harmonic music, dialog and environmental sounds. Then a hidden Markov model (HMM) is used to perform fine-level segmentation of the environmental sounds, which carry many implied semantics. The trained HMMs are also used to represent each type of environmental sound for semantic retrieval. Experimental data show that this audio retrieval method works well.
Affiliation: Zhejiang University
Source: Pattern Recognition and Artificial Intelligence (《模式识别与人工智能》; indexed in EI, CSCD, PKU Core), 2001, No. 1, pp. 104-108 (5 pages)
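Funding: National Natural Science Foundation of China; Ministry of Education Fund for Outstanding Young Teachers; Funding Program for Key Teachers in Higher Education Institutions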
Keywords: hidden Markov model; audio semantic retrieval; audio signal processing; multimedia; hierarchical segmentation; audio retrieval
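
The abstract names two of the frame-level features used for coarse segmentation (short-time energy and zero-crossing rate) and says that a trained HMM represents each environmental sound class. The sketch below illustrates those ideas only; it is not the authors' implementation. The abstract does not give frame sizes, thresholds, the fundamental frequency energy ratio computation, or the HMM topology, so all function names, the discrete-observation HMM form, and the class labels in the usage note are assumptions made for illustration.

```python
import numpy as np

def frame_signal(x, frame_len, hop):
    """Split a 1-D signal into overlapping frames (the ragged tail is dropped)."""
    x = np.asarray(x, dtype=float)
    n = 1 + (len(x) - frame_len) // hop
    return np.stack([x[i * hop : i * hop + frame_len] for i in range(n)])

def short_time_energy(frames):
    """Mean squared amplitude of each frame."""
    return np.mean(frames ** 2, axis=1)

def zero_crossing_rate(frames):
    """Fraction of adjacent sample pairs in each frame whose signs differ."""
    signs = np.sign(frames)
    signs[signs == 0] = 1  # treat exact zeros as positive
    return np.mean(signs[:, 1:] != signs[:, :-1], axis=1)

def hmm_log_likelihood(obs, pi, A, B):
    """Scaled forward algorithm for a discrete-observation HMM (assumed form).

    pi: initial state probabilities, shape (S,)
    A:  state transition matrix, shape (S, S)
    B:  emission probabilities, shape (S, number_of_symbols)
    """
    alpha = pi * B[:, obs[0]]
    log_p = 0.0
    for t in range(len(obs)):
        if t > 0:
            alpha = (alpha @ A) * B[:, obs[t]]
        scale = alpha.sum()
        log_p += np.log(scale)   # accumulate log P(obs) from the scale factors
        alpha = alpha / scale    # renormalize to avoid numerical underflow
    return log_p

def best_matching_class(obs, models):
    """Return the class whose HMM (pi, A, B) gives obs the highest likelihood."""
    return max(models, key=lambda name: hmm_log_likelihood(obs, *models[name]))
```

As a usage sketch, a query clip would first be framed and reduced to feature values, those values quantized into discrete symbols (the quantization step is not shown and is an assumption here), and the symbol sequence passed to `best_matching_class` together with a dictionary such as `{"rain": (pi, A, B_rain), "traffic": (pi, A, B_traffic)}`, where the class names and parameters are hypothetical.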

