

Audio-based digital multimedia content analysis and its visualization
Abstract: To analyze audio and video content more effectively, information visualization methods are introduced into digital media information processing. A multimedia content visualization analysis platform is designed and implemented that integrates multimedia signal acquisition, large-vocabulary continuous speech recognition, text retrieval, and audio retrieval. The platform is reported to give fairly good experimental results; the work enriches information visualization theory and makes a useful attempt at applying it in practice.
Source: Journal of Yanshan University (《燕山大学学报》), CAS, 2010, No. 2, pp. 100-105 (6 pages)
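Funding: National Natural Science Foundation of China (60772076); Open Fund of the Ministry of Education-Microsoft Key Laboratory of Language and Speech (HIT.KLOF.2009015)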
Keywords: digital multimedia content; information visualization; speech recognition; text retrieval; audio retrieval








