
A Preliminary Study of the Relationship between Gesture and Speech Information in Spoken Chinese Dialogue

Relationships between gestures and speech in spontaneous Chinese speech
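Abstract: Information can be exchanged in many ways, of which speech and gesture are the most natural. Improving human-computer interaction therefore requires understanding how these two modalities relate to each other in conveying information during communication. This paper reviews theories and production models of the relationship between speech and gesture and, using video and audio data of natural conversation from television interview programs, presents a preliminary study of how speech and gesture information relate in Mandarin Chinese communication. On the basis of phonetic and gesture annotation, it analyzes the relationship between focal stress and gesture movements in spoken dialogue, and the relationship between prosodic boundaries and gesture boundaries. The study finds that stress in speech is often accompanied by stronger hand movements, and that hand and head movements complement each other at these points. Prosodic boundaries and gesture boundaries do not coincide in time but are strongly correlated. These results support theories that link speech and gestural expression.

Although humans communicate in various ways, the most natural expressions are related to speech and gestures. This paper describes a pilot study of the relationship between the two modalities of speech and gesture for Chinese spontaneous speech to improve the interactive capability of human computer interaction systems (HCI). The paper uses a speech and gesture production model with a multimodal coding scheme to annotate four video and audio clips. The speech stress is then correlated with the hand gesture ampli...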
Source: Journal of Tsinghua University (Science and Technology) (《清华大学学报(自然科学版)》), indexed in EI, CAS, CSCD, and the Peking University Core Journals list; 2008, Issue S1, pp. 627-634 (8 pages)
Funding: National "863" High-Tech Program (2006AA01Z138)
Keywords: spontaneous speech; gesture; speech; multimodal
