
An Improved Parameter-Controlled Visual Speech Synthesis Method (改进参数控制的可视语音合成方法)
Abstract: Traditional monophone-based methods handle coarticulation within a syllable and coarticulation between syllables in the same way. This paper analyzes how these two kinds of coarticulation affect visual speech synthesis differently and proposes an improved visual speech synthesis method based on monophone parameter control. For each syllable, the amplitude of the weight function at the viseme peak and its rate-of-change parameter are left unchanged for both vowels and consonants; only the time parameters of the vowel and consonant visemes are modified, so that the modified viseme parameters better simulate the dynamic behavior of the articulators during real syllable pronunciation. The feasibility of the method was validated in a practical application. Experimental results show that the improved method effectively addresses intra-syllable coarticulation and improves the quality of visual speech synthesis.
Authors: 刘学杰 (Liu Xuejie), 赵晖 (Zhao Hui)
Source: Computer Engineering and Design (《计算机工程与设计》), PKU Core Journal (北大核心), 2017, No. 4, pp. 989-995 (7 pages)
Funding: National Natural Science Foundation of China (61261037, 61561047)
Keywords: visual speech synthesis; monophone parameter control; Uyghur; viseme; coarticulation
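
The abstract names three kinds of viseme parameters (peak amplitude of the weight function, its rate of change, and timing) but does not give the weight function itself. As a minimal sketch, the code below assumes the exponential dominance function from Cohen and Massaro's coarticulation model, which is a common basis for parameter-controlled visual speech synthesis; whether the paper uses exactly this form is an assumption. The names `Viseme`, `dominance`, `blend`, and `shift_time`, and all numeric values, are hypothetical and chosen only for illustration.

```python
import math
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Viseme:
    target: float        # target value of one mouth parameter (e.g. lip opening)
    center: float        # time parameter: when the weight function peaks (seconds)
    alpha: float = 1.0   # amplitude of the weight function at its peak
    theta: float = 20.0  # rate of change (decay speed) of the weight function


def dominance(v: Viseme, t: float, c: float = 1.0) -> float:
    """Assumed weight function: alpha * exp(-theta * |t - center|^c)."""
    return v.alpha * math.exp(-v.theta * abs(t - v.center) ** c)


def blend(visemes: list[Viseme], t: float) -> float:
    """Dominance-weighted average of the viseme targets at time t."""
    weights = [dominance(v, t) for v in visemes]
    total = sum(weights)  # assumed non-zero: t lies near at least one viseme
    return sum(w * v.target for w, v in zip(weights, visemes)) / total


def shift_time(v: Viseme, delta: float) -> Viseme:
    """Change only the time parameter; alpha and theta are left untouched."""
    return replace(v, center=v.center + delta)


# Hypothetical consonant-vowel syllable: closed lips (0.0) then open vowel (1.0).
consonant = Viseme(target=0.0, center=0.10)
vowel = Viseme(target=1.0, center=0.30)

# Moving the vowel's peak earlier lets it influence the consonant sooner,
# which is one way a time-only adjustment can model intra-syllable coarticulation.
early_vowel = shift_time(vowel, -0.10)

before = blend([consonant, vowel], 0.20)
after = blend([consonant, early_vowel], 0.20)
print(f"lip opening at t=0.20 s: before={before:.3f}, after={after:.3f}")
```

Because `shift_time` touches only `center`, the shape of each weight curve stays the same and only the overlap between consonant and vowel changes, which mirrors the abstract's statement that amplitude and rate-of-change parameters are not modified.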
