Abstract
This paper presents a text-driven facial animation synthesis technique that integrates an emotion model to make facial expressions more expressive. The method has two core components: facial emotion simulation, and consistency between lip shape and speech. First, the input text is analyzed in depth to identify the types of emotion it contains and their intensities. These emotional cues are then used to generate the corresponding facial expressions with the three-dimensional free-form deformation algorithm (DFFD). In parallel, phoneme and lip-shape data are collected from human speech and matched in time to the phonemes of the text using forced alignment, which yields a sequence of changes in the lip key points. Intermediate frames are then generated by linear interpolation to refine the time series of lip movements. Finally, the DFFD algorithm synthesizes the lip animation from this time series. By carefully weighting the facial emotion against the lip animation, the approach produces highly realistic virtual facial expression animation. The work addresses the problem of missing information in text-driven facial expression synthesis, overcomes the problems of monotonous expressions and of mismatch between facial expression and lip shape, and offers a new option for applications in human-computer interaction, game development, and film and television production.
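The interpolation and weighting steps described in the abstract can be illustrated with a minimal sketch. It assumes lip key points are 2D coordinates attached to the timestamps produced by forced alignment, and that the emotion and lip components are combined as a weighted sum of per-point displacements. The function names, data layout, and weights are hypothetical and are not taken from the paper, and the DFFD deformation itself is not shown.

```python
from bisect import bisect_right

def interpolate_keypoints(keyframes, t):
    """Linearly interpolate lip key points at time t.

    keyframes: list of (time_in_seconds, [(x, y), ...]) sorted by time.
    This layout is an assumption made for illustration.
    """
    times = [time for time, _ in keyframes]
    # Clamp to the first or last keyframe outside the covered range.
    if t <= times[0]:
        return list(keyframes[0][1])
    if t >= times[-1]:
        return list(keyframes[-1][1])
    # Find the pair of keyframes that bracket t.
    i = bisect_right(times, t)
    t0, p0 = keyframes[i - 1]
    t1, p1 = keyframes[i]
    a = (t - t0) / (t1 - t0)
    return [((1 - a) * x0 + a * x1, (1 - a) * y0 + a * y1)
            for (x0, y0), (x1, y1) in zip(p0, p1)]

def blend_offsets(emotion_offsets, lip_offsets, w_emotion, w_lip):
    """Weighted sum of emotion and lip displacements per control point.

    The weighted-sum form and the weights are assumptions, not the
    paper's stated formula.
    """
    return [(w_emotion * ex + w_lip * lx, w_emotion * ey + w_lip * ly)
            for (ex, ey), (lx, ly) in zip(emotion_offsets, lip_offsets)]

# Two keyframes for a single lip key point, one second apart.
keyframes = [(0.0, [(0.0, 0.0)]), (1.0, [(2.0, 4.0)])]
midpoint = interpolate_keypoints(keyframes, 0.5)
blended = blend_offsets([(1.0, 0.0)], [(0.0, 2.0)], 0.25, 0.75)
```

In a full pipeline, the interpolated and blended displacements would be passed to the deformation step as control-point motion. How the paper actually sets the weights is not stated in the abstract.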
Authors
刘增科
殷继彬
LIU Zengke; YIN Jibin (Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, China)
Source
《计算机科学》
CSCD
Peking University Core Journal (北大核心)
2024, Issue S02, pp. 313-320 (8 pages)
Computer Science