
Virtual-Human "Shuanghuang" Duet: Research on 3D Facial Animation Synchronized with Speech (Cited by: 1)

COLLABORATE TWO-VIRTUAL HUMAN SHOW—ON 3D FACIAL ANIMATION SYNCHRONISED WITH SPEECH
Abstract: To synthesize facial speech animation effectively, we propose a method for generating speech-synchronized 3D mouth-shape and facial-expression animation. First, based on the anatomy of facial movement, we build a 3D face control model that combines a muscle model with a differential-geometry model; a data structure drives both models to produce facial movement, realizing the full range of mouth shapes and expression changes. Then, taking full account of the characteristics of Mandarin pronunciation, we present a geometrically described coarticulation model that conforms to Chinese pronunciation habits, from which realistic 3D facial speech animation is generated. Simulation results and comparative experiments show that the method yields 3D mouth-shape animation consistent with Chinese pronunciation habits, and that the synthesized facial expressions are natural and lifelike.
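The muscle-model component described in the abstract is in the tradition of Waters' classic linear facial muscle, in which vertices inside a cone of influence are pulled toward the muscle's bone-attachment point with angular and radial falloff. The sketch below is an illustrative reconstruction of that general technique, not the paper's exact formulation; the falloff bounds and parameter names are assumptions.

```python
import numpy as np

def linear_muscle(vertices, attach, insert, contraction,
                  influence_angle=np.pi / 4, fall_start=0.3, fall_end=1.0):
    """Waters-style linear muscle deformer (illustrative sketch).

    attach      -- fixed bone-attachment point of the muscle
    insert      -- skin insertion point; attach->insert is the muscle axis
    contraction -- in [0, 1]; fraction by which affected vertices are
                   pulled toward `attach`
    fall_start, fall_end -- radial falloff bounds, as fractions of the
                   muscle length (assumed values, not from the paper)
    """
    axis = insert - attach
    length = np.linalg.norm(axis)
    direction = axis / length
    out = vertices.copy()
    for k, p in enumerate(vertices):
        v = p - attach
        d = np.linalg.norm(v)
        if d == 0 or d > fall_end * length:
            continue  # outside the muscle's radial zone of influence
        cos_theta = np.dot(v, direction) / d
        theta = np.arccos(np.clip(cos_theta, -1.0, 1.0))
        if theta > influence_angle:
            continue  # outside the angular cone of influence
        # angular falloff: full pull on the axis, zero at the cone edge
        ang = np.cos(theta * (np.pi / 2) / influence_angle)
        # radial falloff: full pull near the attachment, fading to fall_end
        if d <= fall_start * length:
            rad = 1.0
        else:
            rad = 1.0 - (d - fall_start * length) / ((fall_end - fall_start) * length)
        out[k] = p + contraction * ang * rad * (attach - p)
    return out
```

With several such muscles (e.g. zygomatic major for smiling, orbicularis oris around the mouth), mouth shapes and expressions are produced by setting per-muscle contraction values, which matches the abstract's idea of driving the model through a data structure of control parameters.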
Source: Computer Applications and Software (《计算机应用与软件》, CSCD), 2015, No. 8, pp. 145-149, 173 (6 pages).
Funding: National Natural Science Foundation of China General Program (61371165); Open project of the State Key Laboratory for Novel Software Technology (KFKT2013B22); Open project of the State Key Lab of CAD&CG, Zhejiang University (A1416); Sichuan Animation Research Center 2012 research project (DM201204).
Keywords: muscle model; differential geometry; coarticulation modelling; speech animation.
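The paper's coarticulation model is its own geometric construction tailored to Mandarin; as a point of reference, the standard dominance-function approach to coarticulation (Cohen and Massaro) can be sketched as below. Each viseme contributes a target mouth parameter weighted by a dominance function that decays away from the viseme's center time, so neighboring sounds blend smoothly. The rate and magnitude parameters are illustrative assumptions.

```python
import math

def dominance(t, center, magnitude=1.0, rate=6.0):
    """Negative-exponential dominance of a viseme at time t.

    Peaks at the viseme's center time and decays on both sides;
    `magnitude` and `rate` are assumed illustrative values.
    """
    return magnitude * math.exp(-rate * abs(t - center))

def blend_visemes(t, segments):
    """Dominance-weighted blend of viseme targets at time t.

    segments -- list of (center_time, target_value) pairs, one per
                viseme; target_value is e.g. a mouth-opening parameter.
    """
    num = den = 0.0
    for center, target in segments:
        d = dominance(t, center)
        num += d * target
        den += d
    return num / den if den else 0.0
```

Because every viseme's dominance is nonzero everywhere, the mouth parameter at any instant is influenced by its neighbors, which is what makes transitions look coarticulated rather than switching abruptly between static mouth shapes.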
