With the continuous advancement of computer technology, virtual simulation systems have been further optimized and improved, and are now widely applied across many fields of social development, such as urban construction, interior design, industrial simulation, and tourism teaching. China's three-dimensional animation production started relatively late, but it has achieved good results with the support of related advanced technologies. Computer virtual simulation technology is an important technical support for three-dimensional animation production. This paper first introduces computer virtual simulation technology and then elaborates on its specific applications in three-dimensional animation production, providing a reference for improving the quality of three-dimensional animation in the future.
Driving facial animation from tens of tracked markers is a challenging task due to the complex topology and non-rigid nature of human faces. We propose a solution named manifold Bayesian regression. First, a novel distance metric, the geodesic manifold distance, is introduced to replace the Euclidean distance. Facial animation can then be formulated as a sparse warping-kernel regression problem in which the geodesic manifold distance models the topology and discontinuities of the face models. The geodesic manifold distance can be adopted in traditional regression methods, e.g., radial basis functions, without much tuning. We put facial animation into the framework of Bayesian regression, which provides an elegant way of dealing with noise and uncertainty. After the covariance matrix is properly modulated, hybrid Monte Carlo is used to approximate the integration of probabilities and obtain deformation results. Experimental results show that our algorithm can robustly produce facial animation with large motions and complex face models.
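The geodesic-distance idea above can be sketched concretely: approximate geodesic manifold distances by shortest paths over the mesh edge graph, then evaluate a Gaussian RBF kernel on them in place of Euclidean distances. This is a minimal illustration, not the paper's implementation; the edge-graph approximation and the Gaussian kernel choice are assumptions.

```python
import heapq
import numpy as np

def geodesic_distances(n_vertices, edges, source):
    """Dijkstra over the mesh edge graph -- an approximation of the
    geodesic manifold distance (the paper works on the face surface;
    the edge-graph shortcut is a simplification for this sketch)."""
    adj = [[] for _ in range(n_vertices)]
    for i, j, w in edges:
        adj[i].append((j, w))
        adj[j].append((i, w))
    dist = [float("inf")] * n_vertices
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue
        for v, w in adj[u]:
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return np.array(dist)

def geodesic_rbf_kernel(pairwise_geodesic, sigma=1.0):
    """Gaussian RBF kernel evaluated on geodesic rather than
    Euclidean distances, as the abstract suggests."""
    return np.exp(-(pairwise_geodesic ** 2) / (2.0 * sigma ** 2))
```

Because the kernel only consumes a distance matrix, swapping the metric requires no change to the downstream regression code, which is the "without much tuning" point made above.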
BACKGROUND: Lateral facial clefts are atypical, with a low incidence in the facial cleft spectrum. With the development of ultrasonography (US) prenatal screening, such facial malformations can be detected and diagnosed prenatally rather than at birth. Although three-dimensional US (3DUS) can render the fetal face via 3D reconstruction, the 3D images are displayed on two-dimensional screens without depth of field, which impedes the understanding of untrained individuals. In contrast, a 3D-printed model of the fetal face is more interactive and helps both parents and doctors develop a more comprehensive understanding of the facial malformation. Herein, we present an isolated lateral facial cleft case that was diagnosed via US combined with a 3D-printed model. CASE SUMMARY: A 31-year-old G2P1 patient presented for routine prenatal screening at the 22nd week of gestation. The coronal nostril-lip section on two-dimensional US (2DUS) demonstrated that the fetus's bilateral oral commissures were asymmetrical and the left oral commissure was abnormally wide. The left oblique-coronal section showed a cleft at the left oral commissure that extended to the left cheek. The results of 3DUS confirmed the cleft. Furthermore, we created a model of the fetal face using 3D printing, which clearly presented the facial malformations. The fetus was diagnosed with a left lateral facial cleft, categorized as a No. 7 facial cleft according to the Tessier facial cleft classification. The parents terminated the pregnancy at the 24th week of gestation after parental counseling. CONCLUSION: In the diagnostic course of the current case, in addition to the traditional application of 2DUS and 3DUS, we created a 3D-printed model of the fetus, which enhanced the diagnostic evidence, benefited the education of junior doctors, improved parental counseling, and had the potential to guide surgical planning.
Based on motion capture, a semi-automatic technique for fast facial animation was implemented. While capturing facial expressions from a performer, a camera was used to record his/her front face as a texture map. The radial basis function (RBF) technique was utilized to deform a generic facial model, and the texture was remapped to generate a personalized face. The personalized face was partitioned into three regions, and using the captured facial expression data, the RBF, the Laplacian operator, and mean-value coordinates were applied to deform each region respectively. With shape blending, the three regions were combined to construct the final face model. Our results show that the technique is efficient in generating realistic facial animation.
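The RBF deformation step described above can be sketched as follows: given marker positions on the generic model and their captured targets, solve for Gaussian-RBF weights and warp every vertex. This is a minimal sketch under assumed choices; the kernel width and the tiny regularization term are illustrative, not the paper's exact formulation.

```python
import numpy as np

def rbf_deform(points, src_markers, dst_markers, sigma=0.5):
    """Deform `points` so that src_markers map (approximately) onto
    dst_markers, using Gaussian radial basis functions centered on
    the source markers."""
    # Pairwise distances between markers and the resulting kernel matrix.
    d = np.linalg.norm(src_markers[:, None, :] - src_markers[None, :, :], axis=-1)
    K = np.exp(-(d ** 2) / (2.0 * sigma ** 2))
    # Solve for per-marker weight vectors (small ridge term for stability).
    W = np.linalg.solve(K + 1e-9 * np.eye(len(src_markers)),
                        dst_markers - src_markers)
    # Evaluate the warp at every vertex of the mesh.
    d_pts = np.linalg.norm(points[:, None, :] - src_markers[None, :, :], axis=-1)
    return points + np.exp(-(d_pts ** 2) / (2.0 * sigma ** 2)) @ W
```

Evaluated at the markers themselves, the warp reproduces the targets almost exactly, which is the interpolation property the technique relies on.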
An emotion model is the basis of a facial expression recognition system. The constructed emotion model should not only match facial expressions with emotions but also reflect the positional relationships between different emotions. In this way, the current emotion of an individual can be readily understood through analysis of the acquired facial expression information. This paper constructs an improved three-dimensional emotion model based on fuzzy theory, which maps facial features to emotions based on the basic emotions proposed by Ekman. Moreover, the three-dimensional emotion model divides every emotion into three groups, which shows the positional relationships visually and quantitatively while determining the degree of emotion based on fuzzy theory.
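The fuzzy grouping of one emotion into three degrees might look like the following toy sketch, using triangular membership functions over an intensity axis; the breakpoints and group names are illustrative assumptions, not taken from the paper.

```python
def triangular(x, a, b, c):
    """Standard triangular fuzzy membership function with feet at
    a and c and peak at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def emotion_degree(intensity):
    """Split a single emotion into three overlapping fuzzy groups
    (weak / medium / strong); the degree of membership in each group
    quantifies how strongly the emotion is expressed."""
    return {
        "weak": triangular(intensity, -0.5, 0.0, 0.5),
        "medium": triangular(intensity, 0.0, 0.5, 1.0),
        "strong": triangular(intensity, 0.5, 1.0, 1.5),
    }
```

An intensity between two breakpoints receives partial membership in two adjacent groups, which is exactly what lets the model determine a graded degree of emotion rather than a hard label.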
Introduction: Radiotherapy is often used to treat head and neck malignancies, with inevitable effects on the surrounding healthy tissues. We have reviewed the literature concerning the experimental irradiation of facial bones in animals. Materials and Methods: A PubMed search was performed to retrieve animal experiments on the irradiation of facial bones published between January 1992 and January 2012. The search terms were "irradiation facial bone" and "irradiation osteoradionecrosis". Results: Thirty-six publications were included. The irradiation sources were cobalt-60, orthovoltage, 4-6 megavolt photons, and brachytherapy. The total dose varied between 8 and 60 Gy in single or multiple fractions. The literature presents a broad range of animal studies that differ in terms of the in vivo model, irradiation, observation period, and evaluation of results. Discussion: The different animal models used leave many questions unanswered. A detailed and standardized description of the methodology and results would facilitate the comparability of future studies.
Facial expression recognition (FER) has been an interesting area of research wherever there is human-computer interaction. Human psychology, emotions, and behaviors can be analyzed through FER. Classifiers used in FER have performed well on unoccluded faces but have been found to be constrained on occluded faces. Recently, deep learning techniques (DLT) have gained popularity in real-world applications, including the recognition of human emotions. The human face reflects emotional states and human intentions, and an expression is the most natural and powerful way of communicating non-verbally. Systems that mediate such communication are termed human-machine interaction (HMI) systems, and FER can improve them because human expressions convey useful information to an observer. This paper proposes a FER scheme called EECNN (Enhanced Convolution Neural Network with Attention mechanism) to recognize seven types of human emotions, with satisfying experimental results: the proposed EECNN achieved 89.8% accuracy in classifying the images.
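The abstract does not spell out EECNN's attention mechanism, but a squeeze-and-excitation style channel attention, shown below as a hand-rolled NumPy sketch, illustrates the general idea of letting the network reweight its own feature maps; the weight matrices `w1` and `w2` would normally be learned.

```python
import numpy as np

def channel_attention(feature_maps, w1, w2):
    """Squeeze-and-excitation style channel attention over CNN
    feature maps of shape (C, H, W). This is an illustrative stand-in
    for the attention block named above, not the paper's exact layer."""
    squeeze = feature_maps.mean(axis=(1, 2))        # global average pool -> (C,)
    hidden = np.maximum(0.0, w1 @ squeeze)          # ReLU bottleneck
    scale = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))    # sigmoid gate in (0, 1) -> (C,)
    return feature_maps * scale[:, None, None]      # reweight each channel
```

Channels whose gate value is near one pass through almost unchanged, while weakly gated channels are suppressed, which is how attention focuses the classifier on expression-relevant features.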
To generate realistic three-dimensional animation of a virtual character, capturing real facial expressions is the primary task. Owing to diverse facial expressions and complex backgrounds, facial landmarks recognized by existing strategies suffer from deviations and low accuracy. Therefore, this paper proposes a facial expression capture method based on a two-stage neural network that combines an improved multi-task cascaded convolutional network (MTCNN) with a high-resolution network. First, the convolution operations of the traditional MTCNN are improved: face information in the input image is quickly filtered by feature fusion in the first stage, and Octave Convolution replaces the original convolutions in the second stage to enhance the feature extraction ability of the network, further rejecting a large number of false candidates. The model outputs more accurate face candidate windows for better landmark recognition and face localization. The images cropped after face detection are then input into the high-resolution network, where multi-scale feature fusion is realized by parallel connection of multi-resolution streams, yielding rich high-resolution heatmaps of facial landmarks. Finally, the recognized facial landmarks are tracked in real time; the expression parameters are extracted and transmitted to the Unity3D engine to drive the virtual character's face, realizing synchronized facial expression animation. Extensive experimental results on the WFLW database demonstrate the superiority of the proposed method in terms of accuracy and robustness, especially for diverse expressions and complex backgrounds. The method can accurately capture facial expressions and generate three-dimensional animation effects, making online entertainment and social interaction more immersive in shared virtual spaces.
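The landmark heatmaps produced by a high-resolution network are typically decoded back to coordinates with an argmax per heatmap; a minimal sketch of that decoding step (sub-pixel refinement and the real-time tracking are omitted):

```python
import numpy as np

def decode_landmarks(heatmaps):
    """Recover integer (x, y) landmark coordinates from a stack of
    per-landmark heatmaps of shape (N, H, W). Argmax decoding is the
    simplest variant; production systems usually add sub-pixel
    refinement around the peak."""
    n, h, w = heatmaps.shape
    flat = heatmaps.reshape(n, -1).argmax(axis=1)
    ys, xs = np.unravel_index(flat, (h, w))
    return np.stack([xs, ys], axis=1)
```

The decoded coordinates are what a driver script would convert into expression parameters before handing them to the game engine.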
Facial expression recognition (FER) has numerous applications in computer security, neuroscience, psychology, and engineering. Owing to its non-intrusiveness, it is considered a useful technology for combating crime. However, FER is plagued by several challenges, the most serious of which is its poor prediction accuracy under severe head poses. The aim of this study, therefore, is to improve recognition accuracy under severe head poses by proposing a robust 3D head-tracking algorithm based on an ellipsoidal model, an advanced ensemble of AdaBoost, and a saturated vector machine (SVM). The FER features are tracked from one frame to the next using the ellipsoidal tracking model, and the visible expressive facial key points are extracted using Gabor filters. The ensemble algorithm (Ada-AdaSVM) is then used for feature selection and classification. The proposed technique is evaluated using the Bosphorus, BU-3DFE, MMI, CK+, and BP4D-Spontaneous facial expression databases, and the overall performance is outstanding.
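The Gabor filtering step can be illustrated with a standard real-valued Gabor kernel, an oriented sinusoid under a Gaussian envelope; the parameter defaults below are illustrative, not the study's settings.

```python
import numpy as np

def gabor_kernel(size=9, sigma=2.0, theta=0.0, lam=4.0, psi=0.0, gamma=0.5):
    """Real part of a 2-D Gabor filter of the kind used to extract
    expressive key-point features: a Gaussian envelope modulating a
    cosine carrier at orientation `theta` and wavelength `lam`."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    xr = x * np.cos(theta) + y * np.sin(theta)      # rotate coordinates
    yr = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(xr ** 2 + gamma ** 2 * yr ** 2) / (2.0 * sigma ** 2))
    carrier = np.cos(2.0 * np.pi * xr / lam + psi)
    return envelope * carrier
```

A bank of such kernels at several orientations and scales, convolved at each key point, yields the feature vectors that the ensemble classifier then selects from.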
As a key link in human-computer interaction, emotion recognition enables robots to correctly perceive user emotions and provide dynamic, adjustable services according to the emotional needs of different users, which is key to improving the cognitive level of robot services. Emotion recognition based on facial expressions and the electrocardiogram has numerous industrial applications. First, a three-dimensional convolutional neural network deep learning architecture is utilized to extract spatial and temporal features from facial expression video data and electrocardiogram (ECG) data, and emotion classification is carried out. The two modalities are then fused at the data level and the decision level, respectively, and the emotion recognition results are given. Finally, the single-modality and multi-modality emotion recognition results are compared and analyzed. The comparative analysis under the two fusion methods shows that the accuracy of multi-modal emotion recognition is greatly improved compared with single-modal emotion recognition, and that decision-level fusion is easier to operate and more effective than data-level fusion.
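Decision-level fusion of the kind compared above can be as simple as combining each modality's class posteriors after independent classification; the weighted average below is one common variant, and the fusion weight is an illustrative choice (data-level fusion would instead concatenate the raw features before a single classifier).

```python
import numpy as np

def decision_level_fusion(prob_face, prob_ecg, w_face=0.5):
    """Fuse two modalities at the decision level by a weighted
    average of their class probability vectors, renormalized so the
    fused vector is again a distribution."""
    fused = w_face * prob_face + (1.0 - w_face) * prob_ecg
    return fused / fused.sum()
```

Part of why decision-level fusion is "easier to operate" is visible here: each modality's classifier can be trained and tuned independently, and only these small probability vectors need to be aligned.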
To address image distortion and limited stylistic variety in existing anime style-transfer networks for image simulation, we propose TGFE-TrebleStyleGAN (text-guided facial editing with TrebleStyleGAN), a framework for anime face style transfer and editing. Latent-space vectors guide the generation of face images, and a detail control module and a feature control module designed in TrebleStyleGAN constrain the appearance of the generated images. The images produced by the transfer network serve not only as style control signals but also as constraints on the editing regions after fine-grained segmentation. Text-to-image generation is introduced to capture the correlation between style-transferred images and semantic information. Experiments on an open-source dataset and a self-built anime face dataset with paired labels show that, compared with the baseline DualStyleGAN, the proposed model reduces FID by 2.819 and improves SSIM and NIMA by 0.028 and 0.074, respectively. The integrated style-transfer-and-editing approach preserves the detailed anime-face style during generation while providing flexible editing, reduces image distortion, and performs better in the consistency of generated image features and the style similarity of anime face images.
In this paper, we present an efficient algorithm that generates lip-synchronized facial animation from a given vocal audio clip. By combining spectral-dimensional bidirectional long short-term memory and a temporal attention mechanism, we design a lightweight speech encoder that learns useful and robust vocal features from the input audio without resorting to pre-trained speech recognition modules or large training data. To learn subject-independent facial motion, we use deformation gradients as the internal representation, which allows nuanced local motions to be synthesized better than with vertex offsets. Compared with state-of-the-art automatic-speech-recognition-based methods, our model is much smaller but achieves similar robustness and quality most of the time, and noticeably better results in certain challenging cases.
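A deformation gradient, the internal representation mentioned above, is the linear map carrying a mesh element's rest-pose edge vectors to their deformed counterparts; a minimal 2D-triangle sketch (the paper itself works on 3D face meshes, so this is a simplified illustration):

```python
import numpy as np

def deformation_gradient(rest_tri, deformed_tri):
    """Per-triangle deformation gradient F with Ds = F @ Dm, where
    Dm and Ds stack the rest-pose and deformed edge vectors as
    columns (2-D triangles for brevity)."""
    dm = np.column_stack([rest_tri[1] - rest_tri[0], rest_tri[2] - rest_tri[0]])
    ds = np.column_stack([deformed_tri[1] - deformed_tri[0], deformed_tri[2] - deformed_tri[0]])
    return ds @ np.linalg.inv(dm)
```

Unlike raw vertex offsets, F is invariant to translation (a translated triangle yields the identity), which is one reason the representation transfers motion between subjects more gracefully.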
In this paper, a facial animation system is proposed for simultaneously capturing both geometrical information and illumination changes of surface details, called expression details, from video clips; the captured data can be widely applied to different 2D face images and 3D face models. While tracking the geometric data, we record the expression details as ratio images. For 2D facial animation synthesis, these ratio images are used to generate dynamic textures. Because a ratio image is obtained by dividing the colors of an expressive face by those of a neutral face, pixels with a ratio value smaller than one are where a wrinkle or crease appears. Therefore, the gradients of the ratio value at each pixel in the ratio images are regarded as changes of the face surface, and the original surface normals can be adjusted according to these gradients. Based on this idea, we can convert the ratio images into a sequence of normal maps and then apply them to animated 3D model rendering. With this expression detail mapping, the resulting facial animations are more lifelike and expressive.
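The ratio-image and normal-adjustment steps can be sketched as follows; the particular gradient-to-normal blending used here is a schematic assumption, not the paper's exact mapping.

```python
import numpy as np

def ratio_image(expressive, neutral, eps=1e-6):
    """Per-pixel ratio of an expressive face image to the neutral
    face image; values below one mark wrinkles or creases."""
    return expressive / np.maximum(neutral, eps)

def perturb_normals(normals, ratio):
    """Tilt per-pixel surface normals (H, W, 3) by the image-space
    gradient of the ratio image, then renormalize -- a schematic
    version of converting a ratio image into a normal map."""
    gy, gx = np.gradient(ratio)
    bumped = normals + np.stack([-gx, -gy, np.zeros_like(ratio)], axis=-1)
    return bumped / np.linalg.norm(bumped, axis=-1, keepdims=True)
```

Where the ratio image is flat (no wrinkle), the gradient vanishes and the normals are left untouched; only creased regions perturb the shading.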
To synthesize real-time, realistic facial animation, we present an effective algorithm that combines image- and geometry-based methods for facial animation simulation. Considering the numerous motion units in the expression coding system, we present a novel simplified motion unit based on the basic facial expressions, and construct the corresponding basic actions for a head model. As image features are difficult to obtain with the performance-driven method, we develop an automatic image feature recognition method based on statistical learning, and a semi-automatic expression image labeling method with rotation-invariant face detection, which improve the accuracy and efficiency of expression feature identification and training. After facial animation retargeting, each basic action weight is computed and mapped automatically. We apply the blend shape method to construct and train the corresponding expression database for each basic action, and adopt the least squares method to compute the corresponding control parameters for facial animation. Moreover, diffuse and specular light distributions are pre-integrated using a physically based method to improve the plausibility and efficiency of facial rendering. Our work simplifies the facial motion unit, optimizes the statistical training and recognition processes for facial animation, solves for the expression parameters, and simulates the subsurface scattering effect in real time. Experimental results indicate that our method is effective, efficient, and suitable for computer animation and interactive applications.
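The least-squares solve for the control parameters can be illustrated as a standard blend-shape weight fit; the matrix layout below (one column per basic action, flattened vertex coordinates per row) is an assumption for the sketch, not the paper's stated formulation.

```python
import numpy as np

def solve_blendshape_weights(basis, target, neutral):
    """Least-squares fit of blend-shape weights w such that
    neutral + basis @ w approximates the target expression, where
    `basis` holds one basic-action displacement per column."""
    w, *_ = np.linalg.lstsq(basis, target - neutral, rcond=None)
    return w
```

In practice the solve runs once per captured frame, and the recovered weights are exactly the control parameters that drive the rigged face.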
Funding (manifold Bayesian regression paper): supported by the National Natural Science Foundation of China (No. 60272031), the National Basic Research Program (973) of China (No. 2002CB312101), and the Technology Plan Program of Zhejiang Province (No. 2003C21010), China.
Funding (motion-capture facial animation paper): Youth Foundation of Higher Education Scientific Research of Hebei Province, China (No. 2010228); Foundation for Returned Overseas Scholars of Hebei Province, China (No. C2013003015).
Funding (emotion model paper): supported by the National Natural Science Foundation of China (61303150, 61472393), the China Postdoctoral Science Foundation (2012M521248), and the Anhui Province Innovative Funds on Intelligent Speech Technology and Industrialization (13Z02008).
Funding (facial expression capture paper): funded by the College Student Innovation and Entrepreneurship Training Program (grant numbers 2021055Z and S202110082031) and the Special Project for Cultivating Scientific and Technological Innovation Ability of College and Middle School Students in Hebei Province (grant number 2021H011404).
Funding (emotion recognition fusion paper): supported by the Open Funding Project of the National Key Laboratory of Human Factors Engineering (Grant No. 6142222190309).
Funding (real-time facial animation paper): supported by the 2013 Annual Beijing Technological and Cultural Fusion for Demonstrated Base Construction and Industrial Nurture (No. Z131100000113007) and the National Natural Science Foundation of China (Nos. 61202324, 61271431, and 61271430).