基于纹理与几何解耦的说话人视频连续情感编辑模型

A continuous emotional editing model for talking head videos based on decoupling texture and geometry
Abstract The emotional editing of talking head videos is an active research topic in computer vision and computer graphics. It aims to convert a talking video of a person with neutral emotion into a talking video with a target emotion. Existing methods cannot simultaneously achieve high-resolution emotional editing, preservation of the 3D properties of the human face, and applicability to different target persons. To meet these requirements together, we propose a geometric editing network conditioned on the Basel face model (BFM) as our geometric emotion editing module, which keeps geometric editing general across different target persons. We further propose a textural emotion editing module based on a subject classifier, which allows fine-texture editing to transfer to multiperson tasks. The proposed method overcomes the limitations of previous emotional editing models, which either apply only to a specific target person or generate low-quality results in multiperson settings. The model can also continuously control the intensity of emotional editing. Experiments show that our general emotional editing model outperforms existing multiperson emotional editing methods in clarity, identity preservation, and emotional editing quality on multiperson tasks, produces natural emotional editing on target persons not seen in the training set, and obtains reasonable results even on talking videos with unseen head poses.
Authors Tian LV (吕天), Yu-Hui WEN (温玉辉), Zhiyao SUN (孙志尧), Yong-Jin LIU (刘永进). Affiliations: Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China; School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044, China; Beijing Key Laboratory of Traffic Data Analysis and Mining, Beijing 100044, China
Source Scientia Sinica (Informationis) (《中国科学:信息科学》), indexed in CSCD and the Peking University Core Journals list, 2023, No. 12, pp. 2423-2439 (17 pages)
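Funding Supported by the National Natural Science Foundation of China (Grant No. 62202257) and the China Postdoctoral Science Foundation (Grant No. 2021M701891).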
Keywords emotional editing; 3D reconstruction; deep learning; computer vision; neural network