
Zero-shot text-driven avatar generation based on a depth-conditioned diffusion model
Abstract: Avatar generation holds significant implications for fields such as virtual reality and film production. To address the large data requirements and production costs of existing avatar generation methods, we propose a zero-shot, text-driven 3D avatar generation method based on a depth-conditioned diffusion model, comprising two stages: conditional human body generation and iterative texture refinement. In the first stage, a neural network initializes an implicit representation of the 3D human body; a text-prompted, depth-conditioned diffusion model then guides the neural implicit field to generate the avatar the user requests. In the second stage, the diffusion model performs denoising to infer high-precision texture maps from the texture prior provided by the first-stage body model, iteratively refining the avatar's texture representation to produce the final result. With this method, users can create a vivid avatar from an arbitrary text description without any reference photographs. Experimental results show that the method generates realistic, high-quality, and vivid avatars under given text prompts.
Authors: WANG Ji; WANG Sen; JIANG Zhi-wen; XIE Zhi-feng; LI Meng-tian (Department of Film and Television Engineering, Shanghai University, Shanghai 200072, China; Shanghai Film Special Effects Engineering Technology Research Center, Shanghai 200072, China)
Source: Journal of Graphics (《图学学报》), CSCD, Peking University Core, 2023, No. 6, pp. 1218-1226 (9 pages)
Keywords: diffusion model; avatar generation; zero-shot; text-driven generation; deep learning
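The first stage described in the abstract (a diffusion model guiding a neural implicit field) is commonly realized with score-distillation-style updates: render a view, add noise, ask the conditioned denoiser to predict that noise, and push the representation in the direction the denoiser prefers. The toy sketch below illustrates only that update rule under loose assumptions; the `toy_denoiser` stand-in, the noise schedule, and all names are illustrative inventions, not the authors' implementation (a real system would use a depth-conditioned U-Net and differentiable rendering of the implicit field).

```python
import numpy as np

rng = np.random.default_rng(0)

def toy_denoiser(noisy_img, t, depth_map):
    # Stand-in for a depth-conditioned diffusion model: returns a noise
    # prediction biased by the depth condition. A real model is a U-Net
    # conditioned on the text prompt and the rendered depth map.
    return noisy_img - 0.5 * depth_map

def sds_step(rendered, depth_map, lr=0.1):
    # One score-distillation update: noise the render at a random timestep,
    # get the denoiser's noise prediction, and step against the residual
    # (eps_pred - eps), weighted by a simple schedule term.
    t = int(rng.integers(1, 1000))
    eps = rng.standard_normal(rendered.shape)
    alpha = 1.0 - t / 1000.0                       # toy noise schedule
    noisy = np.sqrt(alpha) * rendered + np.sqrt(1.0 - alpha) * eps
    eps_pred = toy_denoiser(noisy, t, depth_map)
    grad = (1.0 - alpha) * (eps_pred - eps)        # SDS-style gradient
    return rendered - lr * grad

render = rng.standard_normal((8, 8))  # stand-in for one rendered view
depth = np.ones((8, 8))               # stand-in for its rendered depth map
for _ in range(100):
    render = sds_step(render, depth)
```

In the full method, `rendered` would be produced by the implicit human-body representation from a sampled camera, and the gradient would flow back into the network weights rather than the pixels; the second stage then applies the same diffusion prior to refine texture maps instead of geometry.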