MusicFace: Music-driven expressive singing face synthesis

Abstract: It remains an interesting and challenging problem to synthesize a vivid and realistic singing face driven by music. In this paper, we present a method for this task with natural motions for the lips, facial expression, head pose, and eyes. Due to the coupling of mixed information for the human voice and backing music in common music audio signals, we design a decouple-and-fuse strategy to tackle the challenge. We first decompose the input music audio into a human voice stream and a backing music stream. Due to the implicit and complicated correlation between the two-stream input signals and the dynamics of the facial expressions, head motions, and eye states, we model their relationship with an attention scheme, where the effects of the two streams are fused seamlessly. Furthermore, to improve the expressiveness of the generated results, we decompose head movement generation in terms of speed and direction, and decompose eye state generation into short-term blinking and long-term eye closing, modeling them separately. We have also built a novel dataset, SingingFace, to support training and evaluation of models for this task, including future work on this topic. Extensive experiments and a user study show that our proposed method is capable of synthesizing vivid singing faces, qualitatively and quantitatively better than the prior state-of-the-art.
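The speed-and-direction decomposition of head movement mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's actual formulation, so everything here is an assumption for illustration only: head pose is treated as a per-frame vector, "speed" is taken to be the norm of the frame-to-frame change, and "direction" its unit vector. The function names `decompose_motion` and `recompose_motion` are hypothetical and do not come from the paper.

```python
import numpy as np

def decompose_motion(poses, eps=1e-8):
    """Split frame-to-frame pose changes into speed and direction.

    Assumption (not from the paper): speed is the Euclidean norm of the
    per-frame pose delta, and direction is the delta divided by that norm.
    """
    poses = np.asarray(poses, dtype=float)
    deltas = np.diff(poses, axis=0)            # shape (T-1, D)
    speed = np.linalg.norm(deltas, axis=1)     # shape (T-1,)
    # Guard against division by zero for frames with no movement;
    # those frames get an all-zero direction vector.
    direction = deltas / np.maximum(speed, eps)[:, None]
    return speed, direction

def recompose_motion(start, speed, direction):
    """Rebuild the pose sequence from a start pose, speeds, and directions."""
    deltas = speed[:, None] * direction
    return np.vstack([start, start + np.cumsum(deltas, axis=0)])

# Toy pose sequence: four frames of a 3-D pose vector.
poses = np.array([
    [0.0, 0.0, 0.0],
    [3.0, 4.0, 0.0],
    [3.0, 4.0, 0.0],   # no movement in this step
    [3.0, 4.0, 2.0],
])
speed, direction = decompose_motion(poses)
rebuilt = recompose_motion(poses[0], speed, direction)
```

Separating the two quantities this way means a model could, in principle, predict how fast the head moves independently of where it moves; whether the paper uses this exact parameterization is not stated in the abstract.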

Authors: Pengfei Liu, Wenjin Deng, Hengda Li, Jintai Wang, Yinglin Zheng, Yiwei Ding, Xiaohu Guo, Ming Zeng

Affiliations: School of Informatics; Department of Computer Science

Source: Computational Visual Media (English edition; indexed in SCIE, EI, CSCD), 2024, Issue 1, pp. 119-136 (18 pages)

Funding: This work was supported in part by grants from the National Key R&D Program of China (2021YFC3300403), the National Natural Science Foundation of China (62072382), the Yango Charitable Foundation, and the National Science Foundation (OAC-2007661).

Keywords: face synthesis; singing; music; generative adversarial network

Classification code: TP391.41 [Automation and Computer Technology: Computer Application Technology]

