摘要
One-shot face reenactment is a challenging task due to the identity mismatch between source and driving faces.Most existing methods fail to completely eliminate the interference of driving subjects’identity information,which may lead to face shape distortion and undermine the realism of reenactment results.To solve this problem,in this paper,we propose using a 3D morphable model(3DMM)for explicit facial semantic decomposition and identity disentanglement.Instead of using 3D coefficients alone for reenactment control,we take advantage of the generative ability of 3DMM to render textured face proxies.These proxies contain abundant yet compact geometric and semantic information of human faces,which enables us to compute the face motion field between source and driving images by estimating the dense correspondence.In this way,we can approximate reenactment results by warping source images according to the motion field,and a generative adversarial network(GAN)is adopted to further improve the visual quality of warping results.Extensive experiments on various datasets demonstrate the advantages of the proposed method over existing state-of-the-art benchmarks in both identity preservation and reenactment fulfillment.
基金
supported in part by the Beijing Municipal Natural Science Foundation,China(No.4222054)
in part by the National Natural Science Foundation of China(Nos.62276263 and 62076240)
the Youth Innovation Promotion Association CAS,China(No.Y2023143).