Face age synthesis fusing channel-coordinate attention mechanism and parallel dilated convolution

Cited by: 1
Abstract (translated from Chinese)

Objective Face age synthesis aims to synthesize a face image at a specified age while keeping the face highly credible, and it is one of the popular research directions in computer vision. However, current mainstream face age synthesis models focus too heavily on texture information and neglect the multi-scale features associated with the face. In addition, the networks screen identity information poorly. To address these problems, a face age synthesis network fusing a channel-coordinate attention mechanism and parallel dilated convolution is proposed (generative adversarial network (GAN) composed of the parallel dilated convolution and channel-coordinate attention mechanism, PDA-GAN).

Method Building on a generative adversarial network, PDA-GAN proposes a parallel three-channel dilated convolution residual block and a channel-coordinate attention mechanism. The residual block fuses face features at different scales, extracted by dilated convolutions with three dilation rates, which increases the diversity of feature scales and the overall richness of the features. The channel-coordinate attention mechanism computes saliency over the length, width, and depth of the face features to locate the channels and spatial positions that are highly related to age. This strengthens the network's ability to express sensitive features across channels and spatial positions and resolves the feature redundancy problem.

Result The model is trained on the Flickr-Faces-High-Quality (FFHQ) dataset and tested on the large-scale CelebFaces attributes dataset-high quality (CelebA-HQ). PDA-GAN is compared qualitatively and quantitatively with the three latest face age synthesis networks to verify its effectiveness. The results show that PDA-GAN markedly improves the identity confidence and age estimation accuracy of face age synthesis, with good identity preservation and age manipulation ability.

Conclusion The proposed method can synthesize target-age face images with high realism and accuracy.

English abstract

Objective Face age synthesis, one of the most popular research fields in computer vision, aims to synthesize face images of specified ages while maintaining high fidelity. Face age synthesis is gradually being applied in face recognition, film special effects, public security, and other fields, and it has a very wide range of application scenarios. The generative adversarial network (GAN) is one of the most widely used deep learning models in face synthesis. The generator and discriminator of a GAN compete with each other until the generated images are realistic enough to pass as real. While GAN and its variants have achieved good synthesis results, some deficiencies remain unaddressed. First, in order to synthesize images that are close to the target age, current face age synthesis models limit the process of age change to texture information and ignore multi-scale features on the face, such as contour, hair color, and texture. Second, the limited receptive field of the convolutional layer hinders a fully convolutional network from extracting multi-scale features in the image. These problems greatly restrict the face age image synthesis effect of GAN. To solve these problems, this paper proposes a GAN composed of the parallel dilated convolution and channel-coordinate attention mechanism (PDA-GAN).

Method PDA-GAN proposes a parallel three-channel dilated convolutional residual block (PTDCRB) and a channel-coordinate attention mechanism (CCAM) based on generative adversarial networks. PTDCRB is introduced in the generator network of the baseline. Each PTDCRB comprises three parallel dilated convolution channels that extract features at the same time. The dilated convolutions on the three branches use dilation rates (expansion coefficients) of 1, 2, and 3, respectively. The branches of PTDCRB share weights, which reduces the number of network parameters. The first layer of each branch in PTDCRB is a 1×1 convolutional layer, the second layer is a dilated convolution with the branch's dilation rate, and the third layer is a 1×1 convolutional layer that reduces dimensionality and improves computational efficiency. Meanwhile, CCAM first screens the channel dimension of the feature map, retains meaningful channel information, and learns the importance of different channels in order to avoid feature redundancy. CCAM then embeds position information into the channel-attended features: it computes attention along the two orthogonal directions of length and width and fuses the results. The purpose of CCAM is to capture the dependencies of features at different positions.

Result The model is trained on the FFHQ dataset, and samples from the CelebA-HQ dataset are selected as the test set. PDA-GAN is qualitatively and quantitatively compared with the three latest face age image synthesis networks, HRFAE, LIFE, and SAM, to verify its effectiveness. Age accuracy and identity consistency are adopted as quantitative indicators. PDA-GAN achieves the best accuracy for synthetic age images, with an average prediction difference of 4.09. The identity confidence reaches 99.2% when synthesizing a 30-year-old face. In the age-independent attribute retention experiment, PDA-GAN outperforms the other models in both quantitative indicators, with a gender retention rate of 99.7% and an emotion retention rate of 93.2%. An ablation experiment is performed to further verify the effectiveness of each module of PDA-GAN, in which PTDCRB is introduced into different layers of the generator backbone network. Experimental results show that PTDCRB-3 significantly improves identity confidence and age estimation accuracy. Four sets of PTDCRB dilation rates are then used to train the network, and the set [1, 2, 3] gives the best identity confidence and predicted age distribution. The standard generator structure and the generator structure with the channel-coordinate attention mechanism are then compared on age synthesis accuracy and identity verification confidence. Experimental results show that the identity retention and age synthesis abilities are significantly improved after adding the channel-coordinate attention mechanism.

Conclusion This study proposes a parallel three-channel dilated convolution residual block with shared weights that captures feature information at each scale and enriches the detail features of the model. To enhance the expressiveness of the model on sensitive features, this paper proposes a channel-coordinate attention mechanism that learns features of the channel and spatial dimensions simultaneously. Under the combined effect of the parallel three-channel dilated convolution residual block and the channel-coordinate attention mechanism, the identity preservation ability and age synthesis accuracy of the model for face images are improved. Experimental results show that the proposed method outperforms other popular methods on face age synthesis tasks and can synthesize natural and realistic face images of the target age with high fidelity and accuracy.
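The parallel dilated-convolution idea in the abstract can be illustrated with a small, dependency-free sketch. This is not the authors' implementation: it works on a single-channel 2D list, leaves out the two 1×1 convolutions, and assumes that the three branches are fused by element-wise summation and joined to the input by an identity skip connection, none of which the abstract specifies. The function names `dilated_conv2d`, `receptive_field`, and `parallel_dilated_block` are hypothetical.

```python
def dilated_conv2d(image, kernel, dilation):
    """Zero-padded ('same' size) dilated cross-correlation on a 2D list.

    kernel is a k x k list with k odd; dilation spaces the kernel taps apart.
    """
    h, w = len(image), len(image[0])
    k = len(kernel)
    r = k // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for i in range(k):
                for j in range(k):
                    yy = y + (i - r) * dilation
                    xx = x + (j - r) * dilation
                    # Taps that fall outside the image read as zero.
                    if 0 <= yy < h and 0 <= xx < w:
                        acc += kernel[i][j] * image[yy][xx]
            out[y][x] = acc
    return out


def receptive_field(kernel_size, dilation):
    """Side length of the input window covered by one dilated kernel."""
    return dilation * (kernel_size - 1) + 1


def parallel_dilated_block(image, kernel, dilations=(1, 2, 3)):
    """Three branches that reuse ONE kernel (shared weights) at different
    dilation rates. Assumption: branches are summed and added to the input
    (residual connection); the abstract does not state the fusion rule."""
    branches = [dilated_conv2d(image, kernel, d) for d in dilations]
    h, w = len(image), len(image[0])
    return [
        [image[y][x] + sum(b[y][x] for b in branches) for x in range(w)]
        for y in range(h)
    ]
```

Because every branch reuses the same `kernel`, the block has the parameter count of a single 3×3 convolution while covering 3×3, 5×5, and 7×7 input windows, which is the trade-off the abstract attributes to weight sharing.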
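The order of operations described for CCAM (channel attention first, then attention along the two orthogonal spatial directions, then fusion) can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's module: average pooling and plain sigmoid gates stand in for whatever learned layers the authors use, and the fusion is assumed to be a product of the row and column gates. All function names are hypothetical.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def channel_gates(feat):
    """One gate per channel, computed from the channel's global average.
    Assumption: no learned layers; a real module would learn this mapping."""
    return [sigmoid(sum(map(sum, ch)) / (len(ch) * len(ch[0]))) for ch in feat]


def coordinate_gates(ch):
    """Per-row and per-column gates from directional average pooling."""
    h, w = len(ch), len(ch[0])
    row_gate = [sigmoid(sum(ch[y]) / w) for y in range(h)]
    col_gate = [sigmoid(sum(ch[y][x] for y in range(h)) / h) for x in range(w)]
    return row_gate, col_gate


def channel_coordinate_attention(feat):
    """feat is a list of channels, each an h x w list of floats."""
    out = []
    for ch, g in zip(feat, channel_gates(feat)):
        # Step 1: channel attention rescales the whole channel.
        scaled = [[g * v for v in row] for row in ch]
        # Step 2: position attention along the two orthogonal directions.
        row_gate, col_gate = coordinate_gates(scaled)
        # Step 3: fuse by multiplying both gates into each position (assumed).
        out.append([
            [scaled[y][x] * row_gate[y] * col_gate[x] for x in range(len(ch[0]))]
            for y in range(len(ch))
        ])
    return out
```

Every gate lies strictly between 0 and 1, so the sketch can only attenuate features; channels and positions with larger pooled responses are attenuated less, which is the screening behaviour the abstract describes.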
Authors Zhang Ke (张珂); Yu Tingting (于婷婷); Shi Chaojun (石超君); Lou Wenshuo (娄文硕); Liu Yang (刘阳) (Department of Electronic and Communication Engineering, North China Electric Power University, Baoding 071003, China; Hebei Key Laboratory of Power Internet of Things Technology, North China Electric Power University, Baoding 071003, China)
Source Journal of Image and Graphics (《中国图象图形学报》), CSCD, Peking University Core Journal, 2023, No. 12, pp. 3870-3883 (14 pages)
Funding National Natural Science Foundation of China (62076093, 62206095); Fundamental Research Funds for the Central Universities (2022MS078, 2020MS099, 2020YJ006).
Keywords image synthesis; face age; generative adversarial network (GAN); dilated convolution; attention mechanism