Abstract
To address the insufficient local information extraction, low transfer efficiency, and artifacts in stylized results of Vision Transformer (ViT)-based artistic style transfer methods, this paper proposes LVGAST, a method that combines a lightweight ViT with a generative adversarial network (GAN). The method exploits the complementarity of local and global information to improve inference speed and stylization quality, and uses adversarial training to enhance the artistic realism of the stylized results. LVGAST is compared qualitatively and quantitatively with six state-of-the-art artistic style transfer methods. Qualitatively, its results show greater artistic realism; quantitatively, it reaches an SSIM of 0.499 and a Style loss of 1.452, and achieves the fastest inference speed among ViT-based methods (0.215 s per image). LVGAST combines the advantages of convolutional neural networks and ViTs to improve stylization efficiency, and introduces a discriminator network to make the stylized results more realistic.
Authors
YU Chenlong (庾晨龙)
SHAO Chifeng (邵叱风)
School of Computer Science and Engineering, Anhui University of Science and Technology, Huainan, Anhui 232001, China
Source
Journal of Lanzhou Institute of Technology (《兰州工业学院学报》)
2024, Issue 3, pp. 90-94 (5 pages)
Funding
Key Project of the Anhui Provincial Department of Education (2022AH051638, 2022AH051651).