Abstract
Image editing is the process of modifying the editable regions of an image according to input conditions while leaving the content of the remaining regions unchanged. Generative image editing uses a deep learning model, guided by the editing conditions, to modify image content by generating a new image; the modifications include, but are not limited to, changes to objects, style, color, and texture. It is an integral part of computer vision research and has significant theoretical and practical value, and it has attracted wide attention from both academia and industry. Surveys of the related research nevertheless remain relatively scarce. This paper reviews representative generative image editing methods. First, according to the editing method, editing tasks are divided into attribute-based, text-based, sketch-based, and physical-exemplar-guided image editing, and the methods are analyzed and summarized in terms of their loss functions, types, and characteristics. The paper then discusses how each method affects generation quality and describes the datasets and evaluation metrics in detail. Finally, it points out the challenges facing the field and possible directions for future research.
Authors
程迪
史英杰
孙世鑫
杜方
王玮晶
CHENG Di; SHI Yingjie; SUN Shixin; DU Fang; WANG Weijing (School of Arts & Sciences, Beijing Institute of Fashion Technology, Beijing 100029, China; School of Information Engineering, Ningxia University, Yinchuan, Ningxia 750021, China)
Source
《北京服装学院学报(自然科学版)》
CAS
2024, No. 3, pp. 1-15 (15 pages)
Journal of Beijing Institute of Fashion Technology:Natural Science Edition
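Funding
National Natural Science Foundation of China (62062058)
Open Project of the Hubei Provincial Engineering Research Center for Intelligent Textile and Fashion (2023HBITF01)
Scientific Research Program of the Beijing Municipal Education Commission (KM202210012002)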
Keywords
image editing
deep learning
diffusion models
generative adversarial networks
multimodal foundation models