Abstract
Human image synthesis has recently become an active research topic, with important applications in online shopping and social platforms. In the pose transfer task, guidance from pose information alone is limited, and generative models struggle to handle complex human appearance features when the viewpoint changes. To address these problems, we first propose an encoder-decoder architecture with multi-scale feature fusion that generates a human parsing map in the target pose, which serves as auxiliary information in place of simple pose keypoints. We then propose a multi-task generation network that combines a pre-trained VGG network with a trainable convolutional neural network to improve feature extraction, and that simultaneously produces a coarse result, an optical flow field, and a mask. An optical-flow-guided warping module and a fusion module combine these multi-task outputs, preserving feature-level contour information and pixel-level texture detail, to generate a more accurate image of the person in the target pose. The effectiveness of the proposed algorithm is verified on DeepFashion, a large multi-category clothing dataset.
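The abstract does not give the formulas used by the flow-guided warping and fusion modules. As a rough illustration only, the sketch below assumes one common formulation: the source image is backward-warped with the predicted flow using bilinear sampling, then blended with the coarse result through the mask as `mask * warped + (1 - mask) * coarse`. The function names `warp_with_flow` and `fuse`, the flow convention, and the blending rule are all assumptions, not the authors' implementation.

```python
import numpy as np


def warp_with_flow(source, flow):
    """Backward-warp `source` (H, W, C) with a flow field (H, W, 2).

    Assumed convention: flow[y, x] = (dx, dy), so output pixel (x, y)
    samples the source at (x + dx, y + dy). Sampling is bilinear and
    coordinates are clamped to the image border.
    """
    h, w = source.shape[:2]
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")

    # Sampling positions in the source image, clamped to valid range.
    sx = np.clip(xs + flow[..., 0], 0, w - 1)
    sy = np.clip(ys + flow[..., 1], 0, h - 1)

    # Integer corners surrounding each sampling position.
    x0 = np.floor(sx).astype(int)
    y0 = np.floor(sy).astype(int)
    x1 = np.minimum(x0 + 1, w - 1)
    y1 = np.minimum(y0 + 1, h - 1)

    # Fractional offsets, broadcast over the channel axis.
    wx = (sx - x0)[..., None]
    wy = (sy - y0)[..., None]

    top = source[y0, x0] * (1 - wx) + source[y0, x1] * wx
    bottom = source[y1, x0] * (1 - wx) + source[y1, x1] * wx
    return top * (1 - wy) + bottom * wy


def fuse(coarse, warped, mask):
    """Blend per pixel: mask = 1 keeps the warped texture, mask = 0 the coarse result."""
    m = mask[..., None]
    return m * warped + (1 - m) * coarse
```

Under this assumed formulation, the mask decides per pixel whether texture is copied from the warped source or taken from the generated coarse image, which is one way to keep pixel-level detail where the flow is reliable.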
Authors
谭台哲
钟晓静
杨卓
刘洋
黄丹
TAN Tai-zhe; ZHONG Xiao-jing; YANG Zhuo; LIU Yang; HUANG Dan (School of Computer, Guangdong University of Technology, Guangzhou 510006, China; Laboratory of Interaction and Visual Information, Guangdong University of Technology, Guangzhou 510006, China)
Source
《小型微型计算机系统》
CSCD
Peking University Core Journals
2022, Issue 11, pp. 2381-2386 (6 pages)
Journal of Chinese Computer Systems
Funding
Supported by the National Natural Science Foundation of China (61907009).
Keywords
human parsing
image synthesis
generative network
pose transfer