Abstract: We propose a global-local based style transfer (G-LST) model that optimizes style from the global level down to the local level. First, abundant source-side data are iteratively refined to automatically construct high-quality pseudo-parallel data, and joint training strengthens the model's semantic perception of the overall style. Then, commonsense knowledge is used to correct fine-grained word-level style, enhancing the expression of local style; attending to both global and local style improves the accuracy of style transfer. Experimental results on the GYAFC dataset show that, compared with the current best-performing text style transfer models, G-LST improves style transfer accuracy by 2.70% and 4.47% on the E&M and F&R domains respectively, and improves the combined metric of content preservation and style accuracy by 1.18% and 1.95%.
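The iterative construction of pseudo-parallel data described above can be illustrated with a minimal toy sketch. This is not the G-LST implementation: `transfer`, `style_score`, and the informal-word lexicon are hypothetical stand-ins for the paper's seq2seq transfer model and style classifier, and the retraining step is reduced to a comment.

```python
def style_score(sentence):
    # Toy formality scorer: fraction of tokens NOT in a tiny informal lexicon.
    # Stands in for a trained style classifier.
    informal = {"gonna", "wanna", "u", "lol"}
    tokens = sentence.lower().split()
    return sum(t not in informal for t in tokens) / len(tokens)

def transfer(sentence):
    # Toy transfer "model": rule-based rewriting of informal tokens.
    # Stands in for a learned informal-to-formal seq2seq model.
    table = {"gonna": "going to", "wanna": "want to", "u": "you"}
    return " ".join(table.get(t, t) for t in sentence.lower().split())

def build_pseudo_parallel(source_sents, threshold=0.9, rounds=2):
    """Iteratively build (informal, formal) pairs from unlabeled source data,
    keeping only outputs whose target-style score clears a threshold."""
    pairs = []
    for _ in range(rounds):
        pairs = []
        for src in source_sents:
            tgt = transfer(src)
            if style_score(tgt) >= threshold:
                pairs.append((src, tgt))
        # In the real model, the transfer model would be retrained on `pairs`
        # here, so later rounds yield higher-quality pseudo-parallel data.
    return pairs
```

Running `build_pseudo_parallel(["i am gonna go", "lol that is funny"])` keeps only the first sentence, whose rewritten form passes the style filter; the second is discarded because the toy model cannot formalize "lol".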
Funding: Supported by the National Key Research & Development Program (Grant No. 2018YFC0831700) and the National Natural Science Foundation of China (Grant Nos. 61671064 and 61732005).
Abstract: We propose a novel unsupervised image captioning method. Image captioning spans two fields of deep learning: natural language processing and computer vision. An excessive pursuit of evaluation scores makes the caption style generated by existing models monotonous, falling short of people's demand for vivid, stylized image captions. We therefore propose an image captioning model that combines text style transfer with image emotion recognition, enabling the model to better understand images and generate controllable stylized captions. The proposed method automatically judges the emotion conveyed by an image through the image emotion recognition module, gaining a fuller understanding of the image content, and controls the description through the text style transfer method, thereby generating captions that meet people's expectations. To our knowledge, this is the first work to use both image emotion recognition and text style control.
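The two-stage pipeline the abstract describes — recognize the image's emotion, then use it to control the caption style — can be sketched as follows. This is a deliberately simplified assumption-laden toy: `recognize_emotion` thresholds mean pixel brightness in place of a trained emotion classifier, and `STYLE_TEMPLATES` stands in for a style-controlled caption generator.

```python
def recognize_emotion(image_pixels):
    # Toy stand-in for the image emotion recognition module:
    # bright images map to "positive", dark images to "negative".
    mean_brightness = sum(image_pixels) / len(image_pixels)
    return "positive" if mean_brightness >= 128 else "negative"

# Toy stand-in for style-controlled generation: one template per emotion.
STYLE_TEMPLATES = {
    "positive": "A delightful scene of {content}.",
    "negative": "A gloomy scene of {content}.",
}

def stylized_caption(image_pixels, content):
    """Generate a caption whose style is conditioned on the recognized
    image emotion, mirroring the two-stage pipeline in the abstract."""
    emotion = recognize_emotion(image_pixels)
    return STYLE_TEMPLATES[emotion].format(content=content)
```

For example, a bright image with content "a dog on the beach" yields a positive-style caption, while a dark image of the same content yields the gloomy variant; in the full model the template lookup would be replaced by a style-transfer decoder.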