Joint Distribution Matching Method and Applications Based on Optimal Transport Theory

Cited by: 4
Abstract: Joint distribution matching is an active research topic in machine learning and computer vision. The problem, which aims to learn bidirectional mappings that match the joint distributions of two different domains, poses two critical challenges.

First, it is difficult to exploit sufficient correlation information from the joint distributions of the two domains. In the unsupervised learning setting, only two sets of samples are available, drawn separately from the marginal distributions of the two domains. By coupling theory, infinitely many joint distributions are consistent with two given marginal distributions, so infinitely many bidirectional mappings between the two domains may exist. Directly learning the joint distributions without additional information linking the marginal distributions is therefore a highly ill-posed problem.

Second, the joint distribution matching problem is hard to formulate and to optimize effectively. One can apply a statistical divergence (e.g., the Wasserstein distance) to measure the discrepancy between joint distributions. The Wasserstein distance is a measure from optimal transport theory that has been applied successfully in computer vision. However, directly optimizing its primal problem incurs intractable computational cost and statistical difficulties. Many recent methods that address joint distribution matching learn the mappings of the two domains separately; such methods cannot learn cross-domain correlations and may lead to mismatched joint distributions.

In this paper, relying on optimal transport theory, we tackle these issues by minimizing the Wasserstein distance between the joint distributions of the two domains. Because the primal problem is intractable, we assume that the two domains share the same latent space (i.e., images in different domains have similar characteristics) and propose a theorem that reduces the intractable optimization problem to a simple and feasible one. Based on this theorem, we introduce a novel objective function and design Joint Wasserstein Auto-Encoders (JWAE) to solve the joint distribution matching problem. The objective function consists of two parts: a reconstruction loss and a distribution divergence. The reconstruction loss is derived from the auto-encoder and the cycle mapping, while the distribution divergence is optimized over three spaces (the source space, the target space, and the latent space). In this way, good bidirectional mappings are learned by minimizing the reconstruction loss and reducing the distribution divergence.

In the experiments, we apply the proposed method to unsupervised image-to-image translation and cross-domain video-to-video synthesis, generating high-quality images and coherent videos. Both qualitative and quantitative comparisons demonstrate that our method outperforms several baseline methods. For example, on the “scene→segmentation” image-to-image translation task, the IS value of JWAE is 0.59 higher than that of CycleGAN, and its FID value is 65.8 lower than that of CycleGAN. On the “winter→summer” video synthesis task, the FID4video value of JWAE is 2.2 lower than that of Slomo-Cycle.
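As a rough illustration of the Wasserstein distance that the abstract uses as its divergence measure, the sketch below computes the empirical 1-Wasserstein distance between two one-dimensional samples of equal size. This is not the JWAE objective or the paper's theorem; the function name `wasserstein_1d` and the sample values are hypothetical, chosen only to show the quantity in its simplest setting, where the optimal coupling pairs the sorted points in order.

```python
def wasserstein_1d(xs, ys):
    """Empirical 1-Wasserstein distance between two equal-size 1-D samples.

    Illustrative helper (not from the paper): each sample is treated as a
    uniform distribution over its points.
    """
    if len(xs) != len(ys) or not xs:
        raise ValueError("samples must be non-empty and of equal size")
    # In 1-D with uniform weights, the optimal transport plan matches the
    # i-th smallest source point to the i-th smallest target point.
    pairs = zip(sorted(xs), sorted(ys))
    return sum(abs(a - b) for a, b in pairs) / len(xs)


# Hypothetical samples standing in for draws from two marginal distributions.
source = [0.0, 1.0, 2.0]
target = [2.5, 0.5, 1.5]
distance = wasserstein_1d(source, target)
print(distance)
```

In higher dimensions no such sorting shortcut exists, and the primal problem must search over all couplings of the two distributions, which is the computational difficulty the abstract refers to.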
Authors: CAO Jie-Zhang (曹杰彰); MO Lang-Yuan (莫朗元); DU Qing (杜卿); GUO Yong (国雍); ZHAO Pei-Lin (赵沛霖); HUANG Jun-Zhou (黄俊洲); TAN Ming-Kui (谭明奎) (School of Software Engineering, South China University of Technology, Guangzhou 510006; Tencent AI Lab, Shenzhen, Guangdong 518054)
Source: Chinese Journal of Computers (《计算机学报》), 2021, No. 6, pp. 1233-1245 (13 pages). Indexed in: EI, CAS, CSCD, Peking University Core Journals.
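Funding: Supported by the Key-Area Research and Development Program of Guangdong Province (2018B010107001), the Key Program of the National Natural Science Foundation of China (61836003), the Guangdong Pearl River Talent Program for Innovative and Entrepreneurial Teams (2017ZT07X183), the Fundamental Research Funds for the Central Universities (D2191240), and the Tencent AI Lab Rhino-Bird Focused Research Program (JR201902).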
Keywords: joint distribution matching; optimal transport theory; Wasserstein distance; unsupervised image translation; cross-domain video synthesis