Journal Articles
2 articles found.
1. Unsupervised image translation with distributional semantics awareness
Authors: Zhexi Peng, He Wang, Yanlin Weng, Yin Yang, Tianjia Shao. Computational Visual Media (SCIE, EI, CSCD), 2023, Issue 3, pp. 619-631 (13 pages).
Unsupervised image translation (UIT) studies the mapping between two image domains. Since such mappings are under-constrained, existing research has pursued various desirable properties such as distributional matching or two-way consistency. In this paper, we re-examine UIT from a new perspective: distributional semantics consistency, based on the observation that data variations contain semantics, e.g., shoes varying in color. Further, the semantics can be multi-dimensional, e.g., shoes also varying in style, functionality, etc. Given two image domains, matching these semantic dimensions during UIT will produce mappings with explicable correspondences, which has not been investigated previously. We propose distributional semantics mapping (DSM), the first UIT method which explicitly matches semantics between two domains. We show that distributional semantics has rarely been considered within and beyond UIT, even though it is a common problem in deep learning. We evaluate DSM on several benchmark datasets, demonstrating its general ability to capture distributional semantics. Extensive comparisons show that DSM not only produces explicable mappings, but also improves image quality in general.
Keywords: generative adversarial networks (GANs), manifold alignment, unsupervised learning, image-to-image translation, distributional semantics
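The abstract does not spell out how semantic distributions are matched, so the following is only a minimal, hypothetical PyTorch sketch of the general idea: align the first two moments of a toy semantic descriptor (per-image mean channel intensity, a stand-in for attributes such as shoe color) between translated images and real target-domain images. The names Translator, semantic_descriptor, and moment_matching_loss are illustrative assumptions, not the paper's DSM implementation.

```python
# Hypothetical sketch, NOT the paper's DSM method: a tiny translator
# G: X -> Y trained with a moment-matching penalty on a scalar semantic
# descriptor, as a crude proxy for aligning semantic dimensions.
import torch
import torch.nn as nn

class Translator(nn.Module):
    """Tiny convolutional generator mapping domain-X images to domain Y."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 3, 3, padding=1), nn.Tanh(),
        )

    def forward(self, x):
        return self.net(x)

def semantic_descriptor(imgs):
    # Stand-in semantic code: per-image mean of each color channel.
    return imgs.mean(dim=(2, 3))                    # (B, 3)

def moment_matching_loss(fake_y, real_y):
    # Match mean and variance of the descriptor distributions
    # between translated outputs and the real target domain.
    s_fake = semantic_descriptor(fake_y)
    s_real = semantic_descriptor(real_y)
    mean_gap = (s_fake.mean(0) - s_real.mean(0)).pow(2).sum()
    var_gap = (s_fake.var(0) - s_real.var(0)).pow(2).sum()
    return mean_gap + var_gap

G = Translator()
opt = torch.optim.Adam(G.parameters(), lr=2e-4)
x = torch.rand(8, 3, 64, 64) * 2 - 1                # batch from domain X
y = torch.rand(8, 3, 64, 64) * 2 - 1                # batch from domain Y
opt.zero_grad()
loss = moment_matching_loss(G(x), y)                # + adversarial loss in practice
loss.backward()
opt.step()
```

In a full system this penalty would sit alongside an adversarial loss; the moment-matching term here merely illustrates what "matching distributional semantics" could mean in loss form.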
2. Speech-driven facial animation with spectral gathering and temporal attention (Cited by: 1)
Authors: Yujin Chai, Yanlin Weng, Lvdi Wang, Kun Zhou. Frontiers of Computer Science (SCIE, EI, CSCD), 2022, Issue 3, pp. 153-162 (10 pages).
In this paper, we present an efficient algorithm that generates lip-synchronized facial animation from a given vocal audio clip. By combining spectral-dimensional bidirectional long short-term memory and a temporal attention mechanism, we design a lightweight speech encoder that learns useful and robust vocal features from the input audio without resorting to pre-trained speech recognition modules or large training data. To learn subject-independent facial motion, we use deformation gradients as the internal representation, which allows nuanced local motions to be synthesized better than with vertex offsets. Compared with state-of-the-art automatic-speech-recognition-based methods, our model is much smaller but achieves similar robustness and quality most of the time, and noticeably better results in certain challenging cases.
Keywords: speech-driven facial animation, spectral-dimensional bidirectional long short-term memory, temporal attention, deformation gradients
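As a rough illustration of the encoder described in the abstract, the hypothetical PyTorch sketch below runs a bidirectional LSTM along the spectral axis of each mel-spectrogram frame ("spectral gathering") and then pools the frame features with learned temporal attention. All layer sizes, input shapes, and names are assumptions, not the paper's configuration.

```python
# Minimal sketch, assuming input mel-spectrograms of shape (B, T frames, F bins).
import torch
import torch.nn as nn

class SpectralTemporalEncoder(nn.Module):
    def __init__(self, hidden=64, out_dim=128):
        super().__init__()
        # BiLSTM reads each frame's mel bins as a sequence of scalars,
        # i.e., it runs along the spectral dimension, not along time.
        self.spectral_lstm = nn.LSTM(input_size=1, hidden_size=hidden,
                                     bidirectional=True, batch_first=True)
        self.proj = nn.Linear(2 * hidden, out_dim)
        self.attn = nn.Linear(out_dim, 1)       # scores frames for pooling

    def forward(self, spec):                    # spec: (B, T, F)
        B, T, F = spec.shape
        bins = spec.reshape(B * T, F, 1)        # each frame as a spectral sequence
        _, (h, _) = self.spectral_lstm(bins)    # h: (2, B*T, hidden)
        h = torch.cat([h[0], h[1]], dim=-1)     # final fwd/bwd states: (B*T, 2*hidden)
        frames = self.proj(h).reshape(B, T, -1) # per-frame features: (B, T, out_dim)
        weights = torch.softmax(self.attn(frames), dim=1)  # temporal attention (B, T, 1)
        return (weights * frames).sum(dim=1)    # pooled vocal features: (B, out_dim)

enc = SpectralTemporalEncoder()
mel = torch.randn(4, 25, 80)                    # 4 clips, 25 frames, 80 mel bins
features = enc(mel)                             # (4, 128)
```

In the paper these features would drive a decoder that predicts deformation gradients for the face mesh; that stage is omitted here.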