Journal Articles
4 articles found
1. Emotion-Aware Music Driven Movie Montage
Authors: 刘伍琴, 林敏轩, 黄海斌, 马重阳, 宋玉, 董未名, 徐常胜. Journal of Computer Science & Technology (SCIE, EI, CSCD), 2023, No. 3, pp. 540-553 (14 pages)
In this paper, we present Emotion-Aware Music Driven Movie Montage, a novel paradigm for the challenging task of generating movie montages. Specifically, given a movie and a piece of music as the guidance, our method aims to generate a montage out of the movie that is emotionally consistent with the music. Unlike previous work such as video summarization, this task requires not only video content understanding, but also emotion analysis of both the input movie and music. To this end, we propose a two-stage framework, including a learning-based module for the prediction of emotion similarity and an optimization-based module for the selection and composition of candidate movie shots. The core of our method is to align and estimate emotional similarity between music clips and movie shots in a multi-modal latent space via contrastive learning. Subsequently, the montage generation is modeled as a joint optimization of emotion similarity and additional constraints such as scene-level story completeness and shot-level rhythm synchronization. We conduct both qualitative and quantitative evaluations to demonstrate that our method can generate emotionally consistent montages and outperforms alternative baselines.
Keywords: movie montage, emotion analysis, audio-visual modality, contrastive learning
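The cross-modal alignment described in the abstract can be illustrated with a symmetric InfoNCE-style contrastive loss. The sketch below is not the paper's implementation; the function name, temperature value, and the assumption that matched music/shot pairs share an index are all illustrative.

```python
import numpy as np

def contrastive_alignment_loss(music_emb, shot_emb, temperature=0.07):
    """Symmetric InfoNCE loss aligning music-clip and movie-shot
    embeddings in a shared latent space. Row i of music_emb is assumed
    to match row i of shot_emb (hypothetical pairing convention)."""
    def normalize(x):
        return x / np.linalg.norm(x, axis=-1, keepdims=True)

    m, s = normalize(music_emb), normalize(shot_emb)
    logits = m @ s.T / temperature          # pairwise cosine similarities

    def xent(lg):
        lg = lg - lg.max(axis=1, keepdims=True)   # numerical stability
        log_probs = lg - np.log(np.exp(lg).sum(axis=1, keepdims=True))
        return -np.mean(np.diag(log_probs))       # matched pairs lie on the diagonal

    # average the music-to-shot and shot-to-music directions
    return (xent(logits) + xent(logits.T)) / 2
```

Training with such a loss pulls emotionally matched music/shot pairs together in the latent space while pushing mismatched pairs apart, which is what makes the later similarity-based shot selection possible.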
2. Facial Image Attributes Transformation via Conditional Recycle Generative Adversarial Networks (cited 4 times)
Authors: Huai-Yu Li, Wei-Ming Dong, Bao-Gang Hu. Journal of Computer Science & Technology (SCIE, EI, CSCD), 2018, No. 3, pp. 511-521 (11 pages)
This study introduces a novel conditional recycle generative adversarial network for facial attribute transformation, which can transform high-level semantic face attributes without changing the identity. In our approach, we input a source facial image to the conditional generator with the target attribute condition to generate a face with the target attribute. Then we recycle the generated face back to the same conditional generator with the source attribute condition, producing a face that should match the source face in personal identity and facial attributes. Hence, we introduce a recycle reconstruction loss to enforce the final generated facial image and the source facial image to be identical. Evaluations on the CelebA dataset demonstrate the effectiveness of our approach. Qualitative results show that our approach can learn and generate high-quality identity-preserving facial images with specified attributes.
Keywords: generative adversarial network, image editing, facial attributes transformation
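The recycle reconstruction loss from the abstract can be sketched in a few lines: translate the source image to the target attribute, translate the result back under the source attribute, and penalize the L1 distance to the original. This is a minimal sketch, not the paper's code; `generator` here is a hypothetical stand-in for the trained conditional generator G(image, attribute).

```python
import numpy as np

def recycle_reconstruction_loss(generator, source_img, source_attr, target_attr):
    """Recycle consistency: source -> target attribute -> back to source
    attribute; the twice-translated image should reproduce the source."""
    generated = generator(source_img, target_attr)   # face with the target attribute
    recycled = generator(generated, source_attr)     # cycle back to the source attribute
    return np.abs(recycled - source_img).mean()      # L1 reconstruction penalty
```

An identity-preserving generator drives this loss toward zero, which is exactly the constraint the paper uses to keep personal identity intact while attributes change.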
3. Fast Multi-Operator Image Resizing and Evaluation (cited 2 times)
Authors: Wei-Ming Dong, Guan-Bo Bao, Xiao-Peng Zhang, Jean-Claude Paul. Journal of Computer Science & Technology (SCIE, EI, CSCD), 2012, No. 1, pp. 121-134 (14 pages)
Current multi-operator image resizing methods succeed in generating impressive results by using an image similarity measure to guide the resizing process: an optimal operation path is found in the resizing space. However, their slow resizing speed, caused by the inefficient computation strategy of bidirectional patch matching, is a drawback in practical use. In this paper, we present a novel method to address this problem. By combining seam carving with scaling and cropping, our method can perform content-aware image resizing very quickly. We define cost functions combining image energy and a dominant color descriptor for all the operators to evaluate the damage to both local image content and the global visual effect. Our algorithm can therefore automatically find an optimal sequence of operations to resize the image using dynamic programming or a greedy algorithm. We also extend our algorithm to indirect image resizing, which can protect the aspect ratio of the dominant object in an image.
Keywords: image resizing, multi-operator, operator cost, indirect resizing
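The greedy variant of the operator-sequence search mentioned in the abstract can be sketched as follows. This is an illustrative skeleton only: the operator names, the one-column-per-step granularity, and the `op_cost` callback (standing in for the paper's image-energy plus dominant-color cost) are all assumptions, not the published algorithm.

```python
def greedy_resize_plan(width, target_width, op_cost):
    """Greedily pick, one column at a time, whichever operator
    (seam carving, scaling, cropping) shrinks the image at the
    lowest estimated damage. op_cost(op, w) -> float is a
    hypothetical per-operator cost at the current width w."""
    plan = []
    while width > target_width:
        # choose the cheapest operator at the current width
        op = min(("seam", "scale", "crop"), key=lambda o: op_cost(o, width))
        plan.append(op)
        width -= 1
    return plan
```

Because the cost is re-evaluated at every width, the chosen operator can switch partway through the resize, e.g. seam carving while removable low-energy seams remain and cropping afterwards, which mirrors the paper's idea of an optimal mixed operation sequence.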
4. A Comparative Study of CNN- and Transformer-Based Visual Style Transfer (cited 1 time)
Authors: Hua-Peng Wei, Ying-Ying Deng, Fan Tang, Xing-Jia Pan, Wei-Ming Dong. Journal of Computer Science & Technology (SCIE, EI, CSCD), 2022, No. 3, pp. 601-614 (14 pages)
Vision Transformer has shown impressive performance on image classification tasks. Observing that most existing visual style transfer (VST) algorithms are based on the texture-biased convolutional neural network (CNN), this raises the question of whether the shape-biased Vision Transformer can perform style transfer as CNNs do. In this work, we focus on comparing and analyzing the shape bias of CNN- and transformer-based models from the perspective of VST tasks. For comprehensive comparisons, we propose three kinds of transformer-based visual style transfer (Tr-VST) methods: Tr-NST for optimization-based VST, Tr-WCT for reconstruction-based VST, and Tr-AdaIN for perceptual-based VST. By engaging three mainstream VST methods in the transformer pipeline, we show that transformer-based models pre-trained on ImageNet are not suitable for style transfer: due to their strong shape bias, these Tr-VST methods cannot render style patterns. We further analyze the shape bias by considering the influence of the learned parameters and the structure design. The results prove that, with proper style supervision, the transformer can learn texture-biased features similar to those of a CNN. With the reduced shape bias in the transformer encoder, Tr-VST methods can generate higher-quality results compared with state-of-the-art VST methods.
Keywords: transformer, convolutional neural network, visual style transfer, comparative study
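The AdaIN operation underlying the Tr-AdaIN variant is the standard adaptive instance normalization: re-statistic the content features with the per-channel mean and standard deviation of the style features. The sketch below shows the formula itself on plain arrays; the (C, H, W) layout and function name are assumptions, and in the paper the operation would be applied to encoder feature maps, not raw pixels.

```python
import numpy as np

def adain(content_feat, style_feat, eps=1e-5):
    """Adaptive instance normalization on (C, H, W) feature maps:
    normalize each content channel, then rescale and shift it with
    the matching style channel's std and mean."""
    c_mean = content_feat.mean(axis=(1, 2), keepdims=True)
    c_std = content_feat.std(axis=(1, 2), keepdims=True) + eps
    s_mean = style_feat.mean(axis=(1, 2), keepdims=True)
    s_std = style_feat.std(axis=(1, 2), keepdims=True) + eps
    return s_std * (content_feat - c_mean) / c_std + s_mean
```

After the transform, each output channel carries the style features' first- and second-order statistics while keeping the content features' spatial structure, which is why AdaIN serves as the perceptual-based transfer step.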