

Cross-CNN: An Animation Cross-Frame Sketch Colorization Algorithm Based on a Hybrid CNN and Transformer Model
Abstract  Colorizing long sequences of animation line-art frames is a challenging task in computer vision. On one hand, the information contained in a sketch is sparse, so a colorization algorithm must infer the missing information; on the other hand, colors must stay consistent across consecutive frames to ensure the visual quality of the whole video. Most existing colorization algorithms are designed for single images: they produce one open-ended, plausible color result and are therefore unsuitable for colorizing frame sequences. Other reference-based colorization algorithms do not tightly couple the two frames, leading to unsatisfactory results. Within the same shot, the features of a given object usually change little, so a model can be designed that automatically colorizes a sketch given a reference frame. To this end, this paper proposes Cross-CNN, a model combining convolutional neural networks (CNN) and a Transformer. Cross-CNN finds and matches colors from the reference frame, ensuring feature consistency along the temporal dimension. In this model, the reference frame and the sketch frame are stacked along the channel dimension and fed to a pretrained ResNet50 network to extract locally fused features; the fused feature map is then passed to a Transformer structure for encoding to extract global features. Within the Transformer, a cross-attention mechanism is designed to better match long-distance features. Finally, a convolutional decoder with skip connections outputs the colorized image. For the dataset, this paper extracted frames from eight movies and screened them strictly, producing a dataset of 20,000 reference-sketch pairs for the experiments. Cross-CNN reaches an SSIM (Structural SIMilarity) of 0.932, which is 0.014 higher than the SOTA algorithm. Code: https://github.com/silenye/Cross-CNN.
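The abstract describes a concrete pipeline: the reference and sketch frames are stacked channel-wise, a pretrained ResNet50 extracts locally fused features, a Transformer with cross-attention extracts and matches global features, and a convolutional decoder produces the colored frame. The PyTorch sketch below illustrates that flow under stated assumptions: all layer sizes, the lightweight ref_stem that supplies reference keys/values for the cross-attention, and the omission of the paper's skip connections are illustrative choices of ours, not the authors' implementation (see the linked repository for the actual code).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision.models as models


class CrossCNNSketch(nn.Module):
    """Minimal, illustrative sketch of the Cross-CNN pipeline described in
    the abstract. Dimensions and the exact wiring of the cross-attention
    are assumptions, not the authors' released implementation."""

    def __init__(self, d_model=256, nhead=8, num_layers=4):
        super().__init__()
        # Reference RGB (3 ch) + line-art sketch (1 ch) stacked on the
        # channel axis, so the first conv takes 4 channels. Note: the
        # replaced conv1 loses its pretrained weights.
        backbone = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
        backbone.conv1 = nn.Conv2d(4, 64, 7, stride=2, padding=3, bias=False)
        # Keep everything up to the last residual stage (2048-ch, /32 map).
        self.fused_encoder = nn.Sequential(*list(backbone.children())[:-2])
        self.fused_proj = nn.Conv2d(2048, d_model, 1)

        # Assumption: a lightweight stem encodes the reference alone so the
        # cross-attention can match sketch queries against reference
        # keys/values; total downsampling matches the backbone's /32.
        self.ref_stem = nn.Sequential(
            nn.Conv2d(3, 64, 3, stride=4, padding=1), nn.ReLU(),
            nn.Conv2d(64, d_model, 3, stride=8, padding=1), nn.ReLU(),
        )

        # Transformer encoder over flattened spatial tokens (global features).
        enc_layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.transformer = nn.TransformerEncoder(enc_layer, num_layers)
        self.cross_attn = nn.MultiheadAttention(d_model, nhead, batch_first=True)

        # Convolutional decoder; the paper's skip connections are omitted
        # here for brevity.
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(d_model, 128, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 3, 3, padding=1), nn.Sigmoid(),  # RGB in [0, 1]
        )

    def forward(self, reference_rgb, sketch_gray):
        fused = torch.cat([reference_rgb, sketch_gray], dim=1)   # (B,4,H,W)
        feat = self.fused_proj(self.fused_encoder(fused))        # (B,C,h,w)
        b, c, h, w = feat.shape
        q = self.transformer(feat.flatten(2).transpose(1, 2))    # (B,hw,C)
        kv = self.ref_stem(reference_rgb).flatten(2).transpose(1, 2)
        out, _ = self.cross_attn(q, kv, kv)  # match colors from reference
        out = self.decoder(out.transpose(1, 2).reshape(b, c, h, w))
        # Upsample back to the input resolution for the colorized frame.
        return F.interpolate(out, size=sketch_gray.shape[-2:],
                             mode='bilinear', align_corners=False)


if __name__ == "__main__":
    model = CrossCNNSketch()
    ref = torch.rand(1, 3, 256, 256)      # colored reference frame
    sketch = torch.rand(1, 1, 256, 256)   # line-art frame to colorize
    print(model(ref, sketch).shape)       # torch.Size([1, 3, 256, 256])
```

For evaluation, the SSIM metric reported above can be computed with a standard implementation such as skimage.metrics.structural_similarity.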
Authors  YU Yi-feng (余毅丰), QIAN Jiang-bo (钱江波), YAN Di-qun (严迪群), WANG Chong (王翀), DONG Li (董理) (Faculty of Electrical Engineering and Computer Science, Ningbo University, Ningbo, Zhejiang 315000, China; Zhejiang Key Laboratory of Mobile Network Application Technology, Ningbo, Zhejiang 315000, China)
Source  Acta Electronica Sinica (《电子学报》; indexed in EI, CAS, CSCD, Peking University Core), 2024, No. 7, pp. 2491-2502 (12 pages)
Funding  National Natural Science Foundation of China (No. 62271274); Ningbo Science and Technology Program (No. 2024Z004, No. 2023Z059)
Keywords  sketch coloring; convolutional neural network; Transformer; color matching; animation production