A survey of deep learning video super-resolution techniques (original title: 深度学习视频超分辨率技术综述; cited by: 3)

Deep learning based video-related super-resolution technique: a survey


Abstract: Video super-resolution plays a key role in satellite remote sensing, video surveillance, and medical imaging. It has broad application prospects and has attracted wide attention, but traditional video super-resolution algorithms have certain limitations. As deep learning has matured, super-resolution algorithms based on deep neural networks have made great progress in performance. Fully exploiting the spatio-temporal information in video makes it possible to recover realistic, natural textures quickly and efficiently, and this advantage has made video super-resolution a research hotspot. This paper systematically reviews research progress in deep learning based video super-resolution and summarizes its datasets and evaluation metrics. Existing methods are divided by research approach into two broad classes, methods based on image alignment and methods without image alignment, and are further subdivided into 10 subclasses according to the structure of the deep convolutional neural network, the history of model optimization, and the motion estimation and compensation method. Using ample experimental data, the core idea of each method and the strengths and weaknesses of each network structure are compared and analyzed. Although the reconstruction quality of video super-resolution networks keeps improving, parameter counts are gradually falling, and training and inference are getting faster, existing models still have room to improve. The paper closes by discussing the challenges and future prospects of deep learning based video super-resolution.

Video super-resolution (VSR) reconstructs a high-resolution video from its low-resolution version. It is being developed intensively for domains such as satellite remote sensing, video surveillance, medical imaging, and electronics. To reconstruct high-resolution frames, conventional video super-resolution methods estimate the underlying motion and blur kernel parameters, which is difficult across heterogeneous scenes. Because they can fully integrate the spatio-temporal information in video and quickly recover realistic, natural textures, deep learning based video super-resolution algorithms have developed rapidly. We systematically review and analyze the current state of deep learning based video super-resolution.

First, popular YCbCr datasets such as YUV25, YUV21, and ultra video group (UVG) are introduced, together with RGB datasets such as video 4 (Vid4), realistic and dynamic scenes (REDS), and Vimeo90K. The profile of each dataset is summarized, including its name, year of publication, number of videos, number of frames, and resolution. The key evaluation metrics for video super-resolution algorithms are then introduced in detail: peak signal-to-noise ratio (PSNR), structural similarity (SSIM), video quality model for variable frame delay (VQM_VFD), and learned perceptual image patch similarity (LPIPS).

We then contrast video super-resolution with single image super-resolution: video carries richer motion information across frames. If a video is processed frame by frame with a single image super-resolution method, the reconstructed video contains a large number of artifacts. Deep learning based video super-resolution faces two key technical challenges, image alignment and feature integration. The choice of image alignment module differs greatly between video super-resolution methods, so we categorize methods as image alignment methods and non-alignment methods. Multi-frame information is integrated through network structures such as generative adversarial networks (GAN), recurrent neural networks (RNN), and Transformer.

To process video features and align neighboring frames with the target frame, image alignment methods use different motion estimation and motion compensation modules. They fall into three categories according to the alignment technique: optical flow, kernel-based, and deformable convolution. Optical flow alignment computes the motion between two frames from the temporal changes in pixel gray values, and a motion compensation module then warps the neighboring frames. We further divide optical flow alignment methods into four categories by the structure of the deep convolutional neural network (CNN) model: 2D convolution, RNN, GAN, and Transformer. For 2D convolution methods with optical flow alignment, we mainly introduce the video efficient sub-pixel convolutional network (VESPCN) and improvements to its optical flow estimation and motion compensation networks, such as TOFlow and the spatial-temporal transformer network (STTN). For RNN methods with optical flow alignment, we analyze the residual recurrent convolutional network (RRCN), the recurrent back-projection network (RBPN), and related methods that use optical flow to align neighboring frames at the image level, an approach intended to overcome the constraints of sliding window methods. To obtain better reconstruction performance, we then focus on BasicVSR (basic video super-resolution), IconVSR (information-refill mechanism and coupled propagation video super-resolution), and other networks that warp neighboring frames at the feature level. The optical flow alignment based TecoGAN (temporal coherence via self-supervision for GAN-based video generation) and VSR Transformer methods are also introduced in detail.

Because there are only a few kernel-based and deformable convolution based alignment methods, classifying them by network structure remains difficult. Because the convolution kernel size limits the range of motion estimation, the reconstruction performance of kernel-based alignment methods is relatively poor. Deformable convolution improves the sampling of conventional convolution but still has gaps to bridge, such as high computational complexity and harsh convergence conditions.

Non-alignment methods rely on a variety of network structures to exploit the correlation between video frames to some extent. We review and analyze non-aligned 3D convolution, non-aligned RNN, non-aligned GAN, and non-local methods. The non-aligned RNN methods include recurrent latent space propagation (RLSP), recurrent residual network (RRN), and omniscient video super-resolution (OVSR), which show that a balance can be struck between reconstruction speed and visual quality. In introducing the non-aligned non-local methods, we focus on improved non-local modules that reduce computational cost.

All models are tested with 4× downsampling under two degradations, bicubic interpolation (BI) and blur downsampling (BD). Quantitative results on multiple datasets, including REDS4, UDM10, and Vid4, and a speed comparison of the super-resolution methods are also summarized. The reconstruction quality of these video super-resolution networks keeps improving, model parameter counts are gradually shrinking, and training and inference are getting faster. However, the application of deep learning to video super-resolution still has room to advance. We expect that the adaptability of the networks needs to be improved and their results validated. Future work can be pursued in nine directions: network training and optimization, video super-resolution for ultra-high resolution, compressed video super-resolution, video rescaling methods, self-supervised video super-resolution, video super-resolution at various scales, spatio-temporal video super-resolution, auxiliary task-guided video super-resolution, and scenario-customized video super-resolution.
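As a rough illustration of the PSNR metric named in the abstract, the sketch below computes PSNR for a single frame and averages it over a clip. It is a minimal sketch, not the evaluation protocol of the surveyed paper: the function names `psnr` and `mean_psnr` are hypothetical, frames are assumed to be equally sized 2D grids of intensities stored as nested lists, and the peak value of 255 assumes 8-bit data. Details such as the color channel evaluated or border cropping are not stated in the abstract and are left out.

```python
import math


def psnr(reference, reconstructed, max_value=255.0):
    """Return PSNR in dB between two equally sized 2D frames (nested lists).

    Assumption: max_value is the peak intensity (255 for 8-bit frames).
    """
    if len(reference) != len(reconstructed):
        raise ValueError("frames must have the same number of rows")
    squared_error = 0.0
    count = 0
    for ref_row, rec_row in zip(reference, reconstructed):
        if len(ref_row) != len(rec_row):
            raise ValueError("rows must have the same length")
        for ref_pixel, rec_pixel in zip(ref_row, rec_row):
            squared_error += (ref_pixel - rec_pixel) ** 2
            count += 1
    if count == 0:
        raise ValueError("frames must not be empty")
    mse = squared_error / count
    if mse == 0:
        # Identical frames: PSNR is unbounded.
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / mse)


def mean_psnr(reference_frames, reconstructed_frames, max_value=255.0):
    """Average the per-frame PSNR over a clip (one common convention)."""
    if len(reference_frames) != len(reconstructed_frames):
        raise ValueError("clips must have the same number of frames")
    if not reference_frames:
        raise ValueError("clips must not be empty")
    scores = [
        psnr(ref, rec, max_value)
        for ref, rec in zip(reference_frames, reconstructed_frames)
    ]
    return sum(scores) / len(scores)
```

A higher value means the reconstructed frame is closer to the reference in mean squared error. PSNR does not measure perceptual quality, which is presumably why metrics such as SSIM and LPIPS are listed alongside it.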

Authors: Jiang Junjun; Cheng Hao; Li Zhenyu; Liu Xianming; Wang Zhongyuan (School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, China; School of Computer, Wuhan University, Wuhan 430072, China)

Affiliations: School of Computer Science and Technology, Harbin Institute of Technology; School of Computer, Wuhan University

Source: Journal of Image and Graphics (中国图象图形学报), indexed in CSCD and the Peking University Core Journals list, 2023, No. 7, pp. 1927-1964 (38 pages)

Funding: National Natural Science Foundation of China (61971165, 92270116, 62071339).

Keywords: deep learning; video super-resolution (VSR); image alignment; motion estimation; motion compensation

Classification: TP391 [Automation and Computer Technology: Computer Application Technology]

